[petsc-dev] Question about MPICH device we use

Junchao Zhang junchao.zhang at gmail.com
Thu Jul 23 23:41:10 CDT 2020


On Thu, Jul 23, 2020 at 11:35 PM Satish Balay via petsc-dev <petsc-dev at mcs.anl.gov> wrote:

> On Thu, 23 Jul 2020, Jeff Hammond wrote:
>
> > Open-MPI refuses to let users over subscribe without an extra flag to
> > mpirun.
>
> Yes - and when using this flag - it lets the run through - but there is
> still performance degradation in oversubscribe mode.
>
> > I think Intel MPI has an option for blocking poll that supports
> > oversubscription “nicely”.
>
> What option is this? Is it compile time option or something for mpiexec?
>
I only found configure time options,
  --enable-nemesis-dbg-nolocal  alias for --enable-dbg-nolocal
  --enable-dbg-nolocal          enables debugging mode where
                                shared-memory communication is disabled

> Satish
>
> > MPICH might have a “no local” option that
> > disables shared memory, in which case nemesis over libfabric with the
> > sockets or TCP provider _might_ do the right thing. But you should ask
> > MPICH people for details.
> >
> > Jeff
> >
> > On Thu, Jul 23, 2020 at 12:40 PM Jed Brown <jed at jedbrown.org> wrote:
> >
> > > I think we should default to ch3:nemesis when --download-mpich, and only
> > > do ch3:sock when requested (which we would do in CI).
> > >
> > > Satish Balay via petsc-dev <petsc-dev at mcs.anl.gov> writes:
> > >
> > > > Primarily because ch3:sock performance does not degrade in oversubscribe
> > > > mode - which is developer friendly - i.e. on your laptop.
> > > >
> > > > And folks doing optimized runs should use a properly tuned MPI for their
> > > > setup anyway.
> > > >
> > > > In this case --download-mpich-device=ch3:nemesis is likely appropriate
> > > > if using --download-mpich [and not using a separate/optimized MPI]
> > > >
> > > > Having defaults that satisfy all use cases is not practical.
> > > >
> > > > Satish
> > > >
> > > > On Wed, 22 Jul 2020, Matthew Knepley wrote:
> > > >
> > > > >> We default to ch3:sock. Scott MacLachlan just had a long thread on the
> > > > >> Firedrake list where it ended up that reconfiguring using ch3:nemesis had a
> > > > >> 2x performance boost on his 16-core proc, and noticeable effect on the 4
> > > > >> core speedup.
> > > >>
> > > >> Why do we default to sock?
> > > >>
> > > >>   Thanks,
> > > >>
> > > >>      Matt
> > > >>
> > > >>
> > >
> >
>
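
For readers skimming the thread, here is a minimal sketch of the device
choice discussed in the quoted messages. It only uses the two PETSc configure
options named in the thread (--download-mpich and --download-mpich-device).
The variable PETSC_BUILD_KIND and its values "dev" and "perf" are hypothetical
names introduced for this example, not PETSc options, and the script only
prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: choose the MPICH device for a PETSc build.
# PETSC_BUILD_KIND is a hypothetical variable made up for this example.
PETSC_BUILD_KIND="${PETSC_BUILD_KIND:-dev}"

case "$PETSC_BUILD_KIND" in
  perf)
    # Per the thread: ch3:nemesis was faster when not oversubscribed.
    MPICH_DEVICE="ch3:nemesis"
    ;;
  *)
    # Per the thread: ch3:sock does not degrade when oversubscribed
    # (for example on a laptop or in CI).
    MPICH_DEVICE="ch3:sock"
    ;;
esac

CONFIGURE_ARGS="--download-mpich --download-mpich-device=${MPICH_DEVICE}"

# Print the command instead of executing it, so the sketch is safe to run
# outside a PETSc source tree.
echo "./configure ${CONFIGURE_ARGS}"
```

Running the script with PETSC_BUILD_KIND=perf would select ch3:nemesis;
leaving the variable unset falls back to ch3:sock, the default at the time of
this thread. Anyone doing optimized runs should still prefer a properly tuned
system MPI, as noted above.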