[petsc-dev] Question about MPICH device we use

Jeff Hammond jeff.science at gmail.com
Sun Jul 26 19:16:41 CDT 2020


On Thu, Jul 23, 2020 at 9:35 PM Satish Balay <balay at mcs.anl.gov> wrote:

> On Thu, 23 Jul 2020, Jeff Hammond wrote:
>
> > Open-MPI refuses to let users oversubscribe without an extra flag to
> > mpirun.
>
> Yes - and when using this flag it lets the run go through - but there is
> still performance degradation in oversubscribed mode.
>
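
For reference, the extra flag with Open MPI is typically --oversubscribe to
mpirun. A minimal sketch, with a placeholder executable name:

mpirun --oversubscribe -np 8 ./my_app
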
> > I think Intel MPI has an option for blocking poll that supports
> > oversubscription “nicely”.
>
> What option is this? Is it a compile-time option or something for mpiexec?
>

https://software.intel.com/content/www/us/en/develop/articles/tuning-the-intel-mpi-library-advanced-techniques.html

Apply wait mode to oversubscribed jobs

This option is particularly relevant for oversubscribed MPI jobs. The goal
is to enable the wait mode of the progress engine so that it waits for
messages without polling the fabric(s). This can save CPU cycles but
increases message-response latency, so it should be used with
caution. To enable wait mode, simply use:

I_MPI_WAIT_MODE=1
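
For example, launching an oversubscribed 8-rank job on a 4-core laptop with
Intel MPI would look something like this (the executable name is a
placeholder, and newer Intel MPI releases may spell this knob differently):

export I_MPI_WAIT_MODE=1
mpirun -np 8 ./my_petsc_app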


Jeff


> Satish
>
> > MPICH might have a “no local” option that
> > disables shared memory, in which case nemesis over libfabric with the
> > sockets or TCP provider _might_ do the right thing. But you should ask
> > MPICH people for details.
> >
> > Jeff
> >
> > On Thu, Jul 23, 2020 at 12:40 PM Jed Brown <jed at jedbrown.org> wrote:
> >
> > > I think we should default to ch3:nemesis when using --download-mpich,
> > > and only use ch3:sock when requested (which we would do in CI).
> > >
> > > Satish Balay via petsc-dev <petsc-dev at mcs.anl.gov> writes:
> > >
> > > > Primarily because ch3:sock performance does not degrade in
> > > > oversubscribed mode - which is developer friendly, i.e. on your laptop.
> > > >
> > > > And folks doing optimized runs should use a properly tuned MPI for
> > > > their setup anyway.
> > > >
> > > > In this case --download-mpich-device=ch3:nemesis is likely appropriate
> > > > if using --download-mpich [and not using a separate/optimized MPI].
> > > >
> > > > Having defaults that satisfy all use cases is not practical.
> > > >
> > > > Satish
> > > >
> > > > On Wed, 22 Jul 2020, Matthew Knepley wrote:
> > > >
> > > >> We default to ch3:sock. Scott MacLachlan just had a long thread on
> > > >> the Firedrake list where it ended up that reconfiguring with
> > > >> ch3:nemesis gave a 2x performance boost on his 16-core machine, and
> > > >> a noticeable effect on the 4-core speedup.
> > > >>
> > > >> Why do we default to sock?
> > > >>
> > > >>   Thanks,
> > > >>
> > > >>      Matt
> > > >>
> > > >>
> > >
> >
>
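
Coming back to the --download-mpich-device option Satish mentions above: a
minimal PETSc configure sketch, assuming MPICH is being downloaded and no
separate/optimized MPI is available (any further options are illustrative only):

./configure --download-mpich --download-mpich-device=ch3:nemesis
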
-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/

