[petsc-dev] Question about MPICH device we use

Jed Brown jed at jedbrown.org
Sun Jul 26 20:43:09 CDT 2020


Jeff Hammond <jeff.science at gmail.com> writes:

> On Thu, Jul 23, 2020 at 9:35 PM Satish Balay <balay at mcs.anl.gov> wrote:
>
>> On Thu, 23 Jul 2020, Jeff Hammond wrote:
>>
>> > Open-MPI refuses to let users oversubscribe without an extra flag to
>> > mpirun.
>>
>> Yes - and when using this flag - it lets the run go through - but there is
>> still performance degradation in oversubscribe mode.
>>
>> > I think Intel MPI has an option for blocking poll that supports
>> > oversubscription “nicely”.
>>
>> What option is this? Is it compile time option or something for mpiexec?
>>
>
> https://software.intel.com/content/www/us/en/develop/articles/tuning-the-intel-mpi-library-advanced-techniques.html
>
> Apply wait mode to oversubscribed jobs
>
> This option is particularly relevant for oversubscribed MPI jobs. The goal
> is to enable the wait mode of the progress engine in order to wait for
> messages without polling the fabric(s). This can save CPU cycles but
> decreases the message-response rate (latency), so it should be used with
> caution. To enable wait mode simply use:
>
> I_MPI_WAIT_MODE=1
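
As a minimal sketch of how the setting quoted above might be applied: the variable name comes from the quoted article, while the launch line, rank count, and `./app` binary are hypothetical placeholders. Whether a given Intel MPI release still honors this variable should be checked against its own documentation.

```shell
# Enable wait mode for launches from this shell
# (variable name taken from the quoted Intel article).
export I_MPI_WAIT_MODE=1

# Hypothetical oversubscribed launch; ./app and the rank count are placeholders:
# mpirun -n 16 ./app
```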

Has anyone tested ch4:ucx?

$ rg UCX_PERF_WAIT_MODE
src/mpid/ch4/netmod/ucx/ucx/test/gtest/common/test_perf.cc
190:    params.wait_mode       = UCX_PERF_WAIT_MODE_LAST;

src/mpid/ch4/netmod/ucx/ucx/src/tools/perf/perftest.c
513:    params->wait_mode         = UCX_PERF_WAIT_MODE_LAST;

src/mpid/ch4/netmod/ucx/ucx/src/tools/perf/api/libperf.h
70:    UCX_PERF_WAIT_MODE_PROGRESS,     /* Repeatedly call progress */
71:    UCX_PERF_WAIT_MODE_SLEEP,        /* Go to sleep */
72:    UCX_PERF_WAIT_MODE_SPIN,         /* Spin without calling progress */
73:    UCX_PERF_WAIT_MODE_LAST

modules/ucx/test/gtest/common/test_perf.cc
189:    params.wait_mode       = UCX_PERF_WAIT_MODE_LAST;

modules/ucx/src/tools/perf/perftest.c
553:    params->wait_mode         = UCX_PERF_WAIT_MODE_LAST;

modules/ucx/src/tools/perf/api/libperf.h
71:    UCX_PERF_WAIT_MODE_PROGRESS,     /* Repeatedly call progress */
72:    UCX_PERF_WAIT_MODE_SLEEP,        /* Go to sleep */
73:    UCX_PERF_WAIT_MODE_SPIN,         /* Spin without calling progress */
74:    UCX_PERF_WAIT_MODE_LAST

