[mpich-discuss] thread MPI calls
Pavan Balaji
balaji at mcs.anl.gov
Thu Jul 30 01:57:39 CDT 2009
If a process is idle for a long time, the Wait call will keep calling
poll() to check whether anything has arrived. For example, try this experiment:
Process 0:
MPI_Irecv();
MPI_Wait();
Process 1:
sleep(10);
MPI_Send();
Process 0 will see a lot of "system activity" as it's busily waiting for
data to arrive.
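
For instance, a minimal self-contained version of that experiment (error
checking omitted; the 10-second sleep is only there to make the polling
visible in top or vmstat):

/* busywait.c -- run with: mpiexec -n 2 ./busywait */
#include <mpi.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, buf = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Post the receive, then block; MPI_Wait keeps polling for progress. */
        MPI_Irecv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("rank 0 received %d\n", buf);
    } else if (rank == 1) {
        buf = 42;
        sleep(10);    /* keep rank 0 waiting */
        MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

Watching rank 0 in top while rank 1 sleeps should show the high system
activity described above.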
-- Pavan
On 07/29/2009 03:19 PM, chong tan wrote:
> I would like to provide further info on what we have experimented with;
> hopefully this can be of some use, even to my future competitors (they
> subscribe to this email list too).
>
> 1. We replaced Wait_all with Wait_any, and there is no difference in
> performance.
> 2. We experimented with affining the recv thread to the same physical
> CPU/core pair, a different core pair, and the same core as the main thread
> (the core-pair runs were done on AMD boxes); see the pinning sketch after
> this list. This experiment was done on:
>     - Irecv versus Recv
>     - wait_all versus wait_any
>     - early versus late wait_all, wait_any, and wait
>     - wait in the recv thread versus wait in the main thread
> Amazingly, running the recv thread on the same core as the main thread is
> the fastest; with some code combinations it almost matches the performance
> of the non-threaded implementation.
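>
> A rough sketch of the kind of pinning described above (assuming Linux and
> pthreads; pin_to_core is a hypothetical helper, and the core id is
> whichever core the main thread is bound to for the same-core case):
>
> #define _GNU_SOURCE          /* for CPU_ZERO/CPU_SET and pthread_setaffinity_np */
> #include <pthread.h>
> #include <sched.h>
>
> /* Pin the calling thread (e.g. the recv thread) to a single core. */
> static int pin_to_core(int cpu_id)
> {
>     cpu_set_t set;
>     CPU_ZERO(&set);
>     CPU_SET(cpu_id, &set);
>     return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
> }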
>
> Early wait, both wait_all and wait_any, is the worst performer.
>
> Given that we know, for one particular test, more than 5% of the time is
> spent in the master proc (id == 0) completing already-sent data via
> MPI_Recv (from MPE and our own monitor), and the fact that the master
> proc got the biggest chunk of work in the test, we expected to see some
> positive sign of life using threaded MPI, or at least not the negative
> gain we have experienced.
>
>
> One issue we observed was that Wait* causes significant system
> activity in other processes.
> Maybe this is the problem, maybe not.
>
> tan
>
> ------------------------------------------------------------------------
> *From:* Pavan Balaji <balaji at mcs.anl.gov>
> *To:* mpich-discuss at mcs.anl.gov
> *Sent:* Tuesday, July 28, 2009 7:24:49 PM
> *Subject:* Re: [mpich-discuss] thread MPI calls
>
>
> > We just completed one particular test using SERIALIZED, and that
> > made no difference (compared to MULTIPLE).
>
> In that case, the problem is likely with the algorithm itself. In
> SERIALIZED mode, MPI does not add any locks and will not add any
> additional overhead. But it looks like your algorithm is blocking
> waiting for data from all slave processes before proceeding to the next
> iteration -- this will cause a lot of idle time. Is there some way you
> can optimize your algorithm?
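>
> For reference, a minimal sketch of requesting the threading level at init
> time (the SERIALIZED case assumes the application itself guarantees that
> no two threads ever make MPI calls at the same time):
>
> #include <mpi.h>
> #include <stdio.h>
>
> int main(int argc, char **argv)
> {
>     int provided;
>
>     /* Request SERIALIZED: MPI may skip internal locking, but the
>      * application must never call MPI from two threads concurrently. */
>     MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
>     if (provided < MPI_THREAD_SERIALIZED)
>         fprintf(stderr, "requested threading level not available\n");
>
>     /* ... existing communication code ... */
>
>     MPI_Finalize();
>     return 0;
> }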
>
> -- Pavan
>
> -- Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji