[MPICH] MPI_Barrier on ch3:shm and sched_yield

Sudarshan Raghunathan rdarshan at gmail.com
Mon Feb 26 18:27:48 CST 2007


Thanks. I did consider that option too, but some of the benchmarks I ran
took roughly 1.5x as long with sock as with shm. I was wondering if I
could somehow get the best of both worlds (the polite waiting behavior
of sock and the performance of shm), given that I only need the slow but
well-behaved barrier in a few isolated places. I'm not very familiar
with how the channels work, but it looks like switching channels at
runtime would be highly non-trivial, if not impossible.
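
For the handful of spots where I need the well-behaved barrier, I have
been toying with an application-level workaround rather than touching
the channel code: a dissemination barrier built from nonblocking
point-to-point calls that sleeps between MPI_Test polls instead of
spinning. Something like the (untested) sketch below; the function name
and the 1 ms sleep interval are just illustrative:

/* Untested sketch of a "polite" barrier: a dissemination barrier built
 * from nonblocking point-to-point calls.  Waiting ranks poll with
 * MPI_Testall and sleep between polls instead of spinning on the CPU. */
#include <mpi.h>
#include <unistd.h>

static void polite_barrier(MPI_Comm comm)
{
    int rank, size, mask;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    /* ceil(log2(size)) rounds of zero-byte messages */
    for (mask = 1; mask < size; mask <<= 1) {
        int src = (rank - mask + size) % size;
        int dst = (rank + mask) % size;
        int done = 0;
        MPI_Request reqs[2];

        MPI_Irecv(NULL, 0, MPI_BYTE, src, 999, comm, &reqs[0]);
        MPI_Isend(NULL, 0, MPI_BYTE, dst, 999, comm, &reqs[1]);

        while (!done) {
            MPI_Testall(2, reqs, &done, MPI_STATUSES_IGNORE);
            if (!done)
                usleep(1000);  /* give up the CPU for ~1 ms between polls */
        }
    }
}

It would of course be much slower than ch3:shm's own barrier, but for
the places where ranks may sit idle for a long time that seems like an
acceptable trade-off.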

Regards,
Sudarshan

On 26/02/07, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>
>  You could use the ch3:sock channel directly instead of ch3:shm. It
> would behave better in this case because it waits in poll(), as you
> point out, but the rest of the communication would be slower with
> ch3:sock than with ch3:shm.
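>
> For what it's worth, the channel is chosen when MPICH2 is configured
> (roughly along the lines of "./configure --with-device=ch3:sock"), so
> using it would mean a separate build of the library rather than a
> runtime switch.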
>
> Rajeev
>
>  ------------------------------
> From: Sudarshan Raghunathan [mailto:rdarshan at gmail.com]
> Sent: Monday, February 26, 2007 5:58 PM
> To: Rajeev Thakur
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] MPI_Barrier on ch3:shm and sched_yield
>
> Rajeev,
>
> Thank you for the reply.
>
> I was running four MPI processes and then started another
> computationally intensive task (for example, another MPI program on 2
> processes). The other task gets around 50% of the CPU, but it is
> constantly competing with the first MPI program, which isn't doing any
> real work beyond waiting at the barrier.
>
> I looked into a few options, all of which seem to have problems:
> (a) Put a sleep in the MPIDI_CH3I_Progress code so that processes
> waiting at the barrier sleep for a while instead of spinning in a
> sched_yield loop and competing with other processes that might actually
> need the CPU. The problem with this approach is that it might slow down
> portions of the application where a fast barrier _is_ required. (A rough
> sketch of what I mean follows the list.)
> (b) Use semaphores. This looks like it would touch a lot of places in
> the implementation, and again it might cause performance problems where
> a fast barrier is needed.
> (c) Use sockets. The sock device in MPICH calls poll() with a timeout,
> which behaves much better than the shm device. I'm not sure how much of
> the existing MPICH code could be reused.
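>
> To make (a) concrete, here is roughly the shape of the change I have in
> mind for the idle wait loop (illustration only, not actual MPICH code;
> the function name and thresholds are made up):
>
> /* Hypothetical backoff for an idle progress-wait loop: spin briefly,
>  * then yield, then sleep, so a rank stuck at the barrier stops
>  * competing for the CPU. */
> #include <sched.h>
> #include <time.h>
>
> static void backoff_wait(unsigned long idle_iters)
> {
>     if (idle_iters < 1000) {
>         /* stay hot: keeps latency low when the barrier completes quickly */
>     } else if (idle_iters < 100000) {
>         sched_yield();
>     } else {
>         struct timespec ts = { 0, 1000000 };  /* 1 ms */
>         nanosleep(&ts, NULL);
>     }
> }
>
> Only long waits would pay the sleep penalty, so the fast-barrier paths
> should be largely unaffected.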
>
> Any other (saner or more feasible) options, or advice on which of the
> above three is simplest to implement, would be most appreciated.
>
> Thanks again,
> Sudarshan
>
> On 26/02/07, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> >
> >  Sudarshan,
> > How many MPI processes are you running on the 4-way SMP, and how many
> > processes of the other application? Does the other application get
> > any CPU at all? sched_yield() just gives up the process's current
> > time slice and moves it to the end of the run queue; it will run
> > again when its turn comes. In other words, the process doesn't sleep
> > until some event happens; it keeps sharing the CPU with other
> > processes.
> >
> > Rajeev
> >
> >
>


-- 
When the impossible has been eliminated, whatever remains, however
improbable, must be the Truth