[petsc-dev] Generality of VecScatter

Dmitry Karpeev karpeev at mcs.anl.gov
Tue Nov 29 12:53:26 CST 2011


I understand that any solution requires serialization: atomic reads by the
spinning loop would serialize on that read, while pthread_cond_wait requires
mutex serialization.  sigwait serializes since it clears the signal
atomically.  Ultimately, all three rely on atomicity of some sort, but
mutexes apparently have higher overhead for it?
At least the stackoverflow page that suggests the sigwait solution reports
a 40x improvement over the pthread_cond_wait solution (admittedly, I don't
know the details of the sigwait stuff).
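
To make the comparison concrete, here is a minimal sketch in C of two of the waiting strategies mentioned above: a spin loop on an atomic flag, and a pthread_cond_wait guarded by a mutex. All type and function names (SpinFlag, CondFlag, and so on) are hypothetical and are not PETSc code. The sigwait variant is left out, and nothing here measures the relative overhead.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical illustration only; none of these names come from PETSc. */

/* Variant 1: the waiter spins on an atomic flag. The only serialization
   is on the atomic load/store of `ready`. */
typedef struct {
  atomic_int ready;
} SpinFlag;

static void spin_flag_init(SpinFlag *f) { atomic_init(&f->ready, 0); }

static void spin_flag_post(SpinFlag *f)
{
  atomic_store_explicit(&f->ready, 1, memory_order_release);
}

static void spin_flag_wait(SpinFlag *f)
{
  while (!atomic_load_explicit(&f->ready, memory_order_acquire)) {
    /* busy-wait: burns CPU until the flag is posted */
  }
}

/* Variant 2: the waiter blocks in pthread_cond_wait. Both sides must
   take the mutex, which is the mutex serialization mentioned above. */
typedef struct {
  pthread_mutex_t mutex;
  pthread_cond_t  cond;
  int             ready;
} CondFlag;

static int cond_flag_init(CondFlag *f)
{
  f->ready = 0;
  if (pthread_mutex_init(&f->mutex, NULL)) return -1;
  if (pthread_cond_init(&f->cond, NULL)) return -1;
  return 0;
}

static void cond_flag_post(CondFlag *f)
{
  pthread_mutex_lock(&f->mutex);
  f->ready = 1;
  pthread_cond_signal(&f->cond);
  pthread_mutex_unlock(&f->mutex);
}

static void cond_flag_wait(CondFlag *f)
{
  pthread_mutex_lock(&f->mutex);
  while (!f->ready) {            /* loop guards against spurious wakeups */
    pthread_cond_wait(&f->cond, &f->mutex);
  }
  pthread_mutex_unlock(&f->mutex);
}
```

In both variants one thread calls the post function and another calls the matching wait function. The spin variant keeps the waiting core busy, while the condition-variable variant lets the OS put the waiter to sleep.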

Dmitry.

On Tue, Nov 29, 2011 at 12:40 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
> On Nov 29, 2011, at 10:41 AM, Jed Brown wrote:
>
> > On Tue, Nov 29, 2011 at 10:35, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > Synchronization between threads frequently involves the use of spin-wait
> loops. These loops should make use of the PAUSE instruction to maximize
> performance and minimize power consumption. The PAUSE instruction can be
> added to application code now, as it is ignored on all known existing Intel
> architectures.
> >
> > Where are you getting this from?
>
>   The very first google hit for "pause instruction spin loop"; looks like
> they have the same web maintenance issues PETSc has :-)
>
>
> >
> >
> > So for Q all I need do is
> >
> > #define PAUSE
> >
> > and then use PAUSE in the spin-wait to get similar performance to the
> Intel?
> >
> >
> http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/intref_cls/common/intref_sse2_pause.htm
> >
> >
> http://software.intel.com/en-us/articles/introduction-to-hyper-threading-technology/
> >
> > Use the PAUSE Instruction to Optimize Code
> > Intel recommends that software developers always use the PAUSE
> instruction in spin-wait loops.
> >
> > Starting with the Intel® Pentium® 4 processor, this recommendation was
> made for the benefit of power savings. Executing a PAUSE in a spin-wait
> loop forces the spin-wait to finish one iteration of the loop before
> starting the next iteration. Halting the spin-wait loop reduces consumption
> of non-productive resources - and thus power.
> >
> > With a Hyper-Threading Technology supported processor, using the PAUSE
> instruction optimizes processing of two threads. Pausing the spin-wait loop
> frees more resources for the other processor, allowing faster completion of
> its thread.
> >
> > Use OS Synchronization Techniques on Long Waits
> > When a thread suspects a thread will take longer than an OS quantum of
> time before a lock is released, use OS synchronization techniques to idle
> the processor until the lock is released. Idling the Hyper-Threading
> Technology supported processor frees the locked processor to use all
> resources available to complete its execution. Use OS primitives to release
> the lock when thread execution has completed.
> >
> > If a thread suspects a thread will release a lock within an OS quantum,
> use a spin-wait.
>
>
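
The quoted advice (use PAUSE inside the spin-wait, and fall back to OS synchronization for long waits) could be sketched in C roughly as follows. This is an assumption-laden illustration, not PETSc code: the PAUSE() macro maps to the _mm_pause intrinsic on x86 and to a no-op elsewhere, along the lines of the empty "#define PAUSE" suggested above, and spin_wait_bounded and its max_spins parameter are hypothetical names. Whether a no-op gives comparable performance on a non-Intel machine is not something this sketch establishes.

```c
#include <stdatomic.h>

/* PAUSE(): spin-loop hint on x86, no-op elsewhere (assumed fallback). */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  include <immintrin.h>
#  define PAUSE() _mm_pause()
#else
#  define PAUSE() ((void)0)
#endif

/* Hypothetical helper: spin on `flag` for at most `max_spins` iterations.
   Returns 1 if the flag was observed set, 0 if the spin budget ran out,
   in which case the caller can fall back to an OS primitive such as
   pthread_cond_wait for the long wait. */
static int spin_wait_bounded(atomic_int *flag, long max_spins)
{
  for (long i = 0; i < max_spins; i++) {
    if (atomic_load_explicit(flag, memory_order_acquire)) return 1;
    PAUSE(); /* hint to the core that this is a spin-wait iteration */
  }
  /* One last check so a zero budget still reports an already-set flag. */
  return atomic_load_explicit(flag, memory_order_acquire) != 0;
}
```

A caller would pick max_spins so that the spin phase stays well under an OS quantum, and block only when spin_wait_bounded returns 0.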