[petsc-dev] Generality of VecScatter

Barry Smith bsmith at mcs.anl.gov
Tue Nov 29 12:40:40 CST 2011


On Nov 29, 2011, at 10:41 AM, Jed Brown wrote:

> On Tue, Nov 29, 2011 at 10:35, Barry Smith <bsmith at mcs.anl.gov> wrote:
> Synchronization between threads frequently involves the use of spin-wait loops. These loops should make use of the PAUSE instruction to maximize performance and minimize power consumption. The PAUSE instruction can be added to application code now, as it is ignored on all known existing Intel architectures.
> 
> Where are you getting this from?

  The very first google hit for "pause instruction spin loop"; looks like they have the same web maintenance issues PETSc has :-)


>  
> 
> So for Q all I need do is
> 
> #define PAUSE
> 
> and then use PAUSE in the spin-wait to get similar performance to the Intel?
> 
> http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/intref_cls/common/intref_sse2_pause.htm
> 
> http://software.intel.com/en-us/articles/introduction-to-hyper-threading-technology/
> 
> Use the PAUSE Instruction to Optimize Code
> Intel recommends that software developers always use the PAUSE instruction in spin-wait loops.
> 
> Starting with the Intel® Pentium® 4 processor, this recommendation was made for the benefit of power savings. Executing a PAUSE in a spin-wait loop forces the spin-wait to finish one iteration of the loop before starting the next iteration. Halting the spin-wait loop reduces consumption of non-productive resources - and thus power.
> 
> With a Hyper-Threading Technology supported processor, using the PAUSE instruction optimizes processing of two threads. Pausing the spin-wait loop frees more resources for the other processor, allowing faster completion of its thread.
> 
> Use OS Synchronization Techniques on Long Waits
> When a thread suspects a thread will take longer than an OS quantum of time before a lock is released, use OS synchronization techniques to idle the processor until the lock is released. Idling the Hyper-Threading Technology supported processor frees the locked processor to use all resources available to complete its execution. Use OS primitives to release the lock when thread execution has completed.
> 
> If a thread suspects a thread will release a lock within an OS quantum, use a spin-wait.
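
For concreteness, here is a minimal sketch of the kind of spin-wait the quoted text describes, written with C11 atomics. CPU_PAUSE, spin_lock_t and the spin_* functions are hypothetical names made up for this illustration; none of them are PETSc API. On x86 the macro expands to the _mm_pause() intrinsic; everywhere else it expands to nothing, which is the "#define PAUSE" idea from above. Whether an empty definition gives comparable performance on other hardware is an assumption that would have to be measured.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* CPU_PAUSE is a hypothetical macro name used only in this sketch. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  include <immintrin.h>
#  define CPU_PAUSE() _mm_pause()   /* emits the x86 PAUSE instruction */
#else
#  define CPU_PAUSE() ((void)0)     /* assumption: no spin hint available or needed */
#endif

/* A bare test-and-set spin lock; illustrative only. */
typedef struct {
  atomic_flag locked;
} spin_lock_t;

static void spin_lock_init(spin_lock_t *l)
{
  atomic_flag_clear(&l->locked);
}

/* Returns true if the lock was acquired without waiting. */
static bool spin_trylock(spin_lock_t *l)
{
  return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Spin until the flag is acquired, issuing the pause hint on each failed attempt. */
static void spin_lock(spin_lock_t *l)
{
  while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire)) {
    CPU_PAUSE();
  }
}

static void spin_unlock(spin_lock_t *l)
{
  atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

This only covers the short-wait case. For waits expected to exceed an OS quantum, the quoted advice is to block on an OS primitive (for example a pthread mutex or condition variable) instead of spinning.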



