[petsc-dev] Generality of VecScatter

Tue Nov 29 10:14:32 CST 2011

On Nov 29, 2011, at 9:08 AM, Jed Brown wrote:

> On Tue, Nov 29, 2011 at 08:52, Dmitry Karpeev <karpeev at mcs.anl.gov> wrote:
> From what I understand Barry doesn't want the threads to spin.
> 
> Lots of MPI calls spin because it's MUCH lower latency. Unless we have more threads than cores, what is the problem?
>  
> Also, synchronizing through an unguarded memory location seems to create a race condition.
> 
> Not if writes are atomic. There is always a way to do atomic writes (usually machine-word) because otherwise the operating system could not implement synchronization primitives.
> 
> Isn't cmpxchg instruction-set specific?
> 
> All instruction sets have some analogue of cmpxchg because it's the building block for all other primitives.

  Shri,

  I think we need to investigate both the method Dmitry suggested and what Jed suggests. 

  Barry

Generally I think of sigs as being pretty slow, so I am surprised by the result on the website, but hey.

In the world Intel is pushing us toward, they want us to have SEVERAL threads per core (and IBM BlueGene Q also, I think). How is the spinning going to work in that situation?
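
For illustration only, here is a minimal sketch of the kind of synchronization discussed above: a flag claimed with an atomic compare-and-swap, where a waiting thread spins for a bounded number of attempts and then yields the core. It is not PETSc code and not what anyone in this thread proposed verbatim. The names spin_lock, spin_trylock, spin_lock_acquire, spin_lock_release and the SPIN_BUDGET value are all hypothetical. It assumes a C11 compiler with <stdatomic.h> and a POSIX system for sched_yield(). On x86 the compare-exchange typically compiles to a lock cmpxchg instruction; other instruction sets use their own equivalent.

```c
#include <assert.h>
#include <sched.h>      /* sched_yield (POSIX) */
#include <stdatomic.h>  /* C11 atomics */
#include <stdbool.h>

/* All names here are hypothetical; none of this is PETSc API. */
typedef struct {
  atomic_int held;      /* 0 = free, 1 = held */
} spin_lock;

/* Assumed tuning knob: CAS attempts before giving up the core. */
#define SPIN_BUDGET 1000

static void spin_lock_init(spin_lock *l)
{
  atomic_init(&l->held, 0);
}

/* One compare-and-swap attempt: succeeds only if the lock was free. */
static bool spin_trylock(spin_lock *l)
{
  int expected = 0;
  return atomic_compare_exchange_strong_explicit(
      &l->held, &expected, 1,
      memory_order_acquire, memory_order_relaxed);
}

/* Spin for a bounded number of attempts, then yield so that another
   thread sharing the core can run. Returns how many times we yielded. */
static long spin_lock_acquire(spin_lock *l)
{
  long yields = 0;
  for (;;) {
    for (int i = 0; i < SPIN_BUDGET; i++) {
      if (spin_trylock(l)) return yields;
    }
    sched_yield();
    yields++;
  }
}

/* Release with an atomic store-like exchange; check we really held it. */
static void spin_lock_release(spin_lock *l)
{
  int was = atomic_exchange_explicit(&l->held, 0, memory_order_release);
  assert(was == 1);
  (void)was;
}
```

With one thread per core the inner loop behaves like a pure spin, which is the low-latency case. With several threads per core the yield bounds how long a waiter can hold the core away from the thread it is waiting on, at the cost of a system call. Whether that trade-off is acceptable would have to be measured.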