[mpich-discuss] mpich2 hangs on Ubuntu beowulf cluster(with NFS)

Gustavo Correa gus at ldeo.columbia.edu
Wed Jan 4 17:42:14 CST 2012


Hi Konstantinos

Not necessarily, but it is hard to tell without seeing the code.
'Unsafe' code may run on a few processors but hang on more processors,
or run when compiled with one MPI distribution and hang with another, etc.
Very often such code assumes that the system will provide more message buffering than it actually does.
There are very good and simple examples of deadlock and unsafe code in Chapter 2 of:
MPI - The Complete Reference, volume 1, The MPI Core, 2nd edition, by M. Snir et al.
Examples 2.10 and 2.13 are 'unsafe' and may or may not deadlock.
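
To make the buffering point concrete, here is a rough toy model, not real MPI code and not taken from the book.
It uses only the Python standard library: two threads stand in for two ranks, and a bounded queue stands in for
whatever message buffering the system happens to provide. The names exchange, buffer_slots, n_messages and reorder
are made up for this sketch, and a timeout is used as a stand-in for a hang. Both "ranks" send everything first
and receive afterwards, which is the kind of pattern that works only while the buffers are big enough.

```python
import queue
import threading


def exchange(buffer_slots, n_messages, reorder=False, timeout=1.0):
    """Toy model of two ranks swapping n_messages each.

    Returns True if both ranks finish, False if either one stalls.
    buffer_slots (must be >= 1) stands in for the system's message buffering.
    """
    # inbox[r] holds messages addressed to rank r; put() blocks once it is full,
    # loosely like a blocking send that the system can no longer buffer.
    inbox = [queue.Queue(maxsize=buffer_slots) for _ in range(2)]
    finished = [False, False]

    def send_all(me):
        for i in range(n_messages):
            inbox[1 - me].put((me, i), timeout=timeout)

    def recv_all(me):
        for _ in range(n_messages):
            inbox[me].get(timeout=timeout)

    def rank(me):
        try:
            if reorder and me == 1:
                # Safe ordering: rank 1 receives first, then sends.
                recv_all(me)
                send_all(me)
            else:
                # Unsafe ordering: send everything, then receive.
                send_all(me)
                recv_all(me)
            finished[me] = True
        except (queue.Full, queue.Empty):
            pass  # a timeout here stands in for a hang

    threads = [threading.Thread(target=rank, args=(r,)) for r in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(finished)


if __name__ == "__main__":
    print("ample buffering, send-first :", exchange(4, 4))
    print("scarce buffering, send-first:", exchange(1, 4))
    print("scarce buffering, reordered :", exchange(1, 4, reorder=True))
```

In this model the same send-first code finishes when the buffer holds all the messages and stalls when it does not,
while letting one side receive first removes the dependence on buffer size. Real MPI behavior depends on the
implementation and message sizes, so take this only as an illustration of the idea.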

I am no expert, but if you post the code, or a simplified version of it,
somebody on the list may help.

Gus Correa

On Jan 4, 2012, at 6:12 PM, Konstantinos Varotsos wrote:

> Hi, Gustavo
>
> Thanks for the fast reply.
>
> One question: if the code deadlocks when using the two machines,
> shouldn't it also deadlock when running the code on each machine separately?
>
> Kwstas