[MPICH] stuck in bcast

Rajeev Thakur thakur at mcs.anl.gov
Thu Oct 26 11:28:34 CDT 2006


Is there enough memory allocated for the buffer? If you can send us a small
test program that demonstrates the error, it would be useful.
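
A reproducer along the following lines might be a starting point. This is
only a sketch, not the original code: the program name bcast_test, the
allocatable buffer, and the fill values are assumptions. Only the MPI_Bcast
arguments and the count 1495040 come from the report below. It checks the
allocation status on every rank, because the buffer must hold at least nsaf
elements on all processes, not just on the root. It is written in fixed form
but uses Fortran 90 allocatable arrays, so it assumes the compiler accepts
those in fixed-form source.

```fortran
      program bcast_test
      implicit none
      include 'mpif.h'
      integer nsaf, rank, mpierr, astat
      double precision, allocatable :: ediag(:)

      call MPI_Init(mpierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, mpierr)

!     Smallest element count reported to hang in the original post.
      nsaf = 1495040

!     Every rank needs its own buffer of at least nsaf elements.
      allocate(ediag(nsaf), stat=astat)
      if (astat .ne. 0) then
         print *, 'rank', rank, ': allocation failed'
         call MPI_Abort(MPI_COMM_WORLD, 1, mpierr)
      end if

!     Root fills the buffer; the other ranks start from zero so the
!     result of the broadcast is visible.
      if (rank .eq. 0) then
         ediag = 1.0d0
      else
         ediag = 0.0d0
      end if

      call MPI_Bcast(ediag, nsaf, MPI_DOUBLE_PRECISION, 0,
     $               MPI_COMM_WORLD, mpierr)

      print *, 'rank', rank, ': bcast done, ediag(nsaf) =',
     $         ediag(nsaf)

      deallocate(ediag)
      call MPI_Finalize(mpierr)
      end
```

If a program like this, built with the MPI compiler wrapper and started on
the same number of processes as the real job, also hangs at this count, the
problem is probably outside the application. If it completes, the size of
ediag on each rank in the real code is the first thing to check.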

Rajeev
 

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Martin 
> Kleinschmidt
> Sent: Thursday, October 26, 2006 8:20 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] stuck in bcast
> 
> Hi,
> 
> I'm having problems with my code. It hangs in broadcast:
> 
>       call MPI_bcast(ediag, nsaf,
>      $           MPI_double_precision, 0, MPI_Comm_World, MPIerr)
> 
> 
> when nsaf is large (see below).
> 
> symptom is:
> process 0 is using 100% cpu, all others are idle.
> process 0 cannot be killed, not even with kill -9
> 
> in a loop, I increased nsaf and found that bcast works up to
> nsaf=1495039 but hangs with nsaf=1495040 (which is 0x16D000)
> 
> as far as I can see, this is not a hard limit on message size, because
> I am able to bcast approx. 45 million double complex (750 MB)
> successfully, whereas ediag is only 12 MB.
> 
> any ideas?
> 
> 
>    ...martin
> 
> 
> (I'm using mpich2-1.0.4p1, intel compiler 9.0, fedora core 2, mpd)
> 
> 



