[MPICH] stuck in bcast
Rajeev Thakur
thakur at mcs.anl.gov
Thu Oct 26 11:28:34 CDT 2006
Is there enough memory allocated for the buffer? If you can send us a small
test program that demonstrates the error, it would be useful.
Rajeev
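
For reference, a test program of the kind requested above could look
roughly like the sketch below. It is not taken from the original code: the
program name, the fill values and the istat variable are made up for
illustration, only ediag, nsaf and the bcast call itself come from the
report. It assumes the MPI Fortran 90 module is available (otherwise
replace "use mpi" with include 'mpif.h' after implicit none) and it
allocates the full buffer on every rank, checking that the allocation
succeeded before the broadcast.

```fortran
! Minimal reproducer sketch: broadcast nsaf doubles from rank 0.
program bcast_test
  use mpi
  implicit none
  ! First failing element count given in the report (0x16D000).
  integer, parameter :: nsaf = 1495040
  double precision, allocatable :: ediag(:)
  integer :: mpierr, rank, istat

  call MPI_Init(mpierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, mpierr)

  ! Every rank, not only the root, needs room for all nsaf elements.
  allocate(ediag(nsaf), stat=istat)
  if (istat /= 0) then
     print *, 'rank', rank, ': allocation failed'
     call MPI_Abort(MPI_COMM_WORLD, 1, mpierr)
  end if

  ! Hypothetical fill values so the result can be checked on each rank.
  if (rank == 0) then
     ediag = 1.0d0
  else
     ediag = 0.0d0
  end if

  call MPI_Bcast(ediag, nsaf, MPI_DOUBLE_PRECISION, 0, &
                 MPI_COMM_WORLD, mpierr)

  print *, 'rank', rank, ': bcast done, ediag(nsaf) =', ediag(nsaf)

  deallocate(ediag)
  call MPI_Finalize(mpierr)
end program bcast_test
```

Running this with nsaf set to 1495039 and then 1495040 would show whether
the hang can be reproduced outside the full application.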
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Martin
> Kleinschmidt
> Sent: Thursday, October 26, 2006 8:20 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] stuck in bcast
>
> Hi,
>
> I'm having problems with my code. It hangs in broadcast:
>
>       call MPI_bcast(ediag, nsaf,
>      $     MPI_double_precision, 0, MPI_Comm_World, MPIerr)
>
>
> when nsaf is large (see below).
>
> The symptom is:
> process 0 is using 100% CPU, all others are idle.
> process 0 cannot be killed, not even with kill -9
>
> in a loop, I increased nsaf and found that bcast goes well up to
> nsaf=1495039 but fails with nsaf=1495040 (which is 0x16D000)
>
> as far as I can see, this is not a hard limit on message size, because
> I am able to bcast approx. 45 million double complex (750 MB)
> successfully, whereas ediag is only 12 MB.
>
> any ideas?
>
>
> ...martin
>
>
> (I'm using mpich2-1.0.4p1, intel compiler 9.0, fedora core 2, mpd)
>
>