[mpich-discuss] Strange behavior

Bellanca Gaetano gaetano.bellanca at unife.it
Fri Sep 26 19:04:54 CDT 2008


Hi,

I'm running a simulation on a Linux box (Ubuntu 8, kernel 
2.6.24-21-generic).
It is a Fortran program compiled with the Intel 10.1 compiler and 
mpich2-1.0.7.
The CPU is an AMD 64 X2 Dual Core 5600+.

The simulation works correctly with a small data set, and I can use 
1, 2, 4 ... processes (mpiexec -n #) to emulate a cluster. When the 
data set is larger, however, the simulation runs only if I use more 
than 4 or 6 processes. The minimum number depends on the size of the 
data set: a bigger data set requires more processes.

It stops in a send-receive communication between all the PEs and PE0, 
and it also stops if I use mpi_gatherv.
Strangely, if I stop mpd (which starts at boot time) and restart it 
manually, the simulation runs without any error on the same data set!
If I run the simulation on a Linux cluster, I see the same behavior, 
except that even after restarting mpd the simulation doesn't work 
unless I use enough processes.
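
For anyone trying to reproduce this, the sketch below (Python, standard 
library only) shows how the counts and displacements for a gather onto 
PE0 are commonly laid out, and how the largest per-process message 
shrinks as processes are added. It is not taken from the simulation: 
the near-equal block split, the 8-byte element size, the total of 
1,000,000 elements, and the helper names gatherv_layout and 
largest_message_bytes are all assumptions made for illustration.

```python
def gatherv_layout(total_elems, nprocs):
    """Return (counts, displs) for a near-equal contiguous block split.

    Assumption: the data set is divided into contiguous blocks whose
    sizes differ by at most one element. The real program may differ.
    """
    base, extra = divmod(total_elems, nprocs)
    # The first `extra` ranks each take one additional element.
    counts = [base + (1 if rank < extra else 0) for rank in range(nprocs)]
    # Each displacement is the running sum of the preceding counts.
    displs = [0] * nprocs
    for rank in range(1, nprocs):
        displs[rank] = displs[rank - 1] + counts[rank - 1]
    return counts, displs


def largest_message_bytes(total_elems, nprocs, bytes_per_elem=8):
    """Size in bytes of the biggest block any single rank would send."""
    counts, _ = gatherv_layout(total_elems, nprocs)
    return max(counts) * bytes_per_elem


if __name__ == "__main__":
    total = 1_000_000  # hypothetical data set size, in elements
    for nprocs in (1, 2, 4, 6, 8):
        print(nprocs, largest_message_bytes(total, nprocs))
```

If the failure does depend on the size of each message sent to PE0, 
comparing these numbers for a working and a failing process count may 
help narrow it down.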

Any ideas?

Regards.

gaetano


----------
Gaetano Bellanca - Department of Engineering - University of Ferrara
Via Saragat, 1 - 44100 - Ferrara - ITALY
Voice (VoIP):  +39 0532 974809     Fax:  +39 0532 974870
mailto:gaetano.bellanca at unife.it

----------