[MPICH] Confused about compiling/using mpich2

Gaetano Bellanca gaetano.bellanca at unife.it
Sat Jan 5 06:37:56 CST 2008


Hi,
I'm developing a parallel code using MPICH2, and I'm confused about how
to choose the compiler options and how to use MPICH2 on clusters of
multicore processors.

My code runs without problems on a 32-bit, single-core Linux cluster of
10 PCs (Pentium IV), with good performance on a pvfs2 filesystem, but it
does not run on an AMD Athlon 64 X2 Dual Core Processor 5600+ or on a
dual-processor, quad-core Intel Xeon Linux box.

In particular, the program stops during simple MPI_SEND and MPI_RECV
operations when I run it with the command mpiexec -n 2, with the message
'rank 1 in job 70 hostname_number caused collective abort of all
ranks. exit status of rank 1: killed by signal 9'

When I boot into Windows and run the same code using the Windows
version of MPICH2, it works correctly without problems.

Reading the MPICH2 guide, I see that there are several possible
choices, for example for the communication device, but changing it made
no real difference: the code always stops with similar errors.

Can someone please suggest the best compiler options for building and
using MPICH2 in a Linux environment? I'm using the Intel Fortran
10.1 compiler, and I'd like to use both the multicore and the
multiprocessor capabilities of the two architectures (not mixing
them; I mean using the AMD and Intel machines separately).

Thanks in advance.

Gaetano



----------
Gaetano Bellanca - Department of Engineering - University of Ferrara
Via Saragat, 1 - 44100 - Ferrara - ITALY
Voice (VoIP):  +39 0532 974809     Fax:  +39 0532 974870
mailto:gaetano.bellanca at unife.it

----------