<div dir="ltr">Hi All,<br><br>I've compiled an in-house C application with MPICH2-1.0.7 and the GNU compilers on a Rocks 5 cluster (based on CentOS 5).<br><br>The command used and the resulting error are as follows:<br><br>$ /opt/mpich2/gnu/bin/mpirun -machinefile ./mach28 -np 8 ./run3 ./run3.in | tee run3_1a_8p<br>
<br>[cli_0]: aborting job:<br>application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0<br>rank 0 in job 1 locuzcluster.org_44326 caused collective abort of all ranks<br> exit status of rank 0: killed by signal 9<br>
<br>$ ldd run3<br> libm.so.6 => /lib64/libm.so.6 (0x0000003a1fa00000)<br> libmpich.so.1.1 => /opt/mpich2/gnu/lib/libmpich.so.1.1 (0x00002aaaaaac4000)<br> libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a20200000)<br>
librt.so.1 => /lib64/librt.so.1 (0x0000003a20e00000)<br> libuuid.so.1 => /lib64/libuuid.so.1 (0x00002aaaaadba000)<br> libc.so.6 => /lib64/libc.so.6 (0x0000003a1f600000)<br> /lib64/ld-linux-x86-64.so.2 (0x0000003a1f200000)<br>
<br>It is recommended to run this job with 48 or 96 processes (cores), but the cluster has only 8 cores.<br>Could the lower number of processes be causing the above error?<br><br>Thank you,<br>
Sangamesh</div>