[mpich-discuss] [cli_0]: aborting job:

Sangamesh B forum.san at gmail.com
Wed Sep 3 23:51:29 CDT 2008


Hi All,

   I've compiled an in-house C application with MPICH2 1.0.7 and the GNU
compilers on a Rocks 5 cluster based on CentOS 5.

The command I used and the resulting error are as follows:

$ /opt/mpich2/gnu/bin/mpirun -machinefile ./mach28 -np 8 ./run3 ./run3.in | tee run3_1a_8p

[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
rank 0 in job 1  locuzcluster.org_44326   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9
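
For reference, signal 9 on Linux is SIGKILL. The following is only a minimal
sketch of how a shell reports a child killed that way; it is not taken from my
run. The `sh -c 'kill -9 $$'` line is a hypothetical stand-in for a process
that receives SIGKILL, and the variable names `status` and `sig` are my own.

```shell
# Hypothetical stand-in for a process that is killed with SIGKILL:
# the inner shell sends signal 9 to itself.
sh -c 'kill -9 $$'
status=$?

# A shell reports "killed by signal N" as an exit status above 128
# (128 + N in bash and dash), so subtracting 128 recovers the signal number.
if [ "$status" -gt 128 ]; then
    sig=$((status - 128))
    echo "killed by signal $sig"
else
    echo "exited normally with status $status"
fi
```

Note that this inspects the status of a single command. In a pipeline such as
the mpirun command piped into tee, `$?` holds the status of the last command
in the pipeline (tee), not of mpirun.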

$ ldd run3
        libm.so.6 => /lib64/libm.so.6 (0x0000003a1fa00000)
        libmpich.so.1.1 => /opt/mpich2/gnu/lib/libmpich.so.1.1 (0x00002aaaaaac4000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a20200000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003a20e00000)
        libuuid.so.1 => /lib64/libuuid.so.1 (0x00002aaaaadba000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003a1f600000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003a1f200000)

The recommendation is to run this job with 48 or 96 processes/cores, but the
cluster has only 8 cores.
Could the lower number of processes be causing the above error?

Thank you,
Sangamesh

