[mpich-discuss] MPI PBS Error
Rajeev Thakur
thakur at mcs.anl.gov
Sat Mar 24 19:48:24 CDT 2012
This error is from Open MPI, not MPICH2.
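
The quoted output names its source: it says "Open MPI" outright and points to the "orte-clean" tool. As a rough sketch only (the function name and the "mpich" marker are my own assumptions, not part of either MPI library), a log can be checked for such markers like this:

```python
def guess_mpi_implementation(log_text: str) -> str:
    """Heuristically guess which MPI implementation wrote an error log."""
    text = log_text.lower()
    # Markers taken from the quoted output: the "Open MPI" name
    # and the "orte-clean" cleanup tool it refers to.
    if "open mpi" in text or "orte-clean" in text:
        return "Open MPI"
    # Assumed marker for MPICH2; this one does not come from the quoted log.
    if "mpich" in text:
        return "MPICH2"
    return "unknown"


if __name__ == "__main__":
    line = "NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes."
    print(guess_mpi_implementation(line))
```

This is a string match, so it only hints at where to ask; pass it the job output alone, since a mailing-list footer containing "mpich" would otherwise trigger the second check.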
Rajeev
On Mar 24, 2012, at 1:17 PM, Bharath Pattabiraman wrote:
> Hi,
>
> I am getting the following error with my application when I run it on 64 nodes (proc per node).
>
> =>> PBS: job killed: node 20 (qnode0553) requested job terminate, 'EOF' (code 1099) - received SISTER_EOF attempting to communicate with sister MOM's
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 26 in communicator MPI_COMM_WORLD
> with errorcode 15.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> --------------------------------------------------------------------------
> qnode0660
> qnode0638
> qnode0632
> qnode0630
> qnode0616
> qnode0592
> qnode0690
> qnode0519
> qnode0724
> qnode0544
> qnode0669
> qnode0522
> qnode0526
> qnode0527
> qnode0534
> qnode0537
> qnode0541
> qnode0543
> qnode0549
> qnode0553
> qnode0555
> qnode0559
> qnode0561
> qnode0566
> qnode0569
> qnode0570
> qnode0574
> qnode0578
> qnode0581
> qnode0587
> qnode0593
> qnode0595
> qnode0598
> qnode0602
> qnode0609
> qnode0618
> qnode0621
> qnode0622
> .
> .
> .
> .
> .
>
> Regards,
> Bharat
> _______________________________________________
> mpich-discuss mailing list mpich-discuss at mcs.anl.gov
> To manage subscription options or unsubscribe:
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss