[MPICH] Error handling issue
Jayesh Krishna
jayesh at mcs.anl.gov
Mon Nov 12 10:29:52 CST 2007
Hi,
This error message is probably being issued by the process manager.
How are you aborting the process?
Regards,
Jayesh
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of AGPX
Sent: Sunday, November 11, 2007 6:37 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] Error handling issue
Hi,
I have written the following code to keep my main process from aborting on
an MPI error:
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &MPIId);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
but when I terminate one of the job's processes on another machine (pcamd3000 is
the main machine, pcamd2600 the other; both run Windows XP Pro), the
main process aborts. Here is the error message:
job aborted:
rank: node: exit code[: error message]
0: pcamd3000: 1: Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(173).............................: MPI_Send(buf=00B458B0, count=1, MPI_INT, dest=1, tag=0, comm=0x84000000) failed
MPIDI_CH3I_Progress(148)..................: handle_sock_op failed
MPIDI_CH3I_Progress_handle_sock_event(497):
MPIDU_Sock_wait(2603).....................: Il nome di rete specificato non
è più disponibile. (errno 64)
1: pcamd2600: 1: process 1 exited without calling finalize
2: pcamd2600: 1
(Note that the message 'Il nome di rete specificato non è più
disponibile.' translates to English as 'The specified network name is no
longer available.')
What am I missing? I have more than one communicator, but I have also used
MPI_Comm_set_errhandler to set their error handlers to
MPI_ERRORS_RETURN. The code is:
...
MPI_Group_incl(worldGroup, nRanks, ranks, &handle.group);
MPI_Comm_create(MPI_COMM_WORLD, handle.group, &handle.comm);
MPI_Comm_set_errhandler(handle.comm, MPI_ERRORS_RETURN);
...
I have also tried MPI_Errhandler_set, but that doesn't help either:
MPI_Errhandler_set(..., MPI_ERRORS_RETURN);
Any suggestions?
Thanks,
- AGPX