[mpich-discuss] MPI error, error stack

wzlu wzlu at gate.sinica.edu.tw
Wed Jul 2 03:58:28 CDT 2008


Hi, all

I used MPICH2 to run my job and got the following error message.
I tested the cpi example and it ran without any error message.
Is the error caused by the network, or by something else? Thanks a lot.

Best Regards,
Lu

[cli_15]: aborting job:
Fatal error in MPI_Waitall: Other MPI error, error stack:
MPI_Waitall(242)..........................: MPI_Waitall(count=10,
req_array=0x11e9a90, status_array=0x11e9990) failed
MPIDI_CH3_Progress_wait(212)..............: an error occurred while
handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(413):
MPIDU_Socki_handle_read(633)..............: connection failure
(set=0,sock=14,errno=104:Connection reset by peer)

cpu real user sys ratio node
0* 0.40 0.01 0.01 6% in04035.pcf.sinica.edu.tw
total 0.40 0.01 0.01 0.06x

memory local global res size pag flts pag flts voluntary involunt
heap heap (pages) minor major switches switches
0* 3MB 1KB 0 2135 18 854 5
total 3MB 1KB 0 2135 18 854 5

messages send send send recv recv recv copy copy copy
cnt total avg cnt total avg cnt total avg
0* 0 0 B 0 B 0 0 B 0 B 0 0 B 0 B
total 0 0 B 0 B 0 0 B 0 B 0 0 B 0 B
rank 18 in job 1 in04033.pcf.sinica.edu.tw_53415 caused collective abort
of all ranks
exit status of rank 18: killed by signal 8
rank 15 in job 1 in04033.pcf.sinica.edu.tw_53415 caused collective abort
of all ranks
exit status of rank 15: return code 1
[cli_13]: aborting job:
Fatal error in MPI_Waitall: Other MPI error, error stack:
MPI_Waitall(242)..........................: MPI_Waitall(count=6,
req_array=0x11e9a40, status_array=0x11e9990) failed
MPIDI_CH3_Progress_wait(212)..............: an error occurred while
handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(413):
MPIDU_Socki_handle_read(633)..............: connection failure
(set=0,sock=7,errno=104:Connection reset by peer)

cpu real user sys ratio node
0* 0.40 0.01 0.03 9% in04037.pcf.sinica.edu.tw
total 0.40 0.01 0.03 0.09x

memory local global res size pag flts pag flts voluntary involunt
heap heap (pages) minor major switches switches
0* 3MB 1KB 0 2021 19 846 6
total 3MB 1KB 0 2021 19 846 6

messages send send send recv recv recv copy copy copy
cnt total avg cnt total avg cnt total avg
0* 0 0 B 0 B 0 0 B 0 B 0 0 B 0 B
total 0 0 B 0 B 0 0 B 0 B 0 0 B 0 B
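
The two numeric codes in the output above (signal 8 for rank 18, errno 104 in the socket errors) can be decoded with a short script. This is only a minimal sketch, assuming it is run on a Linux node like the ones in the log, since signal and errno numbers are platform-specific. The helper names `describe_signal` and `describe_errno` are made up for illustration and are not part of MPICH.

```python
import errno
import os
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on this platform."""
    return signal.Signals(signum).name


def describe_errno(code: int) -> str:
    """Return the symbolic name and message for an errno value on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # Values copied from the log: "killed by signal 8" and "errno=104".
    print(describe_signal(8))
    print(describe_errno(104))
```

On Linux, signal 8 is SIGFPE and errno 104 is ECONNRESET. One possible reading, not confirmed by the log alone, is that rank 18 died from an arithmetic fault in the application and the "Connection reset by peer" messages on the other ranks are a consequence of that process going away, rather than a network fault.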



