[MPICH] Program running with mpich2 stops
Rajeev Thakur
thakur at mcs.anl.gov
Mon Sep 24 09:46:35 CDT 2007
It looks like one process died for some reason (such as a segmentation
fault), which caused another process to detect a broken connection and
abort the program.
Rajeev
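One way to find out where the failing process died is to map the program counter (PC) values from the traceback in the quoted message back to source locations. The sketch below is only an illustration, not part of the original reply. It assumes the GNU binutils tool addr2line is installed and that the executable was built with debug information (for example with -g). The helper names frame_addresses and addr2line_command are hypothetical. The script only builds the command line; it does not run it.

```python
import re

# One traceback frame: an image name followed by a 16-digit hexadecimal PC.
FRAME = re.compile(r"^(\S+)\s+([0-9A-Fa-f]{16})\b")


def frame_addresses(traceback_text, image):
    """Return the PC values of the frames that belong to `image`, in order."""
    addresses = []
    for line in traceback_text.splitlines():
        match = FRAME.match(line.strip())
        if match and match.group(1) == image:
            # Drop the zero padding; keep a single 0 if the value is all zeros.
            addresses.append("0x" + (match.group(2).lstrip("0") or "0"))
    return addresses


def addr2line_command(traceback_text, image):
    """Build (but do not run) an addr2line call for the frames of `image`.

    Frames in shared libraries such as libc.so.6 are skipped, because their
    PCs are run-time addresses rather than offsets into the library file.
    """
    return ["addr2line", "-f", "-e", image] + frame_addresses(traceback_text, image)


# Three frames copied from the traceback in the quoted message.
SAMPLE = """\
simulateur 0000000000438804 Unknown Unknown Unknown
libc.so.6 00002AE4FD3C34CA Unknown Unknown Unknown
simulateur 00000000004118AA Unknown Unknown Unknown
"""

print(" ".join(addr2line_command(SAMPLE, "simulateur")))
# prints: addr2line -f -e simulateur 0x438804 0x4118AA
```

Running the printed command in the directory that holds the executable should report a function name and a file:line pair for each address, provided debug information is present. Without it, addr2line prints ?? for the frames it cannot resolve.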
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Yasmine Chebaro
Sent: Monday, September 24, 2007 3:51 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] Program running with mpich2 stops
Hi all,
I am using mpich2 on 64-bit Linux x86_64 machines, running in parallel.
The program is a mix of Fortran 90 and Fortran 77, compiled with the
Intel Fortran Compiler.
After running for a couple of hours under mpich2, the program stops with the
error message below (ignore the first four lines; they are the normal output
of my program):
118 31.0400010000000
119 12.0100000000000
120 16.0000000000000
121 16.0000000000000
place holderplace holder
Image       PC                Routine  Line     Source
simulateur  0000000000E3F3CE  Unknown  Unknown  Unknown
simulateur  0000000000E3E5CA  Unknown  Unknown  Unknown
simulateur  0000000000DF9F62  Unknown  Unknown  Unknown
simulateur  0000000000DC99BA  Unknown  Unknown  Unknown
simulateur  0000000000DC8F29  Unknown  Unknown  Unknown
simulateur  0000000000DDAB2B  Unknown  Unknown  Unknown
simulateur  0000000000438804  Unknown  Unknown  Unknown
simulateur  0000000000457C86  Unknown  Unknown  Unknown
simulateur  000000000045462D  Unknown  Unknown  Unknown
simulateur  0000000000411962  Unknown  Unknown  Unknown
libc.so.6   00002AE4FD3C34CA  Unknown  Unknown  Unknown
simulateur  00000000004118AA  Unknown  Unknown  Unknown
rank 1 in job 1 fargeau_44741 caused collective abort of all ranks
exit status of rank 1: return code 104
I really don't know what to do now. I thought it was due to a stack
overflow with the Intel compiler (which has happened to me before), but even
with the option that solves the stack overflow problem, it doesn't work.
Do you have any idea why this is happening?
Thanks in advance.