[MPICH] mpich2 problem with MCNP5

Vogt, Bastian Bastian.Vogt at iket.fzk.de
Mon Jul 25 06:26:31 CDT 2005


Dear colleagues,
 
I am trying to run some core calculations using the MPI parallel option of MCNP5. My input file has roughly 70,000 cells.
It runs without any problems on my local machine, using about 600 MB of RAM.
When I try to execute the same job with MPI on one or more machines, I get the following error:
 
 
With mpich:
cp0 =   4.21
forrtl: severe (170): Program Exception - stack overflow
 
Image              PC        Routine            Line        Source             
MCNP5mpi.exe       005AAD1A  Unknown               Unknown  Unknown
MCNP5mpi.exe       005A07DB  Unknown               Unknown  Unknown
MCNP5mpi.exe       004DE9AF  Unknown               Unknown  Unknown
MCNP5mpi.exe       004D2714  Unknown               Unknown  Unknown
MCNP5mpi.exe       005F7699  Unknown               Unknown  Unknown
MCNP5mpi.exe       005E9EBA  Unknown               Unknown  Unknown
kernel32.dll       7C816D4F  Unknown               Unknown  Unknown
Error 64, process 1, host IKET127154:
   GetQueuedCompletionStatus failed for socket 0 connected to host '141.52.127.137'
 
 
 
 
With mpich2:
cp0 =   4.23
forrtl: severe (170): Program Exception - stack overflow
 
Image              PC        Routine            Line        Source             
MCNP5mpi.exe       005AAD1A  Unknown               Unknown  Unknown
MCNP5mpi.exe       005A07DB  Unknown               Unknown  Unknown
MCNP5mpi.exe       004DE9AF  Unknown               Unknown  Unknown
MCNP5mpi.exe       004D2714  Unknown               Unknown  Unknown
MCNP5mpi.exe       005F7699  Unknown               Unknown  Unknown
MCNP5mpi.exe       005E9EBA  Unknown               Unknown  Unknown
kernel32.dll       7C816D4F  Unknown               Unknown  Unknown
 
I tried running a smaller job on the cluster and it works fine, so the MPI installation should not be misconfigured. All computers in the cluster have about 1 GB of RAM, so it should not be a lack of RAM either. I have no idea how to overcome this problem, so I would be glad of some help!
 
Does anyone have an idea how to fix this problem?
 
Thank you all in advance!!
bastian
 
_______________________________________
 
Bastian Vogt
 
EnBW Kraftwerke AG 
 
Institut für Kern- und Energietechnik (IKET)
Forschungszentrum Karlsruhe 
Telefon: +49-(0)7247-82-5047
Fax.: +49-(0)7247-82-6323
 
 
mailto: bastian.vogt at iket.fzk.de
 