RE: [MPICH] mpich2 problem with MCNP5
Vogt, Bastian
Bastian.Vogt at iket.fzk.de
Mon Aug 1 03:57:32 CDT 2005
Hi David,
In fact, I did not compile the application myself, since MCNP5 comes with a precompiled MPI executable.
But I fixed the problem by patching MCNP5_MPI.EXE with the editbin tool (Microsoft Visual Studio) to set a larger stack size.
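In case anyone else runs into this, the command was something along these lines (the 16 MB reserve size here is just an example; the exact value will depend on the job):

    editbin /STACK:16777216 MCNP5_MPI.EXE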
Now a new problem is showing up: when MPI initializes the slave processes, I get this error message:
master starting 4 tasks with 1 threads each 07/29/05 16:03:38
master sending static commons...
master sending dynamic commons...
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (C:\mcnp\Install_RSICC_1.30\MCNP5\Source\dotcomm\src\internals\mpi\dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (-280776228)
I figured out that this routine in "dotcommi_pack.c" allocates the send buffer, so it is a file from the MCNP installation and not an MPICH file. But anyway, if you have any suggestions for fixing the problem, please send me an e-mail.
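One more observation: the last number in the error message looks like it could be a negative buffer size, so perhaps a 32-bit size computation overflows for a job of this size. Just to illustrate the kind of pattern I mean (my own C sketch with made-up numbers, not the actual dotcomm source):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Hypothetical numbers: 70,000 cells times a large per-cell
           payload exceeds 2^31 and wraps to a negative int. */
        int count = 70000;
        int bytes_per_cell = 40000;
        int nbytes = count * bytes_per_cell;  /* wraps to a negative value */
        void *buf = malloc(nbytes);           /* (size_t)nbytes is huge,
                                                 so malloc returns NULL */
        if (buf == NULL)
            printf("alloc failed, nbytes = %d\n", nbytes);
        free(buf);
        return 0;
    }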
Thanks in advance
bastian
_______________________________________
Bastian Vogt
EnBW Kraftwerke AG
Institut für Kern- und Energietechnik (IKET)
Forschungszentrum Karlsruhe
Phone: +49-(0)7247-82-5047
Fax: +49-(0)7247-82-6323
mailto: bastian.vogt at iket.fzk.de
-----Original Message-----
From: David Ashton [mailto:ashton at mcs.anl.gov]
Sent: Tuesday, July 26, 2005 02:42
To: Vogt, Bastian; mpich-discuss at mcs.anl.gov
Subject: RE: [MPICH] mpich2 problem with MCNP5
Bastian,
The error for MPICH2 states that there is a stack overflow. I suspect this is a problem with the way the Fortran application was compiled. I don't remember the details off the top of my head, but I believe that if you tell the compiler to increase the amount of global memory, so that you can have a large static array for instance, this also reduces the amount of memory available for the thread stacks. The only way to resolve it is to use memory off the heap instead of large static variables, and not to increase the global memory space.
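To illustrate the general idea with a small C sketch (not your actual Fortran code, just the pattern):

    #include <stdlib.h>

    #define N (8 * 1024 * 1024)   /* 8M doubles = 64 MB */

    void on_stack(void)
    {
        double work[N];   /* automatic array lives on the thread stack;
                             it overflows a default-sized stack */
        work[0] = 1.0;
    }

    void on_heap(void)
    {
        double *work = malloc(N * sizeof *work);  /* heap memory is limited
                                                     by RAM, not stack size */
        if (work != NULL) {
            work[0] = 1.0;
            free(work);
        }
    }

    int main(void)
    {
        on_heap();       /* safe */
        /* on_stack();      would trigger the stack overflow */
        return 0;
    }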
I may be way off in my diagnosis. What compiler parameters or #pragmas did you use to compile the application?
-David Ashton
_____
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Vogt, Bastian
Sent: Monday, July 25, 2005 5:27 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] mpich2 problem with MCNP5
Dear colleagues,
I am trying to do some core calculations using the MPI parallel option of MCNP5. For this I have an input file with roughly 70,000 cells.
It runs without any problems on my local machine, using about 600 MB of RAM.
When I try to execute the same job with MPI on one or more machines, I get the following error:
With MPICH:
cp0 = 4.21
forrtl: severe (170): Program Exception - stack overflow
Image PC Routine Line Source
MCNP5mpi.exe 005AAD1A Unknown Unknown Unknown
MCNP5mpi.exe 005A07DB Unknown Unknown Unknown
MCNP5mpi.exe 004DE9AF Unknown Unknown Unknown
MCNP5mpi.exe 004D2714 Unknown Unknown Unknown
MCNP5mpi.exe 005F7699 Unknown Unknown Unknown
MCNP5mpi.exe 005E9EBA Unknown Unknown Unknown
kernel32.dll 7C816D4F Unknown Unknown Unknown
Error 64, process 1, host IKET127154:
GetQueuedCompletionStatus failed for socket 0 connected to host '141.52.127.137'
With MPICH2:
cp0 = 4.23
forrtl: severe (170): Program Exception - stack overflow
Image PC Routine Line Source
MCNP5mpi.exe 005AAD1A Unknown Unknown Unknown
MCNP5mpi.exe 005A07DB Unknown Unknown Unknown
MCNP5mpi.exe 004DE9AF Unknown Unknown Unknown
MCNP5mpi.exe 004D2714 Unknown Unknown Unknown
MCNP5mpi.exe 005F7699 Unknown Unknown Unknown
MCNP5mpi.exe 005E9EBA Unknown Unknown Unknown
kernel32.dll 7C816D4F Unknown Unknown Unknown
I tried running a smaller job on the cluster and it works fine, so it should not be a problem with a wrongly configured MPI. All computers in the cluster have about 1 GB of RAM, so it should not be the RAM either. I don't have any idea how to overcome this problem, so I would be glad of some help!
Does anyone have an idea how to fix this?
Thank you all in advance!
bastian
_______________________________________
Bastian Vogt
EnBW Kraftwerke AG
Institut für Kern- und Energietechnik (IKET)
Forschungszentrum Karlsruhe
Phone: +49-(0)7247-82-5047
Fax: +49-(0)7247-82-6323