[mpich-discuss] Are there limits to MPICH2 Jobs?
Cavey, Lester
Lester.Cavey at ATK.COM
Thu May 8 10:10:02 CDT 2008
Ibrahim,
As far as the stack goes, you could buy more RAM if it is not too expensive and your machine can take the upgrade. Our test machines could only be upgraded to 4 GB of RAM. The upgrade wasn't too expensive, and it let us run bigger jobs. It just depends on what type of machine you have.
Lester
________________________________
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Attieh, Ibrahim
Sent: Thursday, May 08, 2008 10:27 AM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?
Thank you Lester and Rajeev for your feedback.
We changed the stack size to unlimited and were able to run a 4.4 GB job. However, when we tried to run a 6.7 GB job, we got the following error:
master sending static commons...
master sending dynamic commons...
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1917782060)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1891740044)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1871790948)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1871396372)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1871396340)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1858966996)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777645156)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777447868)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777250572)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777250568)
[cli_0]: aborting job:
Fatal error in MPI_Bcast: Invalid buffer pointer, error stack:
MPI_Bcast(784): MPI_Bcast(buf=(nil), count=1855241236, MPI_BYTE, root=0, comm=0x84000000) failed
MPI_Bcast(735): Null buffer pointer
Terminated
Signal Received, exiting ...
Are there parameters in the environment that we need to change other than the stack limit?
Your feedback is appreciated.
Ibrahim
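[Editorial sketch, not part of the original message.] The log above shows an allocation returning a null pointer before MPI_Bcast is called, so besides the stack limit it may be worth looking at the other per-process limits that can cap allocations. The Python sketch below reads three of them through the standard `resource` module. The helper names `describe_limits` and `allocation_allowed` are hypothetical, and treating the trailing number in the DOTCOMM lines (for example 1917782060) as a requested size in bytes is an assumption. Passing this check is only a necessary condition: an allocation can still fail if physical memory and swap are exhausted.

```python
import resource

# Limits that commonly cap how much memory one process may allocate.
# The bash flags in the labels are the usual ulimit equivalents.
LIMITS = {
    "stack (ulimit -s)": resource.RLIMIT_STACK,
    "data segment (ulimit -d)": resource.RLIMIT_DATA,
    "address space (ulimit -v)": resource.RLIMIT_AS,
}


def describe_limits():
    """Return {label: (soft, hard)} for each limit, in bytes."""
    return {label: resource.getrlimit(which) for label, which in LIMITS.items()}


def allocation_allowed(nbytes, soft_limit):
    """True if a request of nbytes does not exceed the given soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return nbytes <= soft_limit


if __name__ == "__main__":
    request = 1917782060  # assumed to be a byte count taken from the log
    for label, (soft, hard) in describe_limits().items():
        shown = "unlimited" if soft == resource.RLIM_INFINITY else str(soft)
        verdict = "ok" if allocation_allowed(request, soft) else "too small"
        print(f"{label}: soft={shown} -> {verdict}")
```

The limits that matter are those in effect for the MPI processes themselves on each node, which can differ from those of the interactive shell.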
-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Rajeev Thakur
Sent: May 3, 2008 8:49 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?
Nothing in MPICH2 should prevent the job from running with more than 2 GB of memory. The problem is likely in the environment running MPICH2, such as the stack size limit Lester mentions.
Rajeev
________________________________
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Cavey, Lester
Sent: Saturday, May 03, 2008 3:33 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?
Ibrahim,
I'm not an expert on this, but we used the ulimit command in our bash rc files to increase our stack size limit. Our 32-bit Linux system was then able to iterate through more loops while running our application (a particle-in-cell simulator). Eventually, though, there was a limit to the number of loops that ran before a "segmentation fault" was reported. In our case, this meant we had a stack overflow, since there is always a maximum stack size that cannot be exceeded.
I'm sure the ANL guys can give you more advice, besides just trying the ulimit command.
Lester
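[Editorial sketch, not part of the original message.] The stack limit that `ulimit -s` reports in bash can also be inspected from a program. The Python sketch below uses the standard `resource` module to read the limit and, optionally, raise the soft limit to the hard limit. The helper names `format_limit` and `raise_stack_limit` are hypothetical. The change applies only to the calling process and the children it starts afterwards, which is presumably why the limit was set in the shell rc files in the setup described above.

```python
import resource


def format_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def raise_stack_limit():
    """Raise the soft stack limit to the hard limit; return the new pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    # An unprivileged process may raise its soft limit only up to the hard limit.
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("stack soft limit:", format_limit(soft))
    print("stack hard limit:", format_limit(hard))
```

If the hard limit itself is too low, only an administrator can raise it; on many Linux systems that is configured in /etc/security/limits.conf.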
________________________________
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Attieh, Ibrahim
Sent: Saturday, May 03, 2008 12:03 PM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] Are there limits to MPICH2 Jobs?
Hello,
We are trying to run the Monte Carlo code MCNP with MPICH2. However, we are having problems with jobs bigger than 2 gigabytes. The jobs we want to run are as big as 9 gigabytes. We are running the jobs on a Linux system with a 64-bit operating system.
Are there any limits to the size of the files that MPICH2 could handle? Are there any options that we need to set for MPICH2 to handle such large jobs?
Your help would be greatly appreciated.
Thanks,
Ibrahim Attieh
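[Editorial sketch, not part of the original message.] A threshold at 2 gigabytes sits right at the largest value a signed 32-bit integer can hold (2^31 - 1 = 2147483647), so one generic thing to rule out is a 32-bit quantity somewhere in the chain, such as a 32-bit executable running on the 64-bit operating system. Nothing in the thread confirms that this applies to MCNP here. The Python sketch below only illustrates the arithmetic and reports the pointer width of the running interpreter as a stand-in; the helper names are hypothetical. For an actual binary, the `file` command reports whether it is a 32-bit or 64-bit executable.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def pointer_bits():
    """Pointer width, in bits, of the running Python interpreter."""
    return struct.calcsize("P") * 8


def fits_in_int32(nbytes):
    """True if nbytes can be represented as a signed 32-bit integer."""
    return 0 <= nbytes <= INT32_MAX


if __name__ == "__main__":
    print("interpreter pointer width:", pointer_bits(), "bits")
    # Sizes mentioned in the thread, taken as decimal gigabytes (an assumption).
    for label, size in [("2 GB", 2 * 10**9), ("9 GB", 9 * 10**9)]:
        print(label, "fits in signed 32-bit:", fits_in_int32(size))
```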
CONFIDENTIAL AND PRIVILEGED INFORMATION NOTICE
This e-mail, and any attachments, may contain information that
is confidential, subject to copyright, or exempt from disclosure.
Any unauthorized review, disclosure, retransmission,
dissemination or other use of or reliance on this information
may be unlawful and is strictly prohibited.