[mpich-discuss] Are there limits to MPICH2 Jobs?

Attieh, Ibrahim attiehi at aecl.ca
Thu May 8 16:41:08 CDT 2008


Lester,
 
I do not think RAM is the issue, because we are using machines with 16 GB of RAM.
 
It could be some limitation within MCNP or some other parameter.
 
Thank you for your help.
 
Ibrahim

-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov]On Behalf Of Cavey, Lester
Sent: May 8, 2008 11:10 AM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?



Ibrahim,

 

As far as the stack goes, you could buy more RAM, if it is not too expensive and your machine can take the upgrade.  Our test machines could only be upgraded to 4 GB of RAM.  The upgrade wasn't too expensive, and it let us run bigger jobs.  It just depends on what type of machine you have.

 

Lester


  _____  


From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Attieh, Ibrahim
Sent: Thursday, May 08, 2008 10:27 AM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?

 

Thank you Lester and Rajeev for your feedback.

 

We changed the stack size to unlimited.  We were able to run a 4.4 GB job.  However, when we tried to run a 6.7 GB job we got the following error:

 

master sending static commons...

master sending dynamic commons...

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1917782060)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1891740044)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1871790948)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1871396372)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1871396340)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1858966996)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777645156)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777447868)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777250572)

[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777250568)

[cli_0]: aborting job:

Fatal error in MPI_Bcast: Invalid buffer pointer, error stack:

MPI_Bcast(784): MPI_Bcast(buf=(nil), count=1855241236, MPI_BYTE, root=0, comm=0x84000000) failed

MPI_Bcast(735): Null buffer pointer

Terminated

Signal Received, exiting ...
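
A minimal sketch for reading the log above, not a diagnosis. It pulls out the numbers from two of the DOTCOMM lines and from the MPI_Bcast line and compares them with 2**31 - 1, the largest value a C int count can hold. The assumption that the trailing number on each DOTCOMM line is a requested size in bytes is not confirmed anywhere in the thread, and the function name `summarize` is made up for illustration.

```python
import re

# Two DOTCOMM lines and the MPI_Bcast line, copied from the log above.
LOG = """\
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1917782060)
[pe:0] **DOTCOMM Error** DOTCOMMI_PACK (dotcommi_pack.c:104[!(( dotcommp_sbuf.data != ((void *)0) ))]) (alloc(dotcommp_sbuf.data)) (1777250568)
MPI_Bcast(784): MPI_Bcast(buf=(nil), count=1855241236, MPI_BYTE, root=0, comm=0x84000000) failed
"""

INT_MAX = 2**31 - 1  # largest count a C int argument can carry

ALLOC_RE = re.compile(r"\(alloc\(dotcommp_sbuf\.data\)\) \((\d+)\)")
BCAST_RE = re.compile(r"MPI_Bcast\(buf=\((\w+)\), count=(\d+)")


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


def summarize(log):
    """Extract the numbers reported in the log (assumed to be byte counts)."""
    alloc_sizes = [int(m) for m in ALLOC_RE.findall(log)]
    buf, count = BCAST_RE.search(log).groups()
    return {"alloc_sizes": alloc_sizes, "bcast_buf": buf, "bcast_count": int(count)}


if __name__ == "__main__":
    info = summarize(LOG)
    for size in info["alloc_sizes"]:
        print(f"failed alloc: {gib(size):.2f} GiB, fits in int: {size <= INT_MAX}")
    print(f"MPI_Bcast buf={info['bcast_buf']}, count={gib(info['bcast_count']):.2f} GiB")
```

Every value is below 2**31 - 1, and the broadcast was called with a nil buffer, so the log looks more like a failed memory allocation in the application than an overflowed count. That reading is an inference from the log text only.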

 

 

Are there parameters in the environment that we need to change other than the stack limit?

 

Your feedback is appreciated.

 

Ibrahim 

-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov]On Behalf Of Rajeev Thakur
Sent: May 3, 2008 8:49 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?

Nothing in MPICH2 should prevent the job from running on > 2GB memory. The problem is likely to be in the environment running MPICH2, such as the stack size limit Lester mentions.
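
One way to look at the environment mentioned above is to print the per-process resource limits that a job inherits. The sketch below uses Python's standard `resource` module (Unix only) and reports the stack, address-space, and data-segment limits. Which of these, if any, matters for this particular job is not established in the thread, and the names `LIMITS`, `describe`, and `current_limits` are illustrative only.

```python
import resource

# Limits that commonly restrict large-memory jobs, with the matching
# bash "ulimit" flag for reference.
LIMITS = {
    "stack (ulimit -s)": resource.RLIMIT_STACK,
    "address space (ulimit -v)": resource.RLIMIT_AS,
    "data segment (ulimit -d)": resource.RLIMIT_DATA,
}


def describe(value):
    """Render a limit in MiB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**20:.0f} MiB"


def current_limits():
    """Return {name: (soft, hard)} for the limits listed above."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={describe(soft)}, hard={describe(hard)}")
```

Running this on each compute node, in the same way the MPI processes are launched, shows the limits the job actually sees, which can differ from those of an interactive login shell.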

 

Rajeev

 

 


  _____  


From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Cavey, Lester
Sent: Saturday, May 03, 2008 3:33 PM
To: mpich-discuss at mcs.anl.gov
Subject: RE: [mpich-discuss] Are there limits to MPICH2 Jobs?

Ibrahim,

 

I'm not an expert on this, but we used the ulimit command, in our bash rc files, to increase our stack size limit.  Then our 32-bit Linux system was able to iterate through more loops while running our application (a particle-in-cell simulator).  Eventually, though, there was a limit to the number of loops that ran before a "segmentation fault" was reported.  In our case, this meant we had a stack overflow, since there is always going to be a maximum stack size that cannot be exceeded.
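
As a rough sketch of what raising the stack limit amounts to, the snippet below does from inside a process what `ulimit -s` does in a shell: it raises the soft stack limit up to the hard limit. It uses Python's standard `resource` module (Unix only), and the helper name `raise_stack_soft_limit` is hypothetical. A change made this way applies to the calling process and the children it starts, not to the parent shell.

```python
import resource


def raise_stack_soft_limit():
    """Raise the soft stack limit to the hard limit, if they differ.

    Returns the (soft, hard) pair in effect afterwards. An unprivileged
    process may raise its soft limit only as far as the hard limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
        except (ValueError, OSError):
            # Some platforms refuse the change; keep the existing limit.
            pass
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    soft, hard = raise_stack_soft_limit()
    print("soft stack limit:", "unlimited" if soft == resource.RLIM_INFINITY else soft)
    print("hard stack limit:", "unlimited" if hard == resource.RLIM_INFINITY else hard)
```

If the hard limit itself is too low, only an administrator can raise it, which matches the observation above that some maximum always remains.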

 

I'm sure the ANL guys can give you more advice, besides just trying the ulimit command.

 

Lester


  _____  


From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Attieh, Ibrahim
Sent: Saturday, May 03, 2008 12:03 PM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] Are there limits to MPICH2 Jobs?

 

Hello, 

We are trying to run the Monte Carlo code MCNP with MPICH2.  However, we are having problems with jobs bigger than 2 gigabytes. The jobs we want to run are as big as 9 gigabytes.  We are running the jobs on a Linux system with a 64-bit operating system.

Are there any limits to the size of the files that MPICH2 could handle?  Are there any options that we need to set for MPICH2 to handle such large jobs?

Your help would be greatly appreciated. 

Thanks, 
Ibrahim Attieh 

 



CONFIDENTIAL AND PRIVILEGED INFORMATION NOTICE

This e-mail, and any attachments, may contain information that
is confidential, subject to copyright, or exempt from disclosure.
Any unauthorized review, disclosure, retransmission, 
dissemination or other use of or reliance on this information 
may be unlawful and is strictly prohibited.  

AVIS D'INFORMATION CONFIDENTIELLE ET PRIVILÉGIÉE

Le présent courriel, et toute pièce jointe, peut contenir de 
l'information qui est confidentielle, régie par les droits 
d'auteur, ou interdite de divulgation. Tout examen, 
divulgation, retransmission, diffusion ou autres utilisations 
non autorisées de l'information ou dépendance non autorisée 
envers celle-ci peut être illégale et est strictement interdite.

 

 


