[MPICH] stdin buffer overflow problem?

Rajeev Thakur thakur at mcs.anl.gov
Mon Feb 25 18:35:11 CST 2008


Yes, the MPD process manager doesn't handle large input files via stdin very
well. In such cases, you will need to read from a file as you have.
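
For anyone wanting to apply the same workaround, the idea can be sketched
as follows. This is only an illustration in Python, not the actual su3_hmc
code: the helper names (open_input, read_parameters, main) and the
"key value" line format are assumptions made for the example. It also
assumes the input file is visible under the same path on every node, since
each process opens it for itself.

```python
import sys


def open_input(argv):
    """Return the file named in argv[1] if present, otherwise stdin."""
    if len(argv) > 1:
        return open(argv[1], "r")
    return sys.stdin


def read_parameters(stream):
    """Parse 'key value' lines (a hypothetical format for this sketch).

    Blank lines and lines starting with '#' are skipped.
    """
    params = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        params[parts[0]] = parts[1] if len(parts) > 1 else ""
    return params


def main(argv):
    """Read parameters from the file argument, falling back to stdin."""
    stream = open_input(argv)
    try:
        return read_parameters(stream)
    finally:
        # Close only what we opened; leave stdin alone.
        if stream is not sys.stdin:
            stream.close()
```

A program built this way would call main(sys.argv) at startup and be
launched with the input file as an argument and stdin redirected from
/dev/null, so the process manager never has to forward the file contents.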

Rajeev 

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of 
> Benjamin Svetitsky
> Sent: Sunday, February 24, 2008 9:44 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] stdin buffer overflow problem?
> 
> Dear developers,
> 
> I've been running MPICH2 under RHEL on a cluster of Intel 
> Core 2 Quad processors.  The installation went smoothly and 
> it was up and running in a few minutes.  We've done a good 
> deal of production in the past month, and I want to thank the 
> developers.
> 
> Now the problem.  I run an MPI job via a command like
> 
> mpiexec -np 4 -host nodeA ../su3_hmc : -np 4 -host nodeB ../su3_hmc < inputfile > outputfile &
> 
> If the inputfile is too large, this leads to unpredictable 
> behavior ending in a crash.  Typical results are:
> 
> 1. No processes are ever initiated: mpdlistjobs gives the 
> expected report of jobs running on the two hosts, but there 
> are no actual processes reported by ps; an output file is 
> created but it is zero bytes long.  The only solution is mpdkilljob.
> 
> 2. Processes are created on one host but not the other; 
> likewise there are no results.
> 
> 3. All the processes are created, and they run for a few 
> minutes and then hang up.
> 
> 4. One or more of the hosts crashes and reboots.
> 
> There is no problem if inputfile is short; I run into trouble 
> if the file is longer than a few K, certainly by 67K.  I 
> interpret this as a buffer overflow -- what, exactly, does 
> mpiexec do with its standard input?
> 
> This looks like a serious security problem.  I am running mpd 
> as root, with MPD_USE_ROOT_MPD=1.  So this, I think, is how a 
> buffer overflow can crash the entire node.
> 
> A successful workaround is to get the program to read from a 
> file other than stdin.  I give the input file as an argument 
> to the program, and redirect stdin to /dev/null.  So the 
> following runs successfully:
> 
> mpiexec -np 4 -host nodeA ../su3_hmc inputfile : -np 4 -host nodeB ../su3_hmc inputfile < /dev/null > outputfile &
> 
> I recall seeing the same problem a few years ago with the old 
> MPICH on an SGI Origin system.  So it's not a new bug.
> 
> Best regards,
> 		B. Svetitsky
> 
> -- 
> Prof. Benjamin Svetitsky           Phone:  +972-3-640 8870
> School of Physics and Astronomy    Fax:    +972-3-640 7932
> Tel Aviv University                E-mail: bqs at julian.tau.ac.il
> 69978 Tel Aviv, Israel             WWW:    http://julian.tau.ac.il/~bqs
> 
> 



