[MPICH] stdin buffer overflow problem?
    Benjamin Svetitsky 
    bqs at julian.tau.ac.il
       
    Sun Feb 24 09:44:12 CST 2008
    
    
  
Dear developers,
I've been running MPICH2 under RHEL on a cluster of Intel Core 2 Quad 
processors.  The installation went smoothly and it was up and running in 
a few minutes.  We've done a good deal of production in the past month, 
and I want to thank the developers.
Now the problem.  I run an MPI job via a command like
mpiexec -np 4 -host nodeA ../su3_hmc : -np 4 -host nodeB ../su3_hmc < inputfile > outputfile &
If the inputfile is too large, this leads to unpredictable behavior 
ending in a crash.  Typical results are:
1. No processes are ever initiated: mpdlistjobs gives the expected 
report of jobs running on the two hosts, but there are no actual 
processes reported by ps; an output file is created but it is zero bytes 
long.  The only solution is mpdkilljob.
2. Processes are created on one host but not the other; likewise there 
are no results.
3. All the processes are created, run for a few minutes, and then hang.
4. One or more of the hosts crashes and reboots.
There is no problem if inputfile is short; I run into trouble if the 
file is longer than a few K, certainly by 67K.  I interpret this as a 
buffer overflow -- what, exactly, does mpiexec do with its standard input?
This looks like a serious security problem.  I am running mpd as root, 
with MPD_USE_ROOT_MPD=1.  So this, I think, is how a buffer overflow can 
crash the entire node.
A successful workaround is to get the program to read from a file other 
than stdin.  I give the input file as an argument to the program, and 
redirect stdin to /dev/null.  So the following runs successfully:
mpiexec -np 4 -host nodeA ../su3_hmc inputfile : -np 4 -host nodeB ../su3_hmc inputfile < /dev/null > outputfile &
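For illustration, the change inside the program might look roughly like the following.  This is only a minimal sketch, assuming a C program that takes the input file name as its first argument; the helper name open_input is hypothetical and is not taken from su3_hmc or from MPICH.

```c
#include <stdio.h>

/* Hypothetical helper (not part of su3_hmc or MPICH): choose the stream
 * that run parameters are read from.  If a file name was passed as the
 * first command-line argument, open that file; otherwise fall back to
 * standard input, as the program did before the workaround. */
FILE *open_input(int argc, char **argv)
{
    if (argc > 1) {
        /* fopen returns NULL on failure; the caller must check for it. */
        return fopen(argv[1], "r");
    }
    return stdin;
}
```

With something like this, each process opens the file itself rather than relying on mpiexec to forward stdin, so the file presumably has to be visible under the same path on every node.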
I recall seeing the same problem a few years ago with the old MPICH on 
an SGI Origin system.  So it's not a new bug.
Best regards,
		B. Svetitsky
-- 
Prof. Benjamin Svetitsky           Phone:  +972-3-640 8870
School of Physics and Astronomy    Fax:    +972-3-640 7932
Tel Aviv University                E-mail: bqs at julian.tau.ac.il
69978 Tel Aviv, Israel             WWW:    http://julian.tau.ac.il/~bqs
    
    