[MPICH] stdin buffer overflow problem?

Benjamin Svetitsky bqs at julian.tau.ac.il
Sun Feb 24 09:44:12 CST 2008


Dear developers,

I've been running MPICH2 under RHEL on a cluster of Intel Core 2 Quad 
processors.  The installation went smoothly and it was up and running in 
a few minutes.  We've done a good deal of production in the past month, 
and I want to thank the developers.

Now the problem.  I run an MPI job via a command like

mpiexec -np 4 -host nodeA ../su3_hmc : -np 4 -host nodeB ../su3_hmc \
    < inputfile > outputfile &

If the inputfile is too large, this leads to unpredictable behavior 
ending in a crash.  Typical results are:

1. No processes are ever initiated: mpdlistjobs gives the expected 
report of jobs running on the two hosts, but there are no actual 
processes reported by ps; an output file is created but it is zero bytes 
long.  The only solution is mpdkilljob.

2. Processes are created on one host but not the other; likewise there 
are no results.

3. All the processes are created, and they run for a few minutes and 
then hang up.

4. One or more of the hosts crashes and reboots.

There is no problem if inputfile is short; I run into trouble if the 
file is longer than a few KB, certainly by 67 KB.  I interpret this as a 
buffer overflow -- what, exactly, does mpiexec do with its standard input?

This looks like a serious security problem.  I am running mpd as root, 
with MPD_USE_ROOT_MPD=1, and that, I think, is how a buffer overflow can 
crash the entire node.

A successful workaround is to get the program to read from a file other 
than stdin.  I give the input file as an argument to the program, and 
redirect stdin to /dev/null.  So the following runs successfully:

mpiexec -np 4 -host nodeA ../su3_hmc inputfile : \
    -np 4 -host nodeB ../su3_hmc inputfile < /dev/null > outputfile &
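
For anyone who wants to apply the same workaround, the change on the 
program side can be quite small.  What follows is only a minimal sketch, 
assuming the program is written in C; open_input is a hypothetical helper 
name for illustration and is not taken from the actual su3_hmc source.  
It opens the file named by the first argument when one is given, and 
falls back to stdin otherwise, so the old invocation still works for 
short inputs.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from su3_hmc): choose where to read input from.
 * With a file name as the first argument, open that file for reading;
 * with no argument, fall back to stdin.
 * Returns NULL, after printing a message, if the file cannot be opened. */
FILE *open_input(int argc, char **argv)
{
    if (argc < 2)
        return stdin;               /* old behaviour: read standard input */

    FILE *in = fopen(argv[1], "r");
    if (in == NULL)
        fprintf(stderr, "%s: cannot open %s: %s\n",
                argv[0], argv[1], strerror(errno));
    return in;
}
```

The program would call open_input(argc, argv) once at startup and pass 
the returned stream to whatever routine currently reads from stdin.  
Since every process then opens the file itself, this assumes the file is 
visible under the same path on all hosts.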

I recall seeing the same problem a few years ago with the old MPICH on 
an SGI Origin system.  So it's not a new bug.

Best regards,
		B. Svetitsky

-- 
Prof. Benjamin Svetitsky           Phone:  +972-3-640 8870
School of Physics and Astronomy    Fax:    +972-3-640 7932
Tel Aviv University                E-mail: bqs at julian.tau.ac.il
69978 Tel Aviv, Israel             WWW:    http://julian.tau.ac.il/~bqs