[mpich-discuss] mpdboot hangs, but ...

Scott Atchley atchley at myri.com
Mon Dec 14 09:37:05 CST 2009


Dave,

I can reproduce it at will. More info below. I am using two hosts
(shower03 and shower04) with four cores each. I am launching from
shower03. My hosts file is:

% cat hosts.mpd
shower03:4
shower04:4

I am calling mpdboot with:

% mpdboot -n 2 -f hosts.mpd --ncpus=4 --mpd=`which mpd` --rsh=ssh -v
running mpdallexit on shower04
LAUNCHED mpd on shower04  via
RUNNING: mpd on shower04
LAUNCHED mpd on shower03  via  shower04

I have an strace and a gdb backtrace for mpdboot below. It is still hung.
Let me know if you want backtraces from either of the mpds.

Scott



(gdb) info threads
* 1 Thread 0x2aaaaaab7f90 (LWP 23351)  0x0000003d508c5f00 in __read_nocancel () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003d508c5f00 in __read_nocancel () from /lib64/libc.so.6
#1  0x0000003d5086b853 in _IO_file_xsgetn_internal () from /lib64/libc.so.6
#2  0x0000003d50861c82 in fread () from /lib64/libc.so.6
#3  0x0000003d51846507 in ?? () from /usr/lib64/libpython2.4.so.1.0
#4  0x0000003d5189497a in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0
#5  0x0000003d51894426 in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0
#6  0x0000003d51894426 in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0
#7  0x0000003d518958a5 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.4.so.1.0
#8  0x0000003d518958f2 in PyEval_EvalCode () from /usr/lib64/libpython2.4.so.1.0
#9  0x0000003d518b1f29 in ?? () from /usr/lib64/libpython2.4.so.1.0
#10 0x0000003d518b33d8 in PyRun_SimpleFileExFlags () from /usr/lib64/libpython2.4.so.1.0
#11 0x0000003d518b980d in Py_Main () from /usr/lib64/libpython2.4.so.1.0
#12 0x0000003d5081d994 in __libc_start_main () from /lib64/libc.so.6
#13 0x0000000000400629 in _start ()
(gdb)
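
The backtrace above shows the single mpdboot thread blocked in read(),
reached through fread() from the Python interpreter. What it is reading
is not visible in the trace. The sketch below is only an illustration of
that kind of hang, under the assumption that the parent is waiting on a
pipe from a launched process that never writes anything. It is not taken
from the mpdboot source, it uses modern Python rather than 2.4, and the
names read_line_with_timeout and demo_silent_child are hypothetical.

```python
import select
import subprocess
import sys


def read_line_with_timeout(proc, timeout_s):
    """Return one line from proc.stdout, or None if nothing arrives in time.

    A plain proc.stdout.readline() has no timeout: if the child never
    writes and never exits, the parent sits in read() indefinitely.
    """
    # POSIX only: wait until the pipe is readable or the timeout expires.
    ready, _, _ = select.select([proc.stdout], [], [], timeout_s)
    if not ready:
        return None
    return proc.stdout.readline()


def demo_silent_child():
    # Hypothetical stand-in for a launched daemon that never reports back.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        stdout=subprocess.PIPE,
    )
    try:
        return read_line_with_timeout(child, timeout_s=0.5)
    finally:
        # Clean up the stand-in child so the demo does not leave it running.
        child.kill()
        child.wait()
        child.stdout.close()
```

Calling demo_silent_child() returns None after the timeout instead of
blocking, which is the difference from an unguarded read on the pipe.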



-------------- next part --------------
A non-text attachment was scrubbed...
Name: bt.txt.gz
Type: application/x-gzip
Size: 16649 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20091214/aa28d362/attachment-0001.bin>
-------------- next part --------------


