[mpich-discuss] problem with running mpd on different nodes
Vlad Cojocaru
Vlad.Cojocaru at eml-r.villa-bosch.de
Thu Jul 17 05:08:55 CDT 2008
Dear MPICH2 users,
Yesterday I compiled mpich2 1.0.7 on a machine called node-06-01
(64-bit Opteron). My MPI application runs properly on this machine and
on another one, node-06-02. However, when I moved to a different node,
node-05-02, and started "mpd &", mpdtrace no longer returned the name of
the host (as it does on node-06-01 and node-06-02). Instead I get errors
like the ones below.
node-05-02 is a similar machine. My ~/mpd.conf file is visible from all
nodes.
Looking at the mpdtrace Python script, I noticed that on this machine
msg at line 57 is empty, while on both node-06-01 and node-06-02 it is
not empty. The problem appears to be located somewhere in the
recv_dict_msg function. However, I am not very good with Python, so I
was not able to track the problem down.
Does anybody have any idea how to solve this?
Thanks,
Vlad
----------------error-----------------------
Alarm clock
node-05-02_42768 (mpd_sockpair 226): connect 110 Connection timed out
node-05-02_42768 (mpd_sockpair 233): connect error with 110 Connection
timed out
node-05-02_42768 (mpd_sockpair 244): connect 22 Invalid argument
node-05-02_42768: mpd_uncaught_except_tb handling:
socket.error: (22, 'Invalid argument')
/scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py
245 mpd_sockpair
raise socket.error, errinfo
/scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py
802 create_single_mem_ring
self.lhsSock,self.rhsSock = mpd_sockpair()
/scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py
848 enter_ring
rhsHandler=rhsHandler)
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 250 run
rhsHandler=self.handle_rhs_input)
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 1492 ?
mpd.run()
node-05-02_49229: mpd_uncaught_except_tb handling:
exceptions.OSError: [Errno 2] No such file or directory:
'/tmp/mpd2.console_cojocavd'
/scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py
1128 __init__
os.unlink(self.conFilename)
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 237 run
self.conListenSock =
MPDConListenSock(secretword=self.parmdb['MPD_SECRETWORD'])
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 1492 ?
mpd.run()
node-05-02_43805: mpd_uncaught_except_tb handling:
exceptions.OSError: [Errno 2] No such file or directory:
'/tmp/mpd2.console_cojocavd'
/scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py
1128 __init__
os.unlink(self.conFilename)
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 237 run
self.conListenSock =
MPDConListenSock(secretword=self.parmdb['MPD_SECRETWORD'])
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 1492 ?
mpd.run()
node-05-02_42949: mpd_uncaught_except_tb handling:
exceptions.OSError: [Errno 2] No such file or directory:
'/tmp/mpd2.console_cojocavd'
/scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py
1128 __init__
os.unlink(self.conFilename)
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 237 run
self.conListenSock =
MPDConListenSock(secretword=self.parmdb['MPD_SECRETWORD'])
/scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd 1492 ?
mpd.run()
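
The first failure in the log above is a connect timeout (errno 110) inside
mpd_sockpair. One way to narrow this down outside of mpd is to check whether
the node can open a TCP connection to itself. The sketch below assumes that
mpd_sockpair essentially opens a listening socket on the local machine and
connects back to it; this is an assumption and has not been checked against
mpdlib.py. The function names resolve_local_name and can_connect_to_self are
hypothetical and are not part of mpd. It should run on Python 2.6 or newer.

```python
import socket


def resolve_local_name():
    """Return (hostname, address); address is None if the lookup fails."""
    name = socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except socket.error:
        return name, None


def can_connect_to_self(host, timeout=5.0):
    """Listen on an ephemeral port and try to connect to it through `host`.

    Returns (True, None) on success or (False, exception) on failure.
    This only imitates what mpd_sockpair is assumed to do; it is not mpd code.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("", 0))            # any local interface, free port
        listener.listen(5)
        port = listener.getsockname()[1]
        client.settimeout(timeout)        # fail quickly instead of hanging
        client.connect((host, port))
        return True, None
    except socket.error as err:           # also covers timeouts and lookup errors
        return False, err
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    name, addr = resolve_local_name()
    print("hostname %s resolves to %s" % (name, addr))
    for host in ("127.0.0.1", name):
        ok, err = can_connect_to_self(host)
        print("%s -> %s" % (host, "ok" if ok else "FAILED: %s" % err))
```

If the 127.0.0.1 check succeeds but the check through the hostname fails or
times out on node-05-02 only, that would point to name resolution (for
example /etc/hosts) or firewall settings on that node rather than to mpd
itself.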
--
----------------------------------------------------------------------------
Dr. Vlad Cojocaru
EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg
Tel: ++49-6221-533266
Fax: ++49-6221-533298
e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de
http://projects.villa-bosch.de/mcm/people/cojocaru/
----------------------------------------------------------------------------
EML Research gGmbH
Amtgericht Mannheim / HRB 337446
Managing Partner: Dr. h.c. Klaus Tschira
Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
http://www.eml-r.org
----------------------------------------------------------------------------