[mpich-discuss] problem with running mpd on different nodes

Vlad Cojocaru Vlad.Cojocaru at eml-r.villa-bosch.de
Thu Jul 17 05:08:55 CDT 2008


Dear MPICH2 users,

Yesterday I compiled MPICH2 1.0.7 on a machine called node-06-01
(64-bit Opteron). My MPI application runs properly on this machine and
on another one, node-06-02. However, when I moved to a different node,
node-05-02, and started mpd there (mpd &), mpdtrace no longer returned
the name of the host (as it does on node-06-01 and node-06-02). Instead
I get errors like the ones below.

node-05-02 is a similar machine. My ~/mpd.conf file is visible from all
nodes.
Looking at the mpdtrace Python script, I noticed that on this machine
the msg{} at line 57 is empty, while on both node-06-01 and node-06-02
it is not. The problem appears to be located somewhere in the
recv_dict_msg function. However, I am not very good with Python, so I
was not able to track it down.

Does anybody have any idea how to solve this?

Thanks
vlad


----------------error-----------------------
Alarm clock
node-05-02_42768 (mpd_sockpair 226): connect 110 Connection timed out
node-05-02_42768 (mpd_sockpair 233): connect error with 110 Connection timed out
node-05-02_42768 (mpd_sockpair 244): connect 22 Invalid argument
node-05-02_42768: mpd_uncaught_except_tb handling:
  socket.error: (22, 'Invalid argument')
    /scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py  245  mpd_sockpair
        raise socket.error, errinfo
    /scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py  802  create_single_mem_ring
        self.lhsSock,self.rhsSock = mpd_sockpair()
    /scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py  848  enter_ring
        rhsHandler=rhsHandler)
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  250  run
        rhsHandler=self.handle_rhs_input)
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  1492  ?
        mpd.run()
node-05-02_49229: mpd_uncaught_except_tb handling:
  exceptions.OSError: [Errno 2] No such file or directory: '/tmp/mpd2.console_cojocavd'
    /scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py  1128  __init__
        os.unlink(self.conFilename)
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  237  run
        self.conListenSock = MPDConListenSock(secretword=self.parmdb['MPD_SECRETWORD'])
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  1492  ?
        mpd.run()
node-05-02_43805: mpd_uncaught_except_tb handling:
  exceptions.OSError: [Errno 2] No such file or directory: '/tmp/mpd2.console_cojocavd'
    /scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py  1128  __init__
        os.unlink(self.conFilename)
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  237  run
        self.conListenSock = MPDConListenSock(secretword=self.parmdb['MPD_SECRETWORD'])
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  1492  ?
        mpd.run()
node-05-02_42949: mpd_uncaught_except_tb handling:
  exceptions.OSError: [Errno 2] No such file or directory: '/tmp/mpd2.console_cojocavd'
    /scratch/node-06-01/cojocavd/Software/mpich2-1.0.7-install/bin/mpdlib.py  1128  __init__
        os.unlink(self.conFilename)
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  237  run
        self.conListenSock = MPDConListenSock(secretword=self.parmdb['MPD_SECRETWORD'])
    /scratch/node-06-01/cojocavd/Software/mpich2/bin/mpd  1492  ?
        mpd.run()
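
The first errors in the log are connect timeouts raised inside
mpd_sockpair. A minimal diagnostic sketch, assuming (from the function
name only, not from reading mpdlib.py) that mpd_sockpair connects two
sockets on the same host: it checks on the affected node whether the
machine can resolve its own hostname and open a TCP connection to
itself. The helper names resolve_own_hostname and can_connect_to_self
are hypothetical and not part of mpd; the script is written for
Python 3, whereas mpd 1.0.7 itself is Python 2 code.

```python
import socket


def resolve_own_hostname():
    """Return (hostname, address); address is None if the lookup fails."""
    name = socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except OSError:
        return name, None


def can_connect_to_self(host, timeout=5.0):
    """Listen on an OS-chosen port, then try to connect to it via `host`."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("", 0))  # all interfaces, ephemeral port
        listener.listen(1)
        port = listener.getsockname()[1]
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.settimeout(timeout)
        try:
            client.connect((host, port))
            return True
        except OSError:
            # Covers timeouts, refused connections and unreachable hosts.
            return False
        finally:
            client.close()
    finally:
        listener.close()


if __name__ == "__main__":
    name, addr = resolve_own_hostname()
    print("hostname:", name, "resolves to:", addr)
    print("connect via loopback ok:", can_connect_to_self("127.0.0.1"))
    if addr is not None:
        print("connect via own address ok:", can_connect_to_self(addr))
```

If the loopback check succeeds but the check via the node's own address
fails or times out, that would point toward name resolution or
firewall settings on node-05-02 rather than toward mpd itself. This is
only a hint, not a confirmed diagnosis.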


-- 
----------------------------------------------------------------------------
Dr. Vlad Cojocaru

EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg

Tel: ++49-6221-533266
Fax: ++49-6221-533298

e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de

http://projects.villa-bosch.de/mcm/people/cojocaru/

----------------------------------------------------------------------------
EML Research gGmbH
Amtgericht Mannheim / HRB 337446
Managing Partner: Dr. h.c. Klaus Tschira
Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
http://www.eml-r.org
----------------------------------------------------------------------------




