[mpich-discuss] connect -2 Name or service not known

David R Perticone perticone at MIT.EDU
Tue Dec 15 14:16:22 CST 2009


Hello-

I have been using mpich2 1.0.3 for some time without difficulty. Recently I
switched to 1.1.1, and we renamed our cluster nodes and changed their IP
addresses. On some nodes I get the "connect -2" error when launching mpd;
however, once the chain is built I almost always get lots of these errors
while running mpiexec, as shown below:


fc6h10_33392 (mpd_sockpair 240): connect -2 Name or service not known
fc6h10_33392 (mpd_sockpair 247): connect error with -2 Name or service not known
fc6h9_33839 (mpd_sockpair 240): connect -2 Name or service not known
fc6h9_33839 (mpd_sockpair 247): connect error with -2 Name or service not known

These nodes and ports clearly exist, as shown by the output of mpdtrace -l:
fc6h10_33392 (148.104.130.184)
fc6h9_33839 (148.104.130.183)
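
Since "Name or service not known" is the text Linux reports for a failed
hostname lookup (getaddrinfo error -2), one thing that could be checked on
each node is whether the new hostnames resolve there. Below is a minimal
sketch of such a check, not something taken from mpd itself. The function
name check_resolution and its resolver parameter are made up for
illustration; the hostnames in the usage note are the ones from the
mpdtrace output above.

```python
import socket


def check_resolution(hosts, resolver=socket.getaddrinfo):
    """Try to resolve each hostname; return {host: addresses or error text}.

    `resolver` is a hypothetical hook (defaults to socket.getaddrinfo) so the
    check can be exercised without touching the real resolver.
    """
    results = {}
    for host in hosts:
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            # A failed lookup lands here (error -2 on Linux).
            results[host] = "FAILED: %s" % exc
    return results
```

Running check_resolution(["fc6h9", "fc6h10"]) on every node in the ring and
comparing the results against 148.104.130.183 and 148.104.130.184 might show
whether some nodes still have stale /etc/hosts or DNS entries from before the
rename.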

Does anyone know what causes this error and how best to debug it? I find
that I have to continually run mpdallexit and rebuild the chain after each
block of runs, which is tedious. Thanks

drp