[mpich-discuss] Getting runtime exception

Rajnish rajnish99 at gmail.com
Fri Feb 5 07:24:59 CST 2010


First, thank you all for the help you provide here.

I have two Linux SMP nodes, say n1 and n2.

n1 runs Linux 2.1.18 with gcc 4.1.2, and n2 runs Linux 2.1.20 with gcc 4.1.1.
Both are on NFS and ssh-accessible.

I got MPICH2 installed and running. *mpdringtest* with both nodes on the
ring runs fine.

When I schedule tasks on the same node, they run fine.

However, when I schedule tasks across both nodes, with n2 as the master
node, I get the following message on n1:

mpd_uncaught_except_tb handling:
 exceptions.KeyError: 'process_mapping'
   /usr/local/bin/mpd  1354  do_mpdrun
       msg['process_mapping'][lorank] = self.myHost
   /usr/local/bin/mpd  984  handle_lhs_input
       self.do_mpdrun(msg)
   /usr/local/bin/mpdlib.py  780  handle_active_streams
       handler(stream,*args)
   /usr/local/bin/mpd  301  runmainloop
       rv = self.streamHandler.handle_active_streams(timeout=8.0)
   /usr/local/bin/mpd  270  run
       self.runmainloop()
   /usr/local/bin/mpd  1643  ?
       mpd.run()
n1-wulf.myhost.org_mpdman_1 (run 287): invalid msg from lhs; expecting
ringsize got: {}
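
As far as I can tell, the KeyError itself is the ordinary Python error raised when a dict is indexed with a key it does not contain. Below is a minimal sketch of that behaviour, not the actual mpd code: the message dicts, the 'cmd' key and the helper names record_host and try_record are hypothetical, since the real message contents are not shown in the traceback.

```python
# Hypothetical stand-in for the message dict handled in do_mpdrun; the real
# message format is not shown in this post, so these keys are assumptions.
def record_host(msg, lorank, my_host):
    """Mimic the failing line: msg['process_mapping'][lorank] = self.myHost."""
    msg['process_mapping'][lorank] = my_host
    return msg


def try_record(msg, lorank, my_host):
    """Return the name of the missing key, or None if the assignment worked."""
    try:
        record_host(msg, lorank, my_host)
        return None
    except KeyError as exc:
        # KeyError carries the missing key as its first argument.
        return exc.args[0]


# A message that carries the key works; one without it fails the same way
# as the traceback above.
with_key = try_record({'process_mapping': {}}, 0, 'n1')
without_key = try_record({'cmd': 'mpdrun'}, 0, 'n1')
print(with_key, without_key)
```

If this reading is right, the message reaching the mpd on n1 has no 'process_mapping' entry at all, but I do not know why that would be.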


After doing *mpdallexit*, n2 shows the following message:

mpiexec_n2-wulf.myhost.org (mpiexec 377): no msg recvd from mpd when
expecting ack of request
--------------------

My question: do I need exactly the same OS version on both n1 and n2, or the
same gcc version on both, or might I have some other installation problem?

Thanks in advance,
- Rajnish.