[mpich2-dev] Problem with MPI_Bcast

Pavan Balaji balaji at mcs.anl.gov
Tue Dec 14 08:16:22 CST 2010


On Tue, 14 Dec 2010, Hisham Adel wrote:
> Thanks for your fast reply. The program runs well after I removed "node10"
> and increased the number of processes.
> Now I don't know what the problem with "node10" is. It has the same Linux
> version and the same configuration, and it is on the same network.
>
> Do you have any ideas?

Unfortunately, with the information we have, it's hard to tell what's 
wrong with the node. Could be a hardware issue; could be a network 
configuration issue. I don't really know.

You might be able to run some tests with just two nodes in the host file 
(node00 and node10) and see what errors it throws. It gets harder to debug 
with 11 nodes. Also try doing an ssh and ping from node00 to node10, and 
node10 to node00.
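
The checks suggested above can be sketched as a small script, meant to be run on node00. This is only a rough illustration, not part of MPICH2: the file name hosts.two, the helper names, and the --run switch are made up for this example, and it assumes Linux ping (-c), passwordless ssh between the nodes, and a Hydra-style mpiexec that accepts -f for the host file. If you launch with a different process manager, adjust the last command accordingly.

```python
import subprocess
import sys

HEAD = "node00"      # node names taken from this thread
SUSPECT = "node10"
SSH = ["ssh", "-o", "BatchMode=yes"]  # fail instead of prompting for a password


def hostfile_text(hosts):
    """Return host-file contents: one host name per line."""
    return "\n".join(hosts) + "\n"


def connectivity_checks(a, b):
    """Build ping and ssh commands for both directions, all launched from a."""
    return [
        ["ping", "-c", "1", b],             # a -> b, ping
        SSH + [b, "true"],                  # a -> b, ssh
        SSH + [b, "ping", "-c", "1", a],    # b -> a, ping (run on b via ssh)
        SSH + [b] + SSH + [a, "true"],      # b -> a, ssh (run on b via ssh)
    ]


def run_checks(commands, timeout=15):
    """Run each command; return (command string, exit code or None) pairs."""
    results = []
    for cmd in commands:
        try:
            rc = subprocess.run(cmd, capture_output=True, timeout=timeout).returncode
        except (OSError, subprocess.TimeoutExpired):
            rc = None  # command missing or hung
        results.append((" ".join(cmd), rc))
    return results


if __name__ == "__main__" and "--run" in sys.argv:
    # Write the two-node host file, then try every check in turn.
    with open("hosts.two", "w") as f:
        f.write(hostfile_text([HEAD, SUSPECT]))
    for text, rc in run_checks(connectivity_checks(HEAD, SUSPECT)):
        print("ok  " if rc == 0 else "FAIL", text)
    # ./your_program is a placeholder for the failing MPI_Bcast test.
    print("next: mpiexec -f hosts.two -n 2 ./your_program")
```

Any line marked FAIL narrows things down: a failed ping points at the network or name resolution, while a failed ssh with a working ping points at the ssh setup on that node.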

I'm just randomly throwing out things you can try here. Maybe something 
will stick.

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

