[mpich-discuss] Problem with MPI_Bcast
Pavan Balaji
balaji at mcs.anl.gov
Tue Dec 14 08:16:22 CST 2010
On Tue, 14 Dec 2010, Hisham Adel wrote:
> Thanks for your fast reply. The program runs fine after I removed "node10"
> and increased the number of processes.
> Now I don't know what the problem with "node10" is. It has the same Linux
> version and the same configuration, and it is on the same network.
>
> Do you have any ideas?
Unfortunately, with the information we have, it's hard to tell what's
wrong with the node. Could be a hardware issue; could be a network
configuration issue. I don't really know.
You might be able to run some tests with just two nodes in the host file
(node00 and node10) and see what errors it throws. It gets harder to debug
with 11 nodes. Also try ssh and ping in both directions: from node00 to
node10, and from node10 to node00.
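
If it helps, here is a rough Python sketch of those checks. It is only an
illustration, not something taken from MPICH. It assumes passwordless ssh is
set up from the machine you run it on to both nodes, that the nodes have the
usual Linux ping (the -c option), and that your mpiexec is the Hydra one,
which reads a host file with one host name per line. The function names are
made up for this example.

```python
import subprocess


def write_hostfile(path, hosts):
    """Write one host name per line (the layout Hydra's mpiexec -f accepts)."""
    with open(path, "w") as f:
        for host in hosts:
            f.write(host + "\n")


def connectivity_checks(a, b):
    """Return ping and ssh commands for both directions between hosts a and b."""
    checks = []
    for src, dst in ((a, b), (b, a)):
        # Log in to src first so that each test really originates from src.
        # BatchMode=yes makes ssh fail instead of waiting for a password.
        prefix = ["ssh", "-o", "BatchMode=yes", src]
        checks.append(prefix + ["ping", "-c", "3", dst])
        checks.append(prefix + ["ssh", "-o", "BatchMode=yes", dst, "hostname"])
    return checks


def run_checks(a, b, timeout=30):
    """Run every check, print ok/FAIL per command, return the failed commands."""
    failed = []
    for cmd in connectivity_checks(a, b):
        try:
            ok = subprocess.run(cmd, timeout=timeout).returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # A hang or a missing ssh binary counts as a failure too.
            ok = False
        print(("ok   " if ok else "FAIL ") + " ".join(cmd))
        if not ok:
            failed.append(cmd)
    return failed
```

To use it, call write_hostfile("hosts.two", ["node00", "node10"]) and then
run_checks("node00", "node10"). If all four checks pass, run the program on
just those two nodes with something like "mpiexec -f hosts.two -n 2
./your_program". Both hosts.two and ./your_program are placeholder names.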
I'm just randomly throwing out things you can try here. Maybe something
will stick.
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji