[mpich-discuss] Problem with MPI_Bcast
Hisham Adel
hosham2004 at yahoo.com
Wed Dec 15 10:34:40 CST 2010
Hi,
Thanks all for your replies. I found the problem: for some reason that I
don't understand, node10, node11, ... were missing from the "/etc/hosts"
file. I have corrected the file and everything works well now.
Thanks again for your replies.
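
For anyone who hits the same symptom, a check for missing "/etc/hosts" entries could be sketched roughly as follows. This is an illustration only and not something that was run in this thread: the function names are hypothetical, and the list of expected nodes has to be replaced with the names in your own MPI host file.

```python
def hosts_file_names(hosts_text):
    """Return the set of hostnames and aliases found in /etc/hosts-style text."""
    names = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the IP address; the rest are the name and its aliases.
        names.update(fields[1:])
    return names


def missing_from_hosts(hosts_text, expected):
    """Return the expected node names that have no entry, in the order given."""
    listed = hosts_file_names(hosts_text)
    return [name for name in expected if name not in listed]
```

Reading "/etc/hosts" on each node and passing its contents to missing_from_hosts together with the node names from the host file should return an empty list on a correctly configured node. It only inspects the file itself, so names that resolve through DNS or another source are not taken into account.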
Regards,
Hisham
________________________________
From: Gus Correa <gus at ldeo.columbia.edu>
To: Mpich Discuss <mpich-discuss at mcs.anl.gov>
Sent: Tue, December 14, 2010 6:46:56 PM
Subject: Re: [mpich-discuss] Problem with MPI_Bcast
Pavan Balaji wrote:
>
> On Tue, 14 Dec 2010, Hisham Adel wrote:
>> Thanks for your fast reply. The program runs well when I have removed "node10"
>> and increased the number of processes.
>> Now, I don't know where is the problem with "node10". It has the same Linux
>> version, the same configuration and on the same network.
>>
>> Do you have any ideas ?
>
> Unfortunately, with the information we have, it's hard to tell what's wrong
> with the node. Could be a hardware issue; could be a network configuration
> issue. I don't really know.
>
> You might be able to run some tests with just two nodes in the host file
> (node00 and node10) and see what errors it throws. It gets harder to debug with
> 11 nodes. Also try doing an ssh and ping from node00 to node10, and node10 to
> node00.
>
> I'm just randomly throwing out things you can try here. Maybe something will
> stick.
>
> -- Pavan
>
Hi Hisham
Have you tried to reboot node10?
Sometimes it is all it takes.
A quick and dirty way to restore a node to a sane state.
My two cents,
Gus Correa
_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss