<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
</style>
</head>
<body class='hmmessage'>
I only have two nodes.<BR>
<BR>
Node 1 --> name: master --> hostname: wireless<BR>
Node 2 --> name: slave --> hostname: wireless2<BR>
<BR>
To bring up the cluster I use the command "mpdboot".<BR>
<BR>
For example, I can see the IDs of the two nodes. In my job I call, for example, MPI_Comm_rank(...) and get the rank numbers back; however, if I use MPI_Send(...) or MPI_Recv(...), my job aborts and shows an error.<BR>
If I run "mpiexec -l -n 2 hostname", I get:<BR>
0: wireless<BR>
1: wireless2<BR>
<BR>
I don't know whether this answers your question.<BR>
<BR>
Thanks.<BR><BR><BR>
<HR id=stopSpelling>
<BR>
From: thakur@mcs.anl.gov<BR>To: mpich-discuss@mcs.anl.gov<BR>Date: Mon, 2 Feb 2009 15:52:52 -0600<BR>Subject: Re: [mpich-discuss] Fatal error in MPI_Barrier<BR><BR><BR>
<STYLE>
.ExternalClass .EC_hmmessage P
{padding-right:0px;padding-left:0px;padding-bottom:0px;padding-top:0px;}
.ExternalClass BODY.EC_hmmessage
{font-size:10pt;font-family:Verdana;}
</STYLE>
<DIV dir=ltr align=left><SPAN class=EC_388285121-02022009><FONT face=Arial color=#0000ff>The error message "<FONT color=#000000>unable to find the process group structure with id <>" is odd. How exactly did you configure MPICH2? Were you able to set up an MPD ring on the two nodes successfully?</FONT></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=EC_388285121-02022009><FONT face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=EC_388285121-02022009><FONT face=Arial>Rajeev</FONT></SPAN></DIV><BR>
<BLOCKQUOTE style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=EC_OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR>
<FONT face=Tahoma><B>From:</B> mpich-discuss-bounces@mcs.anl.gov [mailto:mpich-discuss-bounces@mcs.anl.gov] <B>On Behalf Of </B>Antonio José Gallardo Díaz<BR><B>Sent:</B> Monday, February 02, 2009 12:39 PM<BR><B>To:</B> mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> Re: [mpich-discuss] Fatal error in MPI_Barrier<BR></FONT><BR></DIV>
<DIV></DIV>Hello. I tried running the command:<BR><BR>mpiexec -recvtimeout 30 -n 2 /home/mpi/mpich2-1.0.8/examples/cpi <BR><BR>and this is the result.<BR><BR>/********************************************************************************************************************************************************/ <BR>Process 0 of 2 is on wireless <BR>Process 1 of 2 is on wireless2 <BR>Fatal error in MPI_Bcast: Other MPI error, error stack: <BR>MPI_Bcast(786)............................: MPI_Bcast(buf=0x7ffff732586c, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed <BR>MPIR_Bcast(230)...........................: <BR>MPIC_Send(39).............................: <BR>MPIC_Wait(270)............................: <BR>MPIDI_CH3i_Progress_wait(215).............: an error occurred while handling an event returned by MPIDU_Sock_Wait() <BR>MPIDI_CH3I_Progress_handle_sock_event(420):<BR>MPIDU_Socki_handle_read(637)..............: connection failure (set=0,sock=1,errno=104:Connection reset by peer)[cli_0]: aborting job:<BR>Fatal error in MPI_Bcast: Other MPI error, error stack:<BR>MPI_Bcast(786)............................: MPI_Bcast(buf=0x7ffff732586c, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed<BR>MPIR_Bcast(230)...........................:<BR>MPIC_Send(39).............................:<BR>MPIC_Wait(270)............................:<BR>MPIDI_CH3i_Progress_wait(215).............: an error occurred while handling an event returned by MPFatal error in MPI_Bcast: Other MPI error, error stack:<BR>MPI_Bcast(786)...............................: MPI_Bcast(buf=0xbf82bec8, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed<BR>MPIR_Bcast(198)..............................:<BR>MPIC_Recv(81)................................:<BR>MPIC_Wait(270)...............................:<BR>MPIDI_CH3i_Progress_wait(215)................: an error occurred while handling an event returned by 
MPIDU_Sock_Wait()<BR>MPIDI_CH3I_Progress_handle_sock_event(640)...:<BR>MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the process group structure with id <>[cli_1]: aborting job:<BR>Fatal error in MPI_Bcast: Other MPI error, error stack:<BR>MPI_Bcast(786)...............................: MPI_Bcast(buf=0xbf82bec8, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed<BR>MPIR_Bcast(198)..............................:<BR>MPIC_Recv(81)................................:<BR>MPIC_Wait(270)...............................:<BR>MPIDI_CH3i_Progress_wait(215)................: an error occurred while handling an event rIDU_Sock_Wait()<BR>MPIDI_CH3I_Progress_handle_sock_event(420):<BR>MPIDU_Socki_handle_read(637)..............: connection failure (set=0,sock=1,errno=104:Connection reset by peer)<BR>eturned by MPIDU_Sock_Wait()<BR>MPIDI_CH3I_Progress_handle_sock_event(640)...:<BR>MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the process group structure with id <><BR>rank 1 in job 21 wireless_47695 caused collective abort of all ranks<BR> exit status of rank 1: return code 1<BR>rank 0 in job 21 wireless_47695 caused collective abort of all ranks<BR> exit status of rank 0: return code 1<BR><BR>/********************************************************************************************************************************************************/<BR><BR>mpdcheck reported a problem with the first IP address, but that is now solved.<BR>I tested:<BR><BR>mpdcheck -s on one node and mpdcheck -c "name" "number" on the other --> works.<BR>mpiexec -n 1 /bin/hostname --> works.<BR>mpiexec -l -n 4 /bin/hostname --> works.<BR><BR>I should add that I have to pass the option -recvtimeout 30 with every command, because otherwise I have problems. 
Without this option, it reports:<BR><BR>mpiexec_wireless (mpiexec 392): no msg recvd from mpd when expecting ack of request<BR><BR><BR>What can I do? Please help, and sorry for my poor English.<BR><BR><BR><BR>
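The mpdcheck -s / mpdcheck -c pair described above is essentially a server/client connectivity test. As a rough illustration only, here is a minimal Python sketch of the same kind of TCP round trip, run over the loopback interface so that it is self-contained. It is not how mpdcheck is implemented, and the names serve_once, round_trip and self_test are hypothetical helpers introduced for this example.<BR>
<PRE>
```python
import socket
import threading


def serve_once(listener):
    """Accept one connection and echo back whatever the client sends."""
    conn, _ = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def round_trip(host, port, payload=b"ping"):
    """Connect to host:port, send payload, and return the echoed reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(payload)
        return client.recv(1024)


def self_test():
    """Run the server and the client in one process over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)  # do not wait forever if no client connects
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    server.start()
    try:
        return round_trip("127.0.0.1", port)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    print(self_test())
```
</PRE>
To try this between two machines, one would run the serve_once part on one node (bound to an address the other node can reach) and call round_trip from the other node with that node's hostname and port. If the hostname-based connection fails while the IP-based one succeeds, name resolution is a likely suspect.<BR><BR>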
<HR id=EC_stopSpelling>
From: ajcampa@hotmail.com<BR>To: mpich-discuss@mcs.anl.gov<BR>Date: Mon, 2 Feb 2009 18:17:39 +0100<BR>Subject: Re: [mpich-discuss] Fatal error in MPI_Barrier<BR><BR>
Well, thanks for your answer. Actually, the name of my PC is "Wireless" and the other PC is "Wireless2"; I use the same user, "mpi", on both PCs. <BR><BR>I will try the mpdcheck utility and then report back.<BR><BR>Thanks for everything.<BR><BR>Greetings from Spain.<BR><BR>
<HR id=EC_EC_stopSpelling>
From: thakur@mcs.anl.gov<BR>To: mpich-discuss@mcs.anl.gov<BR>Date: Mon, 2 Feb 2009 10:55:03 -0600<BR>Subject: Re: [mpich-discuss] Fatal error in MPI_Barrier<BR><BR>
<DIV dir=ltr align=left><SPAN class=EC_EC_EC_811165316-02022009><FONT face=Arial color=#0000ff>Are you really trying to use the wireless network? Looks like that's what is getting used.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=EC_EC_EC_811165316-02022009><FONT face=Arial color=#0000ff></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=EC_EC_EC_811165316-02022009><FONT face=Arial color=#0000ff>You can use the mpdcheck utility to diagnose network configuration problems. See Appendix A.2 of the installation guide.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=EC_EC_EC_811165316-02022009><FONT face=Arial color=#0000ff></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=EC_EC_EC_811165316-02022009><FONT face=Arial color=#0000ff>Rajeev</FONT></SPAN></DIV><BR>
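As a rough illustration of one kind of network configuration problem that a tool like mpdcheck can flag, the sketch below checks whether a hostname resolves to a loopback address (for example 127.0.1.1, which some Debian and Ubuntu installs put in /etc/hosts for the machine's own name). If it does, other nodes may be unable to connect back to it. That this is the cause here is only an assumption, the helper names resolves_to_loopback and check_local_hostname are hypothetical, and this is not mpdcheck's own implementation.<BR>
<PRE>
```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return True if hostname resolves to a loopback IPv4 address."""
    address = socket.gethostbyname(hostname)
    return ipaddress.ip_address(address).is_loopback


def check_local_hostname():
    """Describe how this machine's own hostname resolves."""
    hostname = socket.gethostname()
    try:
        loopback = resolves_to_loopback(hostname)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised on lookup failure
        return hostname + ": does not resolve at all"
    if loopback:
        return hostname + ": resolves to a loopback address"
    return hostname + ": resolves to a non-loopback address"


if __name__ == "__main__":
    print(check_local_hostname())
```
</PRE>
Running this on each node gives a quick first hint; a loopback result suggests checking /etc/hosts before looking further.<BR><BR>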
<BLOCKQUOTE style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px">
<DIV class=EC_EC_EC_OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR>
<FONT face=Tahoma><B>From:</B> mpich-discuss-bounces@mcs.anl.gov [mailto:mpich-discuss-bounces@mcs.anl.gov] <B>On Behalf Of </B>Antonio José Gallardo Díaz<BR><B>Sent:</B> Monday, February 02, 2009 9:49 AM<BR><B>To:</B> mpich-discuss@mcs.anl.gov<BR><B>Subject:</B> [mpich-discuss] Fatal error in MPI_Barrier<BR></FONT><BR></DIV>
<DIV></DIV>Hello, I get this error when I run my jobs that use MPI.<BR><BR><BR>Fatal error in MPI_Barrier: Other MPI error, error stack:<BR>MPI_Barrier(406).............................: MPI_Barrier(MPI_COMM_WORLD) failed<BR>MPIR_Barrier(77).............................:<BR>MPIC_Sendrecv(123)...........................:<BR>MPIC_Wait(270)...............................:<BR>MPIDI_CH3i_Progress_wait(215)................: an error occurred while handling an event returned by MPIDU_Sock_Wait()<BR>MPIDI_CH3I_Progress_handle_sock_event(640)...:<BR>MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the process group structure with id <��oz�>[cli_1]: aborting job:<BR>Fatal error in MPI_Barrier: Other MPI error, error stack:<BR>MPI_Barrier(406).............................: MPI_Barrier(MPI_COMM_WORLD) failed<BR>MPIR_Barrier(77).............................:<BR>MPIC_Sendrecv(123)...........................:<BR>MPIC_Wait(270)...............................:<BR>MPIDI_CH3i_Progress_wait(215)................: an error occurred while handling an event returned by MPIDU_Sock_Wait()<BR>MPIDI_CH3I_Progress_handle_sock_event(640)...:<BR>MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the process group structure with id <��oz�><BR>rank 1 in job 15 wireless_43226 caused collective abort of all ranks<BR> exit status of rank 1: killed by signal 9<BR><BR>I have two PCs running Linux (Kubuntu 8.10), and I built a cluster from these machines. When I run, for example, the command "mpiexec -l -n 2 hostname", everything looks fine, but when I try to send or receive anything I get the same error. I don't know why. Please, I need a hand. Thanks for everything. <BR><BR>
</BLOCKQUOTE><BR>
</BLOCKQUOTE></body>
</html>