<table cellspacing="0" cellpadding="0" border="0" ><tr><td valign="top" style="font: inherit;">Yes, mpdtrace shows both nodes. I am using version mpich2-1.1 from the download page. My OS is RHEL 5. When I run my program without MPI_Send and MPI_Recv, it works: printf or cout output from both machines appears in my console normally. <br><br>--- On <b>Fri, 7/17/09, Pavan Balaji <i><balaji@mcs.anl.gov></i></b> wrote:<br><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;"><br>From: Pavan Balaji <balaji@mcs.anl.gov><br>Subject: Re: [mpich-discuss] I wonder if my mpdboot is the cause of problem...help me!<br>To: mpich-discuss@mcs.anl.gov<br>Date: Friday, July 17, 2009, 10:12 PM<br><br><div class="plainMail"><br>Does mpdtrace show both nodes? Which version of MPICH2 are you using?<br><br> -- Pavan<br><br>On 07/18/2009 12:05 AM, Gra zeus wrote:<br>> Hello,<br>> <br>> Thanks for your answer yesterday.<br>>
I tested my code on one machine (with "mpiexec -n 2 ./myprog"), and everything worked fine - my program could use MPI_Send and MPI_Recv without any problems.<br>> <br>> Today I set up mpich2 on two machines. The machines can communicate with each other: ssh is tested on both machines, mpd works, and mpdringtest works.<br>> <br>> However, when I run my program, which uses MPI_Send and MPI_Recv, MPI_Recv blocks forever.<br>> So I wrote new simple code to test MPI_Send and MPI_Recv, like this:<br>> <br>> #include "mpi.h"<br>> #include "stdio.h"<br>> <br>> int main(int argc, char **argv)<br>> {<br>> int myrank;<br>> MPI_Status status;<br>> MPI_Init( &argc, &argv );<br>> MPI_Comm_rank( MPI_COMM_WORLD, &myrank );<br>> if (myrank == 0) {<br>> int senddata = 1;<br>> MPI_Send(&senddata, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);<br>> }<br>> else if (myrank == 1) {<br>> int recvdata = 0;<br>> MPI_Recv(&recvdata, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);<br>> printf("received :%d:\n", recvdata);<br>> }<br>> MPI_Finalize();<br>> return 0;<br>> }<br>> <br>> <br>> I got this error:<br>> <br>> <br>> Assertion failed in file ch3_progress.c at line 489: pkt->type >= 0 && pkt->type < MPIDI_NEM_PKT_END<br>> internal ABORT - process 1<br>> Fatal error in MPI_Finalize: Other MPI error, error stack:<br>> MPI_Finalize(315)..................: MPI_Finalize failed<br>> MPI_Finalize(207)..................:<br>> MPID_Finalize(92)..................:<br>> PMPI_Barrier(476)..................: MPI_Barrier(comm=0x44000002) failed<br>> MPIR_Barrier(82)...................:<br>> MPIC_Sendrecv(164).................:<br>> MPIC_Wait(405).....................:<br>> MPIDI_CH3I_Progress(150)...........:<br>> MPID_nem_mpich2_blocking_recv(1074):<br>> MPID_nem_tcp_connpoll(1667)........:<br>> state_commrdy_handler(1517)........:<br>> MPID_nem_tcp_recv_handler(1413)....: socket closed<br>> <br>>
////////////////////////////////////////////////////////////////<br>> <br>> I also tried example/cpi, which comes with the install package; the example program froze without any error message. (I assume it stopped at MPI_Bcast().)<br>> <br>> Can anyone help me with this?<br>> This code and my program run smoothly when I use one machine (with -n 2, -n 4, etc.), but whenever I start mpdboot with two machines, the MPI processes can't communicate with each other via MPI_Send and MPI_Recv.<br>> <br>> thx,<br>> gra<br>> <br>> <br>> <br><br>-- Pavan Balaji<br><a href="http://www.mcs.anl.gov/~balaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br></div></blockquote></td></tr></table><br>
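The two-machine bring-up that the thread describes (mpdboot, mpdtrace, mpdringtest, mpiexec) can be sketched as follows. This is a sketch under assumptions, not the poster's exact session: the hostnames node1/node2 and the mpd.hosts filename are placeholders, and it assumes mpich2-1.1 with the mpd process manager installed on both hosts and passwordless ssh working in both directions.

```shell
# Sketch of a two-machine MPD launch (hypothetical hostnames node1/node2).
# Assumes mpich2-1.1 with mpd on both hosts and passwordless ssh between them.
cat > mpd.hosts <<'EOF'
node1
node2
EOF
mpdboot -n 2 -f mpd.hosts   # start an mpd ring spanning both machines
mpdtrace                    # should print both hostnames
mpdringtest 10              # loop a message around the ring 10 times
mpiexec -n 2 ./myprog       # launch one rank per machine
mpdallexit                  # shut the ring down
```

If mpdtrace lists both hosts but MPI_Send/MPI_Recv still hang or die with "socket closed", the mpd ring itself is healthy and the failure is in the direct TCP connections the ranks open between machines.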