[mpich-discuss] I wonder if my mpdboot is the cause of problem...help me!
Gra zeus
gra_zeus at yahoo.com
Sat Jul 18 05:57:17 CDT 2009
FYI, I use C++ and mpicxx to compile the files. I also tried writing my code in C and compiling it with mpicc, and got the same result.
--- On Sat, 7/18/09, Gra zeus <gra_zeus at yahoo.com> wrote:
From: Gra zeus <gra_zeus at yahoo.com>
Subject: Re: [mpich-discuss] I wonder if my mpdboot is the cause of problem...help me!
To: mpich-discuss at mcs.anl.gov
Date: Saturday, July 18, 2009, 3:31 AM
Yes, mpdtrace shows both nodes. I use version mpich2-1.1 from the download page. My OS is RHEL 5. When I run my program without MPI_Send and MPI_Recv, it works: printf or cout output from both machines appears in my console normally.
--- On Fri, 7/17/09, Pavan Balaji <balaji at mcs.anl.gov> wrote:
From: Pavan Balaji <balaji at mcs.anl.gov>
Subject: Re: [mpich-discuss] I wonder if my mpdboot is the cause of problem...help me!
To: mpich-discuss at mcs.anl.gov
Date: Friday, July 17, 2009, 10:12 PM
Does mpdtrace show both nodes? Which version of MPICH2 are you using?
-- Pavan
On 07/18/2009 12:05 AM, Gra zeus wrote:
> hello,
>
> thx for answer yesterday.
>
> I tested my code on one machine (with "mpiexec -n 2 ./myprog") and everything works fine - my program can use MPI_Send and MPI_Recv without any problems.
>
> Today, I set up mpich2 on two machines. Both machines can communicate with each other, ssh is tested on both machines, mpd works, and mpdringtest works.
>
> However, when I run my program that uses MPI_Send and MPI_Recv, MPI_Recv blocks forever.
> So I wrote a new simple program to test MPI_Send and MPI_Recv, like this:
>
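> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>     int myrank;
>     MPI_Status status;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>     if (myrank == 0) {
>         /* rank 0 sends a single int to rank 1 with tag 0 */
>         int senddata = 1;
>         MPI_Send(&senddata, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
>     }
>     else if (myrank == 1) {
>         /* rank 1 blocks until the matching message from rank 0 arrives */
>         int recvdata = 0;
>         MPI_Recv(&recvdata, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
>         printf("received :%d:\n", recvdata);
>     }
>     MPI_Finalize();
>     return 0;
> }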
>
>
> I got this error:
>
>
> Assertion failed in file ch3_progress.c at line 489: pkt->type >= 0 && pkt->type < MPIDI_NEM_PKT_END
> internal ABORT - process 1
> Fatal error in MPI_Finalize: Other MPI error, error stack:
> MPI_Finalize(315)..................: MPI_Finalize failed
> MPI_Finalize(207)..................:
> MPID_Finalize(92)..................:
> PMPI_Barrier(476)..................: MPI_Barrier(comm=0x44000002) failed
> MPIR_Barrier(82)...................:
> MPIC_Sendrecv(164).................:
> MPIC_Wait(405).....................:
> MPIDI_CH3I_Progress(150)...........:
> MPID_nem_mpich2_blocking_recv(1074):
> MPID_nem_tcp_connpoll(1667)........:
> state_commrdy_handler(1517)........:
> MPID_nem_tcp_recv_handler(1413)....: socket closed
>
>
> ////////////////////////////////////////////////////////////////
>
> I also tried example/cpi that comes with the install package. The result: the example program froze, without any errors. (I assume it stopped at MPI_Bcast().)
>
> Can anyone help me with this?
> This code and my program run smoothly when I use 1 machine (with options -n 2, -n 4, etc.), but whenever I start mpdboot with 2 machines, the MPI processes can't communicate with each other via MPI_Send and MPI_Recv.
>
> thx,
> gra
>
>
>
-- Pavan Balaji
http://www.mcs.anl.gov/~balaji
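
The thread does not identify a cause for the cross-machine failure. One thing that may be worth ruling out when MPI_Send/MPI_Recv work on a single machine but not between two is whether each machine's own hostname resolves to a loopback address (for example through an /etc/hosts entry), since the remote process would then be given an address it cannot reach. That is an assumption about this setup, not something the messages above confirm. The sketch below uses only standard POSIX calls; the helper names is_loopback_ipv4 and name_resolves_to_loopback are hypothetical and are not part of MPICH2.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 if an IPv4 address (in host byte order) lies in 127.0.0.0/8. */
int is_loopback_ipv4(uint32_t addr_host_order)
{
    return (addr_host_order >> 24) == 127;
}

/*
 * Hypothetical helper, not an MPICH2 function: resolves `name` and reports
 * whether any IPv4 address it maps to is a loopback address.
 * Returns 1 if so, 0 if not, and -1 if the name cannot be resolved.
 */
int name_resolves_to_loopback(const char *name)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    struct addrinfo *p;
    int found = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only, to keep the check simple */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;

    /* Walk every address returned for the name. */
    for (p = res; p != NULL; p = p->ai_next) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)p->ai_addr;
        if (is_loopback_ipv4(ntohl(sin->sin_addr.s_addr)))
            found = 1;
    }

    freeaddrinfo(res);
    return found;
}
```

On each machine, the name returned by gethostname() can be passed to name_resolves_to_loopback() from a small main(). A result of 1 on either host would suggest checking its name resolution before looking further at mpdboot.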