[mpich-discuss] I wonder if my mpdboot is the cause ofproblem...help me!

Gra zeus gra_zeus at yahoo.com
Sat Jul 18 10:26:43 CDT 2009


One of them is quad core and the other is dual core. However, the OS, account, password, and install path are all the same. I used this configuration on both machines: "./configure --prefix=/opt/localhomes/myname/mpich2-install".
--- On Sat, 7/18/09, Rajeev Thakur <thakur at mcs.anl.gov> wrote:

From: Rajeev Thakur <thakur at mcs.anl.gov>
Subject: Re: [mpich-discuss] I wonder if my mpdboot is the cause ofproblem...help me!
To: mpich-discuss at mcs.anl.gov
Date: Saturday, July 18, 2009, 7:02 AM



 
 
 
What are the exact parameters you passed to configure when 
building MPICH2? Are the two machines identical?
 
Rajeev


  
  
  From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Gra zeus
  Sent: Saturday, July 18, 2009 12:06 AM
  To: mpich-discuss at mcs.anl.gov
  Subject: [mpich-discuss] I wonder if my mpdboot is the cause ofproblem...help me!


  
  
    
    
        Hello,

        Thanks for the answer yesterday.
        I tested my code on one machine (with "mpiexec -n 2 ./myprog") and
        everything works fine: my program can use MPI_Send and MPI_Recv
        without any problems.

        Today I set up mpich2 on two machines. Both machines can
        communicate with each other, ssh is tested on both machines, mpd
        works, and mpdringtest works.

        However, when I run my program that uses MPI_Send and MPI_Recv,
        MPI_Recv blocks forever.
        So I wrote a new, simple program to test MPI_Send and MPI_Recv,
        like this:
        

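        #include <stdio.h>
        #include <mpi.h>

        int main(int argc, char *argv[])
        {
            int myrank;
            MPI_Status status;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

            if (myrank == 0) {
                /* Rank 0 sends a single int to rank 1. */
                int senddata = 1;
                MPI_Send(&senddata, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            } else if (myrank == 1) {
                /* Rank 1 blocks until the int from rank 0 arrives. */
                int recvdata = 0;
                MPI_Recv(&recvdata, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
                printf("received :%d:\n", recvdata);
            }

            MPI_Finalize();
            return 0;
        }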
        

        

        I got this error:
        

        

        
        Assertion failed in file ch3_progress.c at line 489: pkt->type >= 0 && pkt->type < MPIDI_NEM_PKT_END
        internal ABORT - process 1
        Fatal error in MPI_Finalize: Other MPI error, error stack:
        MPI_Finalize(315)..................: MPI_Finalize failed
        MPI_Finalize(207)..................: 
        MPID_Finalize(92)..................: 
        PMPI_Barrier(476)..................: MPI_Barrier(comm=0x44000002) failed
        MPIR_Barrier(82)...................: 
        MPIC_Sendrecv(164).................: 
        MPIC_Wait(405).....................: 
        MPIDI_CH3I_Progress(150)...........: 
        MPID_nem_mpich2_blocking_recv(1074): 
        MPID_nem_tcp_connpoll(1667)........: 
        state_commrdy_handler(1517)........: 
        MPID_nem_tcp_recv_handler(1413)....: socket closed
        

        ////////////////////////////////////////////////////////////////
        

        I also tried the examples/cpi program that comes with the install
        package. The result is that the example program froze, without any
        errors. (I assume it stopped at MPI_Bcast().)

        Can anyone help me with this?
        This code and my program run smoothly when I use one machine
        (with options such as -n 2, -n 4, etc.), but whenever I start
        mpdboot with two machines, the MPI processes cannot communicate
        with each other via MPI_Send and MPI_Recv.
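
One possible cause worth ruling out for this symptom (an assumption; nothing in this thread confirms it) is hostname resolution: on some Linux installs /etc/hosts maps the machine's own hostname to a 127.x.x.x loopback address, so a process could end up advertising an address that the other machine cannot connect to. The sketch below uses only POSIX calls to check what the local hostname resolves to. The function names is_loopback_ipv4, host_resolves_to_loopback, and report_local_hostname are made up for illustration and are not part of MPICH2.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the dotted-quad IPv4 string lies in 127.0.0.0/8, 0 if it
   does not, and -1 if the string is not a valid IPv4 address. */
int is_loopback_ipv4(const char *dotted)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;
    return ((ntohl(addr.s_addr) >> 24) == 127) ? 1 : 0;
}

/* Resolve `host` and classify its first IPv4 address.
   Returns 1 (loopback), 0 (not loopback), or -1 (could not resolve).
   If `out` is non-NULL, the address text is copied into it. */
int host_resolves_to_loopback(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    struct sockaddr_in *sin;
    char text[INET_ADDRSTRLEN];

    assert(host != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, to keep the sketch short */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0 || res == NULL)
        return -1;

    sin = (struct sockaddr_in *)res->ai_addr;
    if (inet_ntop(AF_INET, &sin->sin_addr, text, sizeof text) == NULL) {
        freeaddrinfo(res);
        return -1;
    }
    freeaddrinfo(res);

    if (out != NULL && outlen > 0)
        snprintf(out, outlen, "%s", text);
    return is_loopback_ipv4(text);
}

/* Look up this machine's own hostname and print how it resolves.
   Returns the same codes as host_resolves_to_loopback(). */
int report_local_hostname(void)
{
    char name[256];
    char ip[INET_ADDRSTRLEN] = "";
    int result;

    if (gethostname(name, sizeof name) != 0)
        return -1;
    name[sizeof name - 1] = '\0';     /* gethostname may not terminate */

    result = host_resolves_to_loopback(name, ip, sizeof ip);
    if (result < 0)
        printf("%s: could not resolve to an IPv4 address\n", name);
    else
        printf("%s -> %s (%s)\n", name, ip,
               result ? "loopback" : "not loopback");
    return result;
}
```

To try it, call report_local_hostname() from a small main() on each of the two machines. If either machine reports a loopback address, compare the /etc/hosts entries on that machine with its real network address before looking further.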
        

        thx,
        gra
        

        

 



      

