[mpich-discuss] Persistent communications between 2 computers

Thierry Roudier thierry.roudier at kiastek.com
Thu Sep 8 01:46:10 CDT 2011


Hi all,

I want to use persistent communications to exchange data. It works 
well on a single computer (mpiexec -n 2 teststart). But when I try to 
distribute the code across 2 computers (Win7) with the command line 
mpiexec -hosts 2 localhost 192.168.1.12 teststart, it fails and I get 
the following message:
> Fatal error in PMPI_Waitall: Other MPI error, error stack:
> PMPI_Waitall(274)....................: MPI_Waitall(count=2, 
> req_array=004208B0, status_array=00420880) failed
> MPIR_Waitall_impl(121)...............:
> MPIDI_CH3I_Progress(402).............:
> MPID_nem_mpich2_blocking_recv(905)...:
> MPID_nem_newtcp_module_poll(37)......:
> MPID_nem_newtcp_module_connpoll(2669):
> gen_cnting_fail_handler(1738)........: connect failed - The semaphore 
> timeout period has expired.
>  (errno 121)
>
> ****** Persistent Communications *****
> Trials=             1
> Reps/trial=      1000
> Message Size   Bandwidth (bytes/sec)
> Fatal error in PMPI_Waitall: Other MPI error, error stack:
> PMPI_Waitall(274)....................: MPI_Waitall(count=2, 
> req_array=004208B0, status_array=00420880) failed
> MPIR_Waitall_impl(121)...............:
> MPIDI_CH3I_Progress(402).............:
> MPID_nem_mpich2_blocking_recv(905)...:
> MPID_nem_newtcp_module_poll(37)......:
> MPID_nem_newtcp_module_connpoll(2655):
> gen_read_fail_handler(1145)..........: read from socket failed - Le 
> nom réseau spécifié n'est plus disponible. [The specified network 
> name is no longer available.]
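
To summarize, the two launch commands are:

    mpiexec -n 2 teststart                               (single machine: works)
    mpiexec -hosts 2 localhost 192.168.1.12 teststart    (two machines: fails)

Both stack traces bottom out in the TCP module (gen_cnting_fail_handler: 
connect failed; gen_read_fail_handler: read from socket failed), so the 
MPI_Waitall failure seems to be a socket-level connection problem 
between the two machines rather than anything in the persistent-request 
calls themselves.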

And below is the code I use.

Thanks a lot for your help.

-Thierry

> #include <mpi.h>
> #include <stdio.h>
>
> /* Modify these to change timing scenario */
> #define TRIALS          1
> #define STEPS           1
> #define MAX_MSGSIZE     1048576    /* 2^20; must be >= 2^STEPS */
> #define REPS            1000
> #define MAXPOINTS       10000
>
> int    numtasks, rank, tag=999, n, i, j, k, msgsizes[MAXPOINTS];
> double  mbytes, tbytes, results[MAXPOINTS], ttime, t1, t2;
> char   sbuff[MAX_MSGSIZE], rbuff[MAX_MSGSIZE];
> MPI_Status stats[2];
> MPI_Request reqs[2];
>
> int main(int argc, char *argv[]) {
>
> MPI_Init(&argc,&argv);
> MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
> MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
> /**************************** task 0 ***********************************/
> if (rank == 0) {
>
>   /* Initializations */
>   n=1;
>   for (i=0; i<=STEPS; i++) {
>     msgsizes[i] = n;
>     results[i] = 0.0;
>     n=n*2;
>     }
>   for (i=0; i<MAX_MSGSIZE; i++)
>     sbuff[i] = 'x';
>
>   /* Greetings */
>   printf("\n****** Persistent Communications *****\n");
>   printf("Trials=      %8d\n",TRIALS);
>   printf("Reps/trial=  %8d\n",REPS);
>   printf("Message Size   Bandwidth (bytes/sec)\n");
>
>   /* Begin timings */
>   for (k=0; k<TRIALS; k++) {
>
>     n=1;
>     for (j=0; j<=STEPS; j++) {
>
>       /* Setup persistent requests for both the send and receive */
>       MPI_Recv_init (rbuff, n, MPI_CHAR, 1, tag, MPI_COMM_WORLD, &reqs[0]);
>       MPI_Send_init (sbuff, n, MPI_CHAR, 1, tag, MPI_COMM_WORLD, &reqs[1]);
>
>       t1 = MPI_Wtime();
>       for (i=1; i<=REPS; i++){
>         MPI_Startall (2, reqs);
>         MPI_Waitall (2, reqs, stats);
>         }
>       t2 = MPI_Wtime();
>
>       /* Compute bandwidth and save best result over all TRIALS */
>       ttime = t2 - t1;
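>       /* Factor 2.0: each rep moves n bytes in each direction (send and receive). */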
>       tbytes = sizeof(char) * n * 2.0 * (float)REPS;
>       mbytes = tbytes/ttime;
>       if (results[j] < mbytes)
>          results[j] = mbytes;
>
>       /* Free persistent requests */
>       MPI_Request_free (&reqs[0]);
>       MPI_Request_free (&reqs[1]);
>       n=n*2;
>       }   /* end j loop */
>     }     /* end k loop */
>
>   /* Print results */
>   for (j=0; j<=STEPS; j++) {
>     printf("%9d %16d\n", msgsizes[j], (int)results[j]);
>     }
>
>   }       /* end of task 0 */
>
>
>
> /**************************** task 1 ***********************************/
> if (rank == 1) {
>
>   /* Begin timing tests */
>   for (k=0; k<TRIALS; k++) {
>
>     n=1;
>     for (j=0; j<=STEPS; j++) {
>
>       /* Setup persistent requests for both the send and receive */
>       MPI_Recv_init (rbuff, n, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &reqs[0]);
>       MPI_Send_init (sbuff, n, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &reqs[1]);
>
>       for (i=1; i<=REPS; i++){
>         MPI_Startall (2, reqs);
>         MPI_Waitall (2, reqs, stats);
>         }
>
>       /* Free persistent requests */
>       MPI_Request_free (&reqs[0]);
>       MPI_Request_free (&reqs[1]);
>       n=n*2;
>
>       }   /* end j loop */
>     }     /* end k loop */
>   }       /* end task 1 */
>
>
> MPI_Finalize();
>
> }  /* end of main */
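
In case it helps to see the pattern in isolation: the benchmark simply 
creates the persistent requests once, starts and completes them 
repeatedly, then frees them. Below is a stripped-down, self-contained 
sketch of that same sequence (the buffer size, tag, and repetition 
count are arbitrary, and it assumes exactly 2 ranks):

    #include <mpi.h>

    int main(int argc, char *argv[]) {
        char sbuf[16] = "ping", rbuf[16];
        int rank, peer, i;
        MPI_Request reqs[2];
        MPI_Status stats[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = 1 - rank;    /* the other rank; assumes exactly 2 ranks */

        /* 1. Create the persistent requests once (they start out inactive). */
        MPI_Recv_init(rbuf, 16, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Send_init(sbuf, 16, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &reqs[1]);

        /* 2. Reuse them: start both, then wait for both to complete. */
        for (i = 0; i < 1000; i++) {
            MPI_Startall(2, reqs);
            MPI_Waitall(2, reqs, stats);
        }

        /* 3. Free them once they are no longer needed. */
        MPI_Request_free(&reqs[0]);
        MPI_Request_free(&reqs[1]);
        MPI_Finalize();
        return 0;
    }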


