[MPICH] Problems with MPI_Iprobe, MPI_Recv

Wenhao Xu xuwh06 at mails.tsinghua.edu.cn
Mon Sep 24 22:06:30 CDT 2007


Hi all, 

In my program, I have the following loop:
    while (1) {
        ....
        /* Non-blocking check for a pending completion message from any worker */
        MPI_Iprobe(MPI_ANY_SOURCE, MSG_FROM_WORKER, MPI_COMM_WORLD,
                   &flag, &status);
        if (flag) {
            /* A message is pending: receive it from the probed source */
            MPI_Recv(&msg_from_worker, sizeof(msg_worker_completion_t),
                     MPI_BYTE, status.MPI_SOURCE, MSG_FROM_WORKER,
                     MPI_COMM_WORLD, &status);
            handle_completion_msg(&msg_from_worker, &computing_list,
                                  &idle_queue, &waiting_list, &ready_queue);
        } else {
            /* do something */

            /* Then block until any worker sends a completion message */
            MPI_Recv(&msg_from_worker, sizeof(msg_worker_completion_t),
                     MPI_BYTE, MPI_ANY_SOURCE, MSG_FROM_WORKER,
                     MPI_COMM_WORLD, &status);
            handle_completion_msg(&msg_from_worker, &computing_list,
                                  &idle_queue, &waiting_list, &ready_queue);
        }
    }

I get the following message when I run it with the command: mpiexec -n 14 ./a.out

fatal error in MPI_Recv: Internal MPI error!, error stack:
MPI_Recv(186).............................: MPI_Recv(buf=0xbff15eac,
count=24, MPI_BYTE, src=MPI_ANY_SOURCE, tag=36, MPI_COMM_WORLD,
status=0xbff15ec8) failed
MPIDI_CH3_Progress_wait(212)..............: an error occurred while handling
an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(637): 
MPIDI_CH3_Sockconn_handle_conn_event(809).: [ch3:sock] received packet of
unknown type (0)

But when I run the program with the command: mpiexec -n 9 ./a.out, I get a
different message:
*** glibc detected *** ./checker: free(): invalid size: 0x09f7a8f8 ***
======= Backtrace: =========
/lib/libc.so.6[0x4c630a68]
/lib/libc.so.6[0x4c6317f5]
/lib/libc.so.6(malloc+0x73)[0x4c6327f4]
./a.out[0x806d515]
./ a.out [0x806f15a]
./ a.out [0x80602e8]
./ a.out [0x8097b33]
./ a.out [0x8067498]
./ a.out [0x806a987]
./ a.out [0x805da09]
./ a.out [0x804cd56]
./ a.out [0x804d4b1]
./ a.out [0x804d86f]
/lib/libc.so.6(__libc_start_main+0xdc)[0x4c5e24e4]

Finally, when I run the program with the command: mpiexec -n 7 ./a.out,
it runs normally and no error occurs.

Why do I get different errors with different numbers of processes?
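
A possible check to help narrow this down (a sketch only, not verified
against this program): since MPI_Iprobe fills in a status, MPI_Get_count on
that status gives the size of the pending message, which can be compared with
the receive buffer size before calling MPI_Recv. The struct layout and the
helper name completion_size_ok below are hypothetical stand-ins, because the
real msg_worker_completion_t is not shown here. The MPI calls appear only in
the comment, so the helper itself compiles without MPI.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in: the real msg_worker_completion_t is not shown in
 * this post. Only its size matters for the check below. */
typedef struct {
    int worker_rank;
    int task_id;
    int result_code;
} msg_worker_completion_t;

/* Returns 1 when a probed message of nbytes bytes exactly fills one
 * msg_worker_completion_t, and 0 otherwise (including negative counts). */
int completion_size_ok(int nbytes)
{
    if (nbytes < 0 || (size_t)nbytes != sizeof(msg_worker_completion_t)) {
        fprintf(stderr, "unexpected message size: got %d bytes, expected %zu\n",
                nbytes, sizeof(msg_worker_completion_t));
        return 0;
    }
    return 1;
}

/* Intended use inside the loop, after MPI_Iprobe has set flag != 0:
 *
 *     int nbytes;
 *     MPI_Get_count(&status, MPI_BYTE, &nbytes);
 *     if (!completion_size_ok(nbytes)) {
 *         MPI_Abort(MPI_COMM_WORLD, 1);
 *     }
 */
```

This would only rule out a size mismatch between sender and receiver. It
would not catch heap corruption elsewhere in the program, which the
"free(): invalid size" message may point to.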

Thanks in advance!

Best,
Peter



