[mpich-discuss] Problem sometimes when running on winxp on >=2 processes and MPE_IBCAST

Ben Tay zonexo at gmail.com
Wed May 7 10:25:20 CDT 2008


Hi,

I tried to run an MPI code copied from an example in the RS 6000 
book. It is supposed to broadcast and synchronize all values. When I run 
it on my school's Linux servers, there is no problem. However, when I run 
it on my own WinXP machine with >=2 processes, sometimes it works and 
other times I get one of the following errors:

[01:3216].....ERROR:result command received but the wait_list is empty.
[01:3216]...ERROR:unable to handle the command: "cmd=result src=1 dest=1 tag=7 cmd_tag=3 cmd_orig=dbget ctx_key=1 value="port=1518 description=gotchama-16e5ed ifname=192.168.1.105 " result=DBS_SUCCESS "
[01:3216].ERROR:error closing the unknown context socket: generic socket failure, error stack:
MPIDU_Sock_wait(2603): The I/O operation has been aborted because of either a thread exit or an application request. (errno 995)
[01:3216]..ERROR:sock_op_close returned while unknown context is in state: SMPD_IDLE

Or

[01:3308].....ERROR:result command received but the wait_list is empty.
[01:3308]...ERROR:unable to handle the command: "cmd=result src=1 dest=1 tag=15 cmd_tag=5 cmd_orig=barrier ctx_key=0 result=DBS_SUCCESS "
[01:3308]..ERROR:sock_op_close returned while unknown context is in state: SMPD_IDLE

There is no problem if I run on 1 process. With >=4 processes, the error 
happens every time. The code is rather simple, so I don't expect 
anything to be wrong with it. Why does this happen?

By the way, the RS 6000 book also mentions a routine called MPE_IBCAST, 
which is a non-blocking version of MPI_BCAST. Is there a similar routine 
in MPICH2?

Thank you very much

Regards.





