[mpich-discuss] Problems running mpi
Dave Goodell
goodell at mcs.anl.gov
Tue Oct 6 14:29:42 CDT 2009
Also, re-reading your error message: you are getting a signal 11
(SIGSEGV) on one of your processes, which means that process is
segfaulting. The bug might be in MPICH2 itself (1.0.6 is very old),
but there's a good chance it's in your user code. Upgrade your MPICH2
version, and if you are still having trouble, turn on core files and
see where the code is segfaulting.
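
A minimal sketch of one way to turn on core files, assuming a Linux or
other POSIX system: the program raises its own core-file size limit at
startup. The shell equivalent is `ulimit -c unlimited`, but processes
started by an MPI launcher may not inherit the limit of your login
shell, so setting it from inside the program can be more reliable. The
helper name enable_core_dumps is hypothetical and is not part of MPICH2.

```c
/* Sketch: raise the core-file size limit from inside the program.
 * enable_core_dumps() is a hypothetical helper, not an MPICH2 call. */
#include <stdio.h>
#include <sys/resource.h>

/* Raise the soft RLIMIT_CORE limit to the hard limit.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    /* Read the current soft and hard limits for core files. */
    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}
```

Call enable_core_dumps() early in main(), for example right after
MPI_Init, and build with -g so the core file carries debug symbols.
After the crash, load the core file in a debugger (for example
`gdb ./your_program core`, then `bt`) to see the stack at the point of
the segfault. The core file's name and location vary by system, and no
core is written if the hard limit is zero.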
-Dave
On Oct 6, 2009, at 11:29 AM, Dave Goodell wrote:
> What version of MPICH2 are you using? This error message looks like
> one from an older version of MPICH2. The current version is 1.1.1p1
> and in general should be preferred over all previous versions of
> MPICH2.
>
> -Dave
>
> On Oct 6, 2009, at 11:01 AM, Fernando Saez wrote:
>
>> Dear MPICH discussion group
>>
>> I am trying to run an MPI program, but it fails with the following
>> error:
>>
>> 1: Fatal error in MPI_Recv: Other MPI error, error stack:
>> 1: MPI_Recv(186)................: MPI_Recv(buf=0xbfdf70e8,
>> count=52, MPI_DOUBLE, src=0, tag=0, MPI_COMM_WORLD,
>> status=0xbfdf6f34) failed
>> 1: MPIDI_CH3i_Progress_wait(207): sock_wait failed
>> 1: MPIDU_Sock_wait(202).........: unexpected operating system error
>> (errno=22:(strerror() not found))
>> rank 0 in job 71 lidic01.unsl.edu.ar_39689 caused collective
>> abort of all ranks
>> exit status of rank 0: killed by signal 11
>>
>> The program executes fine with smaller input sizes, but when I
>> grow the size it crashes.
>>
>> Let me know if this error sounds familiar to you and if you have
>> any suggestions for what to do here.
>>
>> Thanks,
>>
>> Fernando
>