[MPICH] caused collective abort of all ranks using mpich2-1.0.3
Steve Kargl
sgk at troutmask.apl.washington.edu
Fri Dec 22 15:23:11 CST 2006
On Sat, Dec 23, 2006 at 04:36:33AM +0800, Duan Sai wrote:
>
> I have a problem running mpirun on my Linux server. The server
> runs Red Hat EL4 U8 on x86_64, and the MPICH version is
> mpich2-1.0.3. My job performs numerical integration for a
> scientific computation. If I use a large mesh in the integration,
> the job runs very well. However, when I use a small mesh to obtain
> a more accurate value, the following error occurs:
>
> rank 1 in job 1 Machine caused collective abort of all ranks
> exit status of rank 1: killed by signal 11
>
> How can I solve this problem?
What compiler are you using? Signal 11 is a segmentation fault, so
my first guess is that an array index is going out of bounds. See if
your compiler has a bounds-checking option.
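
To illustrate the kind of failure I am guessing at, here is a minimal
sketch in C. It assumes (I have not seen your code) that the mesh values
are stored in a work array of fixed size, so that a finer mesh overruns
it. The names MAX_POINTS, index_in_bounds and integrate_checked are
hypothetical and chosen only for this example. The explicit check turns
a silent overrun into a reported error instead of a crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed capacity, standing in for a statically sized work array. */
#define MAX_POINTS 1000

/* Returns 1 if index i is valid for an array of n elements, 0 otherwise. */
static int index_in_bounds(long i, long n)
{
    return i >= 0 && i < n;
}

/*
 * Trapezoidal-rule integral of f over [a, b] using n_points mesh nodes.
 * The node values are stored in a fixed-size buffer. Returns 0 on success,
 * or -1 if the mesh has fewer than 2 nodes or does not fit in the buffer.
 */
static int integrate_checked(double (*f)(double), double a, double b,
                             long n_points, double *result)
{
    static double values[MAX_POINTS];

    assert(f != NULL && result != NULL);

    /* Check the largest index we are about to use before touching the array. */
    if (n_points < 2 || !index_in_bounds(n_points - 1, MAX_POINTS)) {
        fprintf(stderr, "mesh of %ld points does not fit in %d slots\n",
                n_points, MAX_POINTS);
        return -1;
    }

    double h = (b - a) / (double)(n_points - 1);
    for (long i = 0; i < n_points; ++i)
        values[i] = f(a + h * (double)i);

    /* End nodes get weight 1/2, interior nodes get weight 1. */
    double sum = 0.5 * (values[0] + values[n_points - 1]);
    for (long i = 1; i < n_points - 1; ++i)
        sum += values[i];

    *result = h * sum;
    return 0;
}

/* Example integrand: f(x) = x^2. */
static double square(double x)
{
    return x * x;
}
```

Calling integrate_checked(square, 0.0, 1.0, 101, &r) succeeds, while a
request for 5000 points returns -1 rather than writing past the end of
the buffer. A compiler's bounds-checking option does this kind of check
for you on every array access; with gfortran, for example, the option is
-fbounds-check.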
--
Steve