[MPICH] collective abort of all ranks
Rajeev Thakur
thakur at mcs.anl.gov
Sun Jun 10 19:17:21 CDT 2007
Can you try with some other f90 compiler?
Rajeev
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of
> Kamaraju Kusumanchi
> Sent: Sunday, June 10, 2007 6:36 PM
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] collective abort of all ranks
>
> > >
> > > $mpif90 -show
> > > f90 -I/home/raju/software/compiledLibs/mpich2_1.0.5p4_gcc_4.3.0_absoft_8.0/include
> > > -p/home/raju/software/compiledLibs/mpich2_1.0.5p4_gcc_4.3.0_absoft_8.0/include
> > > -L/home/raju/software/compiledLibs/mpich2_1.0.5p4_gcc_4.3.0_absoft_8.0/lib
> > > -lmpichf90 -lmpichf90 -lmpich -lpthread -lrt
> > >
> > > Here f90 is the Absoft 8.0 Fortran 90 compiler. I am using mpich2 1.0.5p4,
> > > compiled with the Absoft Fortran compiler 8.0 and gcc 4.3.0.
> >
> > If possible, you may want to use gcc 4.2 instead of the experimental gcc
> > 4.3, just in case 4.3 is buggy...
> >
>
>
> Just an update.
>
> The problem is reproducible even when mpich2 is compiled with gcc 4.2,
> absoft 8.0.
>
> $mpif90 -show
> f90 -I/home/raju/software/compiledLibs/mpich2_1.0.5p4_gcc_4.2.0_absoft_8.0/include
> -p/home/raju/software/compiledLibs/mpich2_1.0.5p4_gcc_4.2.0_absoft_8.0/include
> -L/home/raju/software/compiledLibs/mpich2_1.0.5p4_gcc_4.2.0_absoft_8.0/lib
> -lmpichf90 -lmpichf90 -lmpich -lpthread -lrt
>
> $gcc -v
> Using built-in specs.
> Target: i686-pc-linux-gnu
> Configured with: /home/raju/software/unZipped/gcc-4.2.0/configure
> --prefix=/home/raju/software/compiledSoftware/gcc_4.2.0_20070514
> Thread model: posix
> gcc version 4.2.0
>
> $mpif90 test.f90
>
> $mpiexec -l -n 4 ./a.out
> rank 3 in job 7 node1.jit.mae.cornell.edu_38952 caused collective
> abort of all ranks
> exit status of rank 3: killed by signal 11
> rank 1 in job 7 node1.jit.mae.cornell.edu_38952 caused collective
> abort of all ranks
> exit status of rank 1: killed by signal 11
>
> When I ran the program via gdb using
>
> mpiexec -gdb -l -n 4 ./a.out
>
> the code seems to segfault at line 19 when it calls func1. AFAIK there
> is no memory violation in that part.
>
> I will try to recompile the mpich2 libs with --enable-g=meminit,dbg
> and see if there is anything new to report.
>
> raju
>
>