[mpich-discuss] MPICH2-1.3.1 crash and SIGHUP issues.
Darius Buntinas
buntinas at mcs.anl.gov
Mon Jan 10 15:07:30 CST 2011
Both backtraces show the segfault inside the libHYPRE library, reached from hypre_BoomerAMGSetup (called at ex5.c:319). Without knowing what this library is doing, it's hard to determine the cause. If you have the source code for that library, you can try tracking it down; otherwise, try contacting the maintainers.
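
For anyone repeating the attach-and-continue procedure in the quoted gdb session, here is a minimal sketch of the wait loop it relies on. The flag name DebugWait comes from the session itself; the helper names and the WAIT_FOR_GDB environment variable are hypothetical, added only so the wait can be switched on without recompiling. This is not the code from ex5.c, just one way such a loop might be written.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* volatile forces a fresh read on every iteration, so clearing the flag
   from gdb ends the loop even in an optimized build. */
static volatile int DebugWait = 0;

/* Returns 1 when the value of the hypothetical WAIT_FOR_GDB variable is "1". */
static int debug_wait_requested(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/* Park the calling process until a debugger sets DebugWait to 0.
   Does nothing unless WAIT_FOR_GDB=1 is set in the environment. */
static void wait_for_debugger(void)
{
    if (!debug_wait_requested(getenv("WAIT_FOR_GDB")))
        return;

    /* Print the PID so you know which process to attach to. */
    fprintf(stderr, "pid %ld waiting for debugger (gdb -p %ld)\n",
            (long)getpid(), (long)getpid());

    DebugWait = 1;
    while (DebugWait)
        sleep(1);   /* sleep instead of busy-spinning */
}
```

Call wait_for_debugger() early in main, attach to the printed PID, then use "set var DebugWait = 0" followed by "continue" in gdb, as in the session below. Building the library with debug symbols as well would let the backtrace show file and line numbers inside libHYPRE.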
-d
On Jan 10, 2011, at 2:55 PM, Sunil Thomas wrote:
> Thanks for the response. I debugged the example code that produces the
> "APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)" error by attaching gdb
> to each process. Here is what I have so far:
>
> --------------------------
> 0x0000000000401ad0 in main (argc=1, argv=0x7fff12a68028) at ex5.c:57
> 57 while (DebugWait);
> (gdb) r
> The program being debugged has been started already.
> Start it from the beginning? (y or n) n
> Program not restarted.
> (gdb) set DebugWait = 0
> (gdb) s
> 61 n = 33;
> (gdb) n
> 62 solver_id = 0;
> (gdb) c
> Continuing.
> Program received signal SIGSEGV, Segmentation fault.
> 0x00002b71982477e0 in hypre_MatvecCommPkgCreate_core () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> (gdb) bt
> #0 0x00002b71982477e0 in hypre_MatvecCommPkgCreate_core () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #1 0x00002b7198247d8c in hypre_MatvecCommPkgCreate () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #2 0x00002b7198234361 in hypre_BoomerAMGCreateS () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #3 0x00002b71981f10f5 in hypre_BoomerAMGSetup () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #4 0x0000000000402421 in main (argc=1, argv=0x7fff12a68028) at ex5.c:319
> (gdb) q
> --------------------------
> 0x0000000000401ad0 in main (argc=1, argv=0x7fff4b539af8) at ex5.c:57
> 57 while (DebugWait);
> (gdb) set DebugWait = 0
> (gdb) s
> 61 n = 33;
> (gdb) n
> 62 solver_id = 0;
> (gdb) c
> Continuing.
> Program received signal SIGSEGV, Segmentation fault.
> 0x00002b8a5f727f07 in hypre_BoomerAMGCoarsen () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> (gdb) bt
> #0 0x00002b8a5f727f07 in hypre_BoomerAMGCoarsen () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #1 0x00002b8a5f72ab51 in hypre_BoomerAMGCoarsenFalgout () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #2 0x00002b8a5f721cdf in hypre_BoomerAMGSetup () from /data/rpe/sypb/devl/3rdparty/hypre-2.4.0b/lib/libHYPRE.so
> #3 0x0000000000402421 in main (argc=1, argv=0x7fff4b539af8) at ex5.c:319
> (gdb) q
> ------------------------
>
>
> Before I dig any further into the third-party library HYPRE, does this say anything useful about where the problem
> lies, for example by ruling out an error in mpich2-1.3.1? The problem seems to be in HYPRE (I am using
> version 2.4.0b), but I am not 100% sure.
>
> Thanks again.
> --Sunil.
>
>
>
> On Sun, Jan 9, 2011 at 6:22 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>
> Please keep mpich-discuss cc'ed.
>
> ----- Original Message -----
> > Thanks Pavan!
> >
> > No I am not. I was simply searching for the error message I got. The fact
> > that the error is seen (whether using RMA or not) suggests the problem could
> > still be with mpich2-1.3.1.
>
> If the application terminates (for any reason), the process manager will display this error string. These two could be (and most likely are) completely unrelated problems.
>
> -- Pavan
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss