[petsc-users] Debugging MatAssemblyEnd

Dominik Szczerba dominik at itis.ethz.ch
Fri Aug 26 01:59:32 CDT 2011


I get the following crash:

Fatal error in MPI_Allreduce: Error message texts are not
available[cli_1]: aborting job:
Fatal error in MPI_Allreduce: Error message texts are not available
INTERNAL ERROR: Invalid error class (66) encountered while returning from
MPI_Allreduce.  Please file a bug report.  No error stack is available.
Fatal error in MPI_Allreduce: Error message texts are not
available[cli_0]: aborting job:
Fatal error in MPI_Allreduce: Error message texts are not available
INTERNAL ERROR: Invalid error class (66) encountered while returning from
MPI_Allreduce.  Please file a bug report.  No error stack is available.
Fatal error in MPI_Allreduce: Error message texts are not
available[cli_3]: aborting job:
Fatal error in MPI_Allreduce: Error message texts are not available

but only in one case; a few other cases run as expected. Running in a
debugger points to this call in my code:

	ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

which in turn leads to the MPI_Allreduce listed above. I ran paranoid
checks for illegal indices throughout my matrix assembly, and all came
back clean. This is my only call to MatAssemblyEnd. The problem is
linear, but with several unknowns per point. Valgrind (on a small case)
reports something that is probably irrelevant (attached at the bottom).
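
For illustration, a paranoid index check of the kind mentioned above could
look like the sketch below. It is not the actual assembly code from this
application: the helper name check_indices and its parameters are
hypothetical, and it uses plain C integers instead of PetscInt so that it
compiles without PETSc. It flags negative indices too, although PETSc's
MatSetValues simply ignores negative row and column indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of PETSc): count the entries of
 * idx[0..n-1] that fall outside the valid global range [0, global_size).
 * Each offending entry is reported on stderr together with the MPI rank,
 * so the output of a parallel run can be traced back to one process. */
static int check_indices(const long *idx, size_t n, long global_size, int rank)
{
    int bad = 0;

    assert(idx != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (idx[i] < 0 || idx[i] >= global_size) {
            fprintf(stderr, "[rank %d] index %zu = %ld outside [0, %ld)\n",
                    rank, i, idx[i], global_size);
            ++bad;
        }
    }
    return bad;
}
```

Such a check would typically be called on the row and column index arrays
just before each MatSetValues call, with global_size set to the global
number of rows or columns. A nonzero return value on any rank means the
assembly input is suspect.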

How should I debug this?

Thanks for any hints,
Dominik



==9521== Syscall param writev(vector[...]) points to uninitialised byte(s)
==9521==    at 0x6DD8789: writev (writev.c:56)
==9521==    by 0x517348: MPIDU_Sock_writev (sock_immed.i:610)
==9521==    by 0x519633: MPIDI_CH3_iSendv (ch3_isendv.c:84)
==9521==    by 0x4FD446: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:509)
==9521==    by 0x4FF3D4: MPID_Isend (mpid_isend.c:118)
==9521==    by 0x4E6D1A: MPIC_Isend (helper_fns.c:210)
==9521==    by 0xEE90DD: MPIR_Alltoall (alltoall.c:420)
==9521==    by 0xEE9940: PMPI_Alltoall (alltoall.c:685)
==9521==    by 0xE5DD4D: SetUp__ (setup.c:122)
==9521==    by 0xE5E59C: PartitionSmallGraph__ (weird.c:39)
==9521==    by 0xE5B888: ParMETIS_V3_PartKway (kmetis.c:131)
==9521==    by 0x72D82C: MatPartitioningApply_Parmetis (pmetis.c:97)
==9521==    by 0x729E28: MatPartitioningApply (partition.c:236)
==9521==    by 0x52A240: PetscSolver::LoadMesh(std::string const&) (PetscSolver.cxx:625)
==9521==    by 0x4C61AC: SM3T4_USER::ProcessInputFile() (sm3t4mpi_main.cxx:227)
==9521==    by 0x4C48F3: main (sm3t4mpi_main.cxx:665)
==9521==  Address 0xc7ebe3c is 12 bytes inside a block of size 72 alloc'd
==9521==    at 0x4C28FAC: malloc (vg_replace_malloc.c:236)
==9521==    by 0xE6E84A: GKmalloc__ (util.c:151)
==9521==    by 0xE69CC1: PreAllocateMemory__ (memory.c:38)
==9521==    by 0xE5B76B: ParMETIS_V3_PartKway (kmetis.c:116)
==9521==    by 0x72D82C: MatPartitioningApply_Parmetis (pmetis.c:97)
==9521==    by 0x729E28: MatPartitioningApply (partition.c:236)
==9521==    by 0x52A240: PetscSolver::LoadMesh(std::string const&) (PetscSolver.cxx:625)
==9521==    by 0x4C61AC: SM3T4_USER::ProcessInputFile() (sm3t4mpi_main.cxx:227)
==9521==    by 0x4C48F3: main (sm3t4mpi_main.cxx:665)

