[petsc-users] PetSc and CrayPat: MPI assertion errors

Vijay S Kumar vijayskumar at gmail.com
Tue Jul 6 08:30:14 CDT 2021

Hello all,

By way of background, we have a PETSc-based solver that we run on our
in-house Cray system. We are carrying out performance analysis using
profilers in the CrayPat suite, which provide more fine-grained
performance information than the PETSc -log_view summary.

When instrumented using CrayPat perftools, it turns out that the MPI
initialization (MPI_Init) internally invoked by PetscInitialize is not
picked up by the profiler. That is, simply specifying the following:
    ierr = PetscInitialize(&argc,&argv,(char*)0,NULL); if (ierr) return ierr;
results in the following runtime error:

               CrayPat/X:  Version 7.1.1 Revision 7c0ddd79b  08/19/19

Attempting to use an MPI routine before initializing MPICH

To circumvent this, we had to explicitly call MPI_Init prior to
    ierr = PetscInitialize(&argc,&argv,(char*)0,NULL); if (ierr) return ierr;

However, a side effect of this workaround seems to be several
downstream runtime (assertion) errors in VecAssemblyBegin/End and
MatAssemblyBegin/End calls:

CrayPat/X:  Version 7.1.1 Revision 7c0ddd79b  08/19/19 16:58:46
main.x: ../rtsum.c:5662: __pat_trsup_trace_waitsome_rtsum: Assertion
`recv_count != MPI_UNDEFINED' failed.

 main at main.c:769
  VecAssemblyEnd at 0x2aaab951b3ba
  VecAssemblyEnd_MPI_BTS at 0x2aaab950b179
  MPI_Waitsome at 0x43a238
  __pat_trsup_trace_waitsome_rtsum at 0x5f1a17
  __GI___assert_fail at 0x2aaabc61e7d1
  __assert_fail_base at 0x2aaabc61e759
  __GI_abort at 0x2aaabc627740
  __GI_raise at 0x2aaabc626160

Interestingly, we do not see such errors when there is no explicit
MPI_Init and no performance instrumentation.
We are hoping someone can shed more light on why the PETSc Mat/Vec
AssemblyEnd calls lead to such MPI-level assertion errors in cases
where MPI_Init is explicitly called.
(Or alternatively, is there a way to call PetscInitialize in a manner that
ensures that the MPI initialization is picked up by the profilers in

We would highly appreciate any help/pointers,

