<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Jul 6, 2021, at 8:30 AM, Vijay S Kumar <<a href="mailto:vijayskumar@gmail.com" class="">vijayskumar@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hello all,<div class=""><br class=""></div><div class="">By way of background, we have a PetSc-based solver that we run on our in-house Cray system. We are carrying out performance analysis using profilers in the CrayPat suite that provide more fine-grained performance-related information than the PetSc log_view summary.</div><div class=""><br class=""></div><div class="">When instrumented using CrayPat perftools, it turns out that the MPI initialization (MPI_Init) internally invoked by PetscInitialize is not picked up by the profiler. That is, simply specifying the following:</div><div class=""> ierr = PetscInitialize(&argc,&argv,(char*)0,NULL);if (ierr) return ierr;<br class=""></div><div class="">results in the following runtime error: <br class=""></div><div class=""><div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class=""> CrayPat/X: Version 7.1.1 Revision 7c0ddd79b 08/19/19 16:58:46</div><div style="text-indent: 0.5in; margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span style="background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial;background-color:rgb(255,255,255)" class="">Attempting to use an MPI routine before initializing
MPICH</span></div></div></div></div></blockquote><div><br class=""></div> This is certainly unexpected behavior, PETSc is "just" an MPI application it does not do anything special for CrayPat. We do not expect that one would need to call MPI_Init() outside of PETSc to use a performance tool. Perhaps PETSc is not being configured/compiled with the correct flags for the CrayPat performance tools or its shared library is not being built appropriately. If CrayPat uses the PMPI_xxx wrapper model for MPI profiling it may cause these kinds of difficulties if the correct profile wrapper functions are not inserted during the build process.</div><div><br class=""></div><div> I would try running a standard PETSc program in a debugger with breakpoints for MPI_Init() (and possible others) to investigate what is happening exactly and maybe why. </div><div><br class=""></div><div> You can send to <a href="mailto:petsc-maint@mcs.anl.gov" class="">petsc-maint@mcs.anl.gov</a> the configure.log and make.log that was generated.</div><div><br class=""></div><div> Barry</div><div><br class=""></div><div><br class=""></div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><div class=""><br class=""></div><div class="">To circumvent this, we had to explicitly call MPI_Init prior to PetscInitialize:</div><div class=""> MPI_Init(&argc,&argv);<br class=""> ierr = PetscInitialize(&argc,&argv,(char*)0,NULL);if (ierr) return ierr;<br class=""></div><div class=""><br class=""></div><div class="">However, the side-effect of this above workaround seems to be several downstream runtime (assertion) errors with VecAssemblyBegin/End and MatAssemblyBeing/End statements:</div><div class=""><br class=""></div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px" class=""><div class=""><div class="">CrayPat/X: Version 7.1.1 Revision 7c0ddd79b 08/19/19 16:58:46</div></div><div class="">main.x: ../rtsum.c:5662: __pat_trsup_trace_waitsome_rtsum: Assertion `recv_count != MPI_UNDEFINED' failed.<br class=""></div><div class=""><div class=""><br class=""></div><div class=""> <a href="mailto:main@main.c" class="">main@main.c</a>:769<br class=""></div></div><div class=""><div class=""> <span style="background-color:rgb(255,255,0)" class="">VecAssemblyEnd@0x2aaab951b3ba</span></div></div><div class=""><div class=""><span style="background-color:rgb(255,255,0)" class=""> VecAssemblyEnd_MPI_BTS@0x2aaab950b179</span></div></div><div class=""><div class=""><span style="background-color:rgb(255,255,0)" class=""> MPI_Waitsome@0x43a238</span></div></div><div class=""><div class=""><span style="background-color:rgb(255,255,0)" class=""> __pat_trsup_trace_waitsome_rtsum@0x5f1a17</span></div></div><div class=""><div class=""><span style="background-color:rgb(255,255,0)" class=""> __GI___assert_fail@0x2aaabc61e7d1</span></div></div><div class=""><div class=""><span style="background-color:rgb(255,255,0)" class=""> __assert_fail_base@0x2aaabc61e759</span></div></div><div class=""><div class=""> __GI_abort@0x2aaabc627740</div></div><div class=""><div class=""> __GI_raise@0x2aaabc626160</div></div></blockquote><div class=""><div class=""><br class=""></div></div><div class="">Interestingly, we do not see such errors when there is no explicit MPI_Init, and no instrumentation for performance.</div><div class="">Looking for someone to help throw more light on why PetSc Mat/Vec AssemblyEnd statements lead to such MPI-level assertion errors in cases where MPI_Init is explicitly called. <br class=""></div><div class="">(Or alternatively, is there a way to call PetscInitialize in a manner that ensures that the MPI initialization is picked up by the profilers in question?)</div><div class=""><br class=""></div><div class="">We would highly appreciate any help/pointers,</div><div class=""><br class=""></div><div class="">Thanks!</div><div class=""> Vijay</div></div>
</div></blockquote></div><br class=""></body></html>