[petsc-users] problem with nested logging, standalone example
Junchao Zhang
junchao.zhang at gmail.com
Wed Jul 23 13:55:12 CDT 2025
I think I have a fix at https://gitlab.com/petsc/petsc/-/merge_requests/8583
Chris and Zongze, could you try it?
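(To try it locally, something along these lines should work; the ref syntax
follows GitLab's convention for fetching merge requests:

  git fetch https://gitlab.com/petsc/petsc.git refs/merge-requests/8583/head
  git checkout FETCH_HEAD

then reconfigure and rebuild as usual.)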
Thanks!
--Junchao Zhang
On Tue, Jul 22, 2025 at 4:16 PM Barry Smith <bsmith at petsc.dev> wrote:
>
> Yippee! (maybe)
>
> On Jul 22, 2025, at 4:18 PM, Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
> With Chris's example, I did reproduce the "MPI_ERR_BUFFER: invalid buffer
> pointer" on a machine. I am looking into it.
>
> Thanks.
> --Junchao Zhang
>
>
> On Tue, Jul 22, 2025 at 9:51 AM Zongze Yang <yangzongze at gmail.com> wrote:
>
>> Hi,
>> I encountered a similar issue with Firedrake when using the -log_view option
>> with XML format on macOS. Below is the error message. The Firedrake code
>> and the shell script used to run it are attached.
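>>
>> (For reference, PETSc's XML-format nested log is typically requested with
>> an option of the form
>>
>> -log_view :report_performance.xml:ascii_xml
>>
>> where the filename is an assumption here, since the actual script is
>> attached rather than inlined.)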
>>
>> ```
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR: General MPI error
>> [0]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer
>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> [0]PETSC ERROR: PETSc Release Version 3.23.4, unknown
>> [0]PETSC ERROR: test.py with 2 MPI process(es) and PETSC_ARCH
>> arch-firedrake-default on 192.168.10.51 by zzyang Tue Jul 22 22:24:05 2025
>> [0]PETSC ERROR: Configure options: PETSC_ARCH=arch-firedrake-default
>> --COPTFLAGS="-O3 -march=native -mtune=native" --CXXOPTFLAGS="-O3
>> -march=native -mtune=native" --FOPTFLAGS="-O3 -mtune=native"
>> --with-c2html=0 --with-debugging=0 --with-fortran-bindings=0
>> --with-shared-libraries=1 --with-strict-petscerrorcode --download-cmake
>> --download-bison --download-fftw --download-mumps-avoid-mpi-in-place
>> --with-hdf5-dir=/opt/homebrew --with-hwloc-dir=/opt/homebrew
>> --download-metis --download-mumps --download-netcdf --download-pnetcdf
>> --download-ptscotch --download-scalapack --download-suitesparse
>> --download-superlu_dist --download-slepc --with-zlib --download-hpddm
>> --download-libpng --download-ctetgen --download-tetgen --download-triangle
>> --download-mmg --download-parmmg --download-p4est --download-eigen
>> --download-hypre --download-pragmatic
>> [0]PETSC ERROR: #1 PetscLogNestedTreePrintLine() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:289
>> [0]PETSC ERROR: #2 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:383
>> [0]PETSC ERROR: #3 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #4 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #5 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #6 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #7 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #8 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #9 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #10 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #11 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #12 PetscLogNestedTreePrint() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> [0]PETSC ERROR: #13 PetscLogNestedTreePrintTop() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:420
>> [0]PETSC ERROR: #14 PetscLogHandlerView_Nested_XML() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:443
>> [0]PETSC ERROR: #15 PetscLogHandlerView_Nested() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/lognested.c:405
>> [0]PETSC ERROR: #16 PetscLogHandlerView() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/interface/loghandler.c:342
>> [0]PETSC ERROR: #17 PetscLogView() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:2043
>> [0]PETSC ERROR: #18 PetscLogViewFromOptions() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:2084
>> [0]PETSC ERROR: #19 PetscFinalize() at
>> /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/objects/pinit.c:1552
>> PetscFinalize() failed [error code: 98]
>> --------------------------------------------------------------------------
>> prterun has exited due to process rank 0 with PID 28986 on node
>> 192.168.10.51 exiting
>> improperly. There are three reasons this could occur:
>>
>> 1. this process did not call "init" before exiting, but others in the
>> job did. This can cause a job to hang indefinitely while it waits for
>> all processes to call "init". By rule, if one process calls "init",
>> then ALL processes must call "init" prior to termination.
>>
>> 2. this process called "init", but exited without calling "finalize".
>> By rule, all processes that call "init" MUST call "finalize" prior to
>> exiting or it will be considered an "abnormal termination"
>>
>> 3. this process called "MPI_Abort" or "prte_abort" and the mca
>> parameter prte_create_session_dirs is set to false. In this case, the
>> run-time cannot detect that the abort call was an abnormal
>> termination. Hence, the only error message you will receive is this
>> one.
>>
>> This may have caused other processes in the application to be
>> terminated by signals sent by prterun (as reported here).
>>
>> You can avoid this message by specifying -quiet on the prterun command
>> line.
>> --------------------------------------------------------------------------
>> ```
>>
>> Best wishes,
>> Zongze
>>
>> From: petsc-users <petsc-users-bounces at mcs.anl.gov> on behalf of
>> Klaij, Christiaan via petsc-users <petsc-users at mcs.anl.gov>
>> Date: Monday, July 14, 2025 at 15:58
>> To: Barry Smith <bsmith at petsc.dev>
>> Cc: PETSc users list <petsc-users at mcs.anl.gov>
>> Subject: Re: [petsc-users] problem with nested logging, standalone
>> example
>>
>> @Junchao: yes, all with my ex2f.F90 variation on two or three cores
>>
>> @Barry: it's really puzzling that you cannot reproduce. Can you try
>> running it a dozen times in a row? And look at the report_performance.xml
>> file? When it hangs I see some NaNs, for instance here in the VecAXPY
>> event:
>>
>> <events>
>> <event>
>> <name>VecAXPY</name>
>> <time>
>> <avgvalue>0.00610203</avgvalue>
>> <minvalue>0.</minvalue>
>> <maxvalue>0.0122041</maxvalue>
>> <minloc>1</minloc>
>> <maxloc>0</maxloc>
>> </time>
>> <ncalls>
>> <avgvalue>0.5</avgvalue>
>> <minvalue>0.</minvalue>
>> <maxvalue>1.</maxvalue>
>> <minloc>1</minloc>
>> <maxloc>0</maxloc>
>> </ncalls>
>> </event>
>> <event>
>> <name>self</name>
>> <time>
>> <value>-nan.</value>
>> </time>
>>
>> This is what I did in my latest attempt on the login node of our Rocky
>> Linux 9 cluster:
>> 1) download petsc-3.23.4.tar.gz from the petsc website
>> 2) ./configure -prefix=~/petsc/install --with-cxx=0 --with-debugging=0
>> --with-mpi-dir=/cm/shared/apps/mpich/ge/gcc/64/3.4.2
>> 3) adjust my example to this version of petsc (file is attached)
>> 4) make ex2f-cklaij-dbg-v2
>> 5) mpirun -n 2 ./ex2f-cklaij-dbg-v2
>>
>> So the exact versions are: petsc-3.23.4, system mpich 3.4.2, system gcc
>> 11.5.0
>>
>> ________________________________________
>> From: Barry Smith <bsmith at petsc.dev>
>> Sent: Friday, July 11, 2025 11:22 PM
>> To: Klaij, Christiaan
>> Cc: Junchao Zhang; PETSc users list
>> Subject: Re: [petsc-users] problem with nested logging, standalone example
>>
>>
>> And yet we cannot reproduce.
>>
>> Please tell us the exact PETSc version and MPI implementation versions.
>> And reattach your reproducing example. And exactly how you run it.
>>
>>
>> Can you reproduce it on an "ordinary" machine, say a Mac or Linux
>> laptop?
>>
>> Barry
>>
>> If I could reproduce the problem, here is how I would debug it. I would
>> use -start_in_debugger and then put breakpoints in the places which seem
>> problematic. Presumably I would end up with a hang with each MPI process in
>> a "different place" and from that I may be able to determine how it
>> happened.
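>>
>> As a sketch of that workflow (the binary name is taken from Chris's runs;
>> the breakpoint location is an assumption based on the traces in this
>> thread):
>>
>> mpirun -n 2 ./ex2f-cklaij-dbg -start_in_debugger
>> # in each debugger window that pops up:
>> (gdb) break PetscLogNestedTreePrintLine
>> (gdb) continue
>> # when it hangs, interrupt with Ctrl-C and compare the backtraces:
>> (gdb) bt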
>>
>>
>>
>> > On Jul 11, 2025, at 7:58 AM, Klaij, Christiaan <C.Klaij at marin.nl>
>> wrote:
>> >
>> > In summary for future reference:
>> > - tested 3 different machines, two at Marin, one at the national HPC
>> > - tested 3 different mpi implementation (intelmpi, openmpi and mpich)
>> > - tested openmpi in both release and debug
>> > - tested 2 different compilers (intel and gnu), both older and very
>> recent versions
>> > - tested with the most basic config (./configure --with-cxx=0
>> --with-debugging=0 --download-mpich)
>> >
>> > All of these tests either segfault, hang, or error out at the call to
>> PetscLogView.
>> >
>> > Chris
>> >
>> > ________________________________________
>> > From: Klaij, Christiaan <C.Klaij at marin.nl>
>> > Sent: Friday, July 11, 2025 10:10 AM
>> > To: Barry Smith; Junchao Zhang
>> > Cc: PETSc users list
>> > Subject: Re: [petsc-users] problem with nested logging, standalone
>> example
>> >
>> > @Matt: no MPI errors indeed. I've tried with MPICH and I get the same
>> hang.
>> > @Barry: the two stack traces aren't exactly the same; see a sample with
>> MPICH below.
>> >
>> > If it cannot be reproduced on your side, I'm afraid this is another
>> dead end. Thanks anyway, I really appreciate all your help.
>> >
>> > Chris
>> >
>> > (gdb) bt
>> > #0 0x000015555033bc2e in
>> MPIDI_POSIX_mpi_release_gather_gather.constprop.0 ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #1 0x000015555033db8a in MPIDI_POSIX_mpi_allreduce_release_gather ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #2 0x000015555033e70f in MPIR_Allreduce ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #3 0x000015555033f22e in PMPI_Allreduce ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #4 0x0000155553f85d69 in MPIU_Allreduce_Count (comm=-2080374782,
>> > op=1476395020, dtype=1275072547, count=1, outbuf=0x7fffffffac70,
>> > inbuf=0x7fffffffac60)
>> > at /home/cklaij/petsc/petsc-3.23.4/src/sys/objects/pinit.c:1839
>> > #5 MPIU_Allreduce_Private (inbuf=inbuf at entry=0x7fffffffac60,
>> > outbuf=outbuf at entry=0x7fffffffac70, count=count at entry=1,
>> > dtype=dtype at entry=1275072547, op=op at entry=1476395020,
>> comm=-2080374782)
>> > at /home/cklaij/petsc/petsc-3.23.4/src/sys/objects/pinit.c:1869
>> > #6 0x0000155553f33dbe in PetscPrintXMLNestedLinePerfResults (
>> > viewer=viewer at entry=0x458890, name=name at entry=0x155554ef6a0d
>> 'mbps\000',
>> > value=<optimized out>, minthreshold=minthreshold at entry=0,
>> > maxthreshold=maxthreshold at entry=0.01,
>> > minmaxtreshold=minmaxtreshold at entry=1.05)
>> > at
>> /home/cklaij/petsc/petsc-3.23.4/src/sys/logging/handler/impls/nested/xmlviewer.c:255
>> >
>> >
>> > (gdb) bt
>> > #0 0x000015554fed3b17 in clock_gettime at GLIBC_2.2.5 () from
>> /lib64/libc.so.6
>> > #1 0x0000155550b0de71 in ofi_gettime_ns ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #2 0x0000155550b0dec9 in ofi_gettime_ms ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #3 0x0000155550b2fab5 in sock_cq_sreadfrom ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #4 0x00001555505ca6f7 in MPIDI_OFI_progress ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #5 0x0000155550591fe9 in progress_test ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #6 0x00001555505924a3 in MPID_Progress_wait ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #7 0x000015555043463e in MPIR_Wait_state ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #8 0x000015555052ec49 in MPIC_Wait ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #9 0x000015555053093e in MPIC_Sendrecv ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #10 0x00001555504bf674 in MPIR_Allreduce_intra_recursive_doubling ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> > #11 0x00001555505b61de in MPIDI_OFI_mpi_finalize_hook ()
>> > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>> >
>> > ________________________________________
>> > From: Barry Smith <bsmith at petsc.dev>
>> > Sent: Thursday, July 10, 2025 11:10 PM
>> > To: Junchao Zhang
>> > Cc: Klaij, Christiaan; PETSc users list
>> > Subject: Re: [petsc-users] problem with nested logging, standalone
>> example
>> >
>> >
>> > I cannot reproduce
>> >
>> > On Jul 10, 2025, at 3:46 PM, Junchao Zhang <junchao.zhang at gmail.com>
>> wrote:
>> >
>> > Adding -mca coll_hcoll_enable 0 didn't change anything at my end.
>> Strange.
>> >
>> > --Junchao Zhang
>> >
>> >
>> > On Thu, Jul 10, 2025 at 3:39 AM Klaij, Christiaan <C.Klaij at marin.nl> wrote:
>> > An additional clue perhaps: with OMPI_MCA_coll_hcoll_enable=0 (the
>> environment-variable form of the "-mca coll_hcoll_enable 0" option used in
>> the run below), the code does not hang but gives the error shown below.
>> >
>> > Chris
>> >
>> >
>> > $ mpirun -mca coll_hcoll_enable 0 -n 2 ./ex2f-cklaij-dbg -pc_type
>> jacobi -ksp_monitor_short -ksp_gmres_cgs_refinement_type refine_always
>> > 0 KSP Residual norm 1.11803
>> > 1 KSP Residual norm 0.591608
>> > 2 KSP Residual norm 0.316228
>> > 3 KSP Residual norm < 1.e-11
>> > 0 KSP Residual norm 0.707107
>> > 1 KSP Residual norm 0.408248
>> > 2 KSP Residual norm < 1.e-11
>> > Norm of error < 1.e-12 iterations 3
>> > [1]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > [1]PETSC ERROR: General MPI error
>> > [1]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer
>> > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> > [1]PETSC ERROR: Petsc Release Version 3.22.4, Mar 01, 2025
>> > [1]PETSC ERROR: ./ex2f-cklaij-dbg with 2 MPI process(es) and PETSC_ARCH
>> on login1 by cklaij Thu Jul 10 10:33:33 2025
>> > [1]PETSC ERROR: Configure options:
>> --prefix=/home/cklaij/ReFRESCO/trunk/install/extLibs
>> --with-mpi-dir=/cm/shared/apps/openmpi/gcc/5.0.6-debug --with-x=0
>> --with-mpe=0 --with-debugging=0
>> --download-superlu_dist=https://updates.marin.nl/refresco/libs/superlu_dist-8.1.2.tar.gz
>> --with-blaslapack-dir=/cm/shared/apps/oneapi/2024.2.1/mkl/2024.2
>> --download-parmetis=https://updates.marin.nl/refresco/libs/parmetis-4.0.3-p9.tar.gz
>> --download-metis=https://updates.marin.nl/refresco/libs/metis-5.1.0-p11.tar.gz
>> --with-packages-build-dir=/home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild
>> --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall
>> -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall
>> -funroll-all-loops -O3 -DNDEBUG " COPTFLAGS="-std=gnu11 -Wall
>> -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall
>> -funroll-all-loops -O3 -DNDEBUG " FCFLAGS="-Wall -funroll-all-loops
>> -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime
>> -Wno-unused-function -O3 -DNDEBUG" F90FLAGS="-Wall -funroll-all-loops
>> -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime
>> -Wno-unused-function -O3 -DNDEBUG" FOPTFLAGS="-Wall -funroll-all-loops
>> -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime
>> -Wno-unused-function -O3 -DNDEBUG"
>> > [1]PETSC ERROR: #1 PetscLogNestedTreePrintLine() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:289
>> > [1]PETSC ERROR: #2 PetscLogNestedTreePrint() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:377
>> > [1]PETSC ERROR: #3 PetscLogNestedTreePrint() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:384
>> > [1]PETSC ERROR: #4 PetscLogNestedTreePrintTop() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:420
>> > [1]PETSC ERROR: #5 PetscLogHandlerView_Nested_XML() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:443
>> > [1]PETSC ERROR: #6 PetscLogHandlerView_Nested() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/lognested.c:405
>> > [1]PETSC ERROR: #7 PetscLogHandlerView() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/interface/loghandler.c:342
>> > [1]PETSC ERROR: #8 PetscLogView() at
>> /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/plog.c:2040
>> > [1]PETSC ERROR: #9 ex2f-cklaij-dbg.F90:301
>> >
>> --------------------------------------------------------------------------
>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF
>> > Proc: [[55228,1],1]
>> > Errorcode: 98
>> >
>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> > You may or may not see output from other processes, depending on
>> > exactly when Open MPI kills them.
>> >
>> --------------------------------------------------------------------------
>> >
>> --------------------------------------------------------------------------
>> > prterun has exited due to process rank 1 with PID 0 on node login1
>> calling
>> > "abort". This may have caused other processes in the application to be
>> > terminated by signals sent by prterun (as reported here).
>> >
>> --------------------------------------------------------------------------
>> >
>> > ________________________________________
>> > dr. ir. Christiaan Klaij | senior researcher
>> > Research & Development | CFD Development
>> > T +31 317 49 33 44 | https://www.marin.nl
>> >
>> >
>> > From: Klaij, Christiaan <C.Klaij at marin.nl>
>> > Sent: Thursday, July 10, 2025 10:15 AM
>> > To: Junchao Zhang
>> > Cc: PETSc users list
>> > Subject: Re: [petsc-users] problem with nested logging, standalone
>> example
>> >
>> > Hi Junchao,
>> >
>> > Thanks for testing. I've fixed the error but unfortunately that doesn't
>> change the behavior, the code still hangs as before, with the same stack
>> trace...
>> >
>> > Chris
>> >
>> > ________________________________________
>> > From: Junchao Zhang <junchao.zhang at gmail.com>
>> > Sent: Tuesday, July 8, 2025 10:58 PM
>> > To: Klaij, Christiaan
>> > Cc: PETSc users list
>> > Subject: Re: [petsc-users] problem with nested logging, standalone
>> example
>> >
>> > Hi, Chris,
>> > First, I had to fix an error in your test by adding
>> "PetscCallA(MatSetFromOptions(AA,ierr))" at line 254, since MatSetValues
>> requires the matrix type to be set:
>> > [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > [0]PETSC ERROR: Object is in wrong state
>> > [0]PETSC ERROR: Mat object's type is not set: Argument # 1
>> > ...
>> > [0]PETSC ERROR: #1 MatSetValues() at
>> /scratch/jczhang/petsc/src/mat/interface/matrix.c:1503
>> > [0]PETSC ERROR: #2 ex2f.F90:258
>> >
>> > Then I could run the test without problems:
>> > mpirun -n 2 ./ex2f -pc_type jacobi -ksp_monitor_short
>> -ksp_gmres_cgs_refinement_type refine_always
>> > 0 KSP Residual norm 1.11803
>> > 1 KSP Residual norm 0.591608
>> > 2 KSP Residual norm 0.316228
>> > 3 KSP Residual norm < 1.e-11
>> > 0 KSP Residual norm 0.707107
>> > 1 KSP Residual norm 0.408248
>> > 2 KSP Residual norm < 1.e-11
>> > Norm of error < 1.e-12 iterations 3
>> >
>> > I used petsc-3.22.4, gcc-11.3, openmpi-5.0.6 and configured with
>> > ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran
>> --download-openmpi --with-ssl=0 --with-shared-libraries=1
>> CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG"
>> CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG "
>> COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG"
>> CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG "
>> FCFLAGS="-Wall -funroll-all-loops -ffree-line-length-0
>> -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3
>> -DNDEBUG" F90FLAGS="-Wall -funroll-all-loops -ffree-line-length-0
>> -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3
>> -DNDEBUG" FOPTFLAGS="-Wall -funroll-all-loops -ffree-line-length-0
>> -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3
>> -DNDEBUG"
>> >
>> > Could you fix the error and retry?
>> >
>> > --Junchao Zhang
>> >
>> >
>> > On Sun, Jul 6, 2025 at 12:57 PM Klaij, Christiaan via petsc-users
>> <petsc-users at mcs.anl.gov> wrote:
>> > Attached is a standalone example of the issue described in the
>> > earlier thread "problem with nested logging". The issue appeared
>> > somewhere between petsc 3.19.4 and 3.23.4.
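>> >
>> > (One way to localize that, as a sketch only, assuming a git clone of
>> > PETSc that is reconfigured, rebuilt, and re-run at each step:
>> > git bisect start v3.23.4 v3.19.4
>> > followed by "git bisect good" or "git bisect bad" after each run.)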
>> >
>> > The example is a variation of ../ksp/tutorials/ex2f.F90, where
>> > I've added the nested log viewer with one event as well as the
>> > solution of a small system on rank zero.
>> >
>> > When running on multiple procs the example hangs during
>> > PetscLogView with the backtrace below. The configure.log is also
>> > attached in the hope that you can replicate the issue.
>> >
>> > Chris
>> >
>> >
>> > #0 0x000015554c84ea9e in mca_pml_ucx_recv (buf=0x7fffffff9e30, count=1,
>> > datatype=0x15554c9ef900 <ompi_mpi_2dblprec>, src=1, tag=-12,
>> > comm=0x7f1e30, mpi_status=0x0) at pml_ucx.c:700
>> > #1 0x000015554c65baff in
>> ompi_coll_base_allreduce_intra_recursivedoubling (
>> > sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1,
>> > dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
>> > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630)
>> > at base/coll_base_allreduce.c:247
>> > #2 0x000015554c6a7e40 in ompi_coll_tuned_allreduce_intra_do_this (
>> > sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1,
>> > dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
>> > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630,
>> > algorithm=3, faninout=0, segsize=0) at
>> coll_tuned_allreduce_decision.c:142
>> > #3 0x000015554c6a054f in ompi_coll_tuned_allreduce_intra_dec_fixed (
>> > sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1,
>> > dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
>> > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630)
>> > at coll_tuned_decision_fixed.c:216
>> > #4 0x000015554c68e160 in mca_coll_hcoll_allreduce (sbuf=0x7fffffff9e20,
>> > rbuf=0x7fffffff9e30, count=1, dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
>> > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaecb80)
>> > at coll_hcoll_ops.c:217
>> > #5 0x000015554c59811a in PMPI_Allreduce (sendbuf=0x7fffffff9e20,
>> > recvbuf=0x7fffffff9e30, count=1, datatype=0x15554c9ef900
>> <ompi_mpi_2dblprec>, op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30)
>> at allreduce.c:123
>> > #6 0x0000155553eabede in MPIU_Allreduce_Private () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #7 0x0000155553e50d08 in PetscPrintXMLNestedLinePerfResults () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #8 0x0000155553e5123e in PetscLogNestedTreePrintLine () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #9 0x0000155553e51f3a in PetscLogNestedTreePrint () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #10 0x0000155553e51e96 in PetscLogNestedTreePrint () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #11 0x0000155553e51e96 in PetscLogNestedTreePrint () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #12 0x0000155553e52142 in PetscLogNestedTreePrintTop () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #13 0x0000155553e5257b in PetscLogHandlerView_Nested_XML () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #14 0x0000155553e4e5a0 in PetscLogHandlerView_Nested () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #15 0x0000155553e56232 in PetscLogHandlerView () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #16 0x0000155553e588c3 in PetscLogView () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #17 0x0000155553e40eb5 in petsclogview_ () from
>> /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
>> > #18 0x0000000000402c8b in MAIN__ ()
>> > #19 0x00000000004023df in main ()
>> > dr. ir. Christiaan Klaij | senior researcher
>> > Research & Development | CFD Development
>> > T +31 317 49 33 44 | https://www.marin.nl
>> >
>> >
>>
>>
>