[petsc-users] problem with nested logging, standalone example

Zongze Yang yangzongze at gmail.com
Tue Jul 22 09:48:42 CDT 2025


Hi,
I encountered a similar issue with Firedrake when using the -log_view option with XML format on macOS. Below is the error message. The Firedrake code and the shell script used to run it are attached.

```

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: General MPI error
[0]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: PETSc Release Version 3.23.4, unknown
[0]PETSC ERROR: test.py with 2 MPI process(es) and PETSC_ARCH arch-firedrake-default on 192.168.10.51 by zzyang Tue Jul 22 22:24:05 2025
[0]PETSC ERROR: Configure options: PETSC_ARCH=arch-firedrake-default --COPTFLAGS="-O3 -march=native -mtune=native" --CXXOPTFLAGS="-O3 -march=native -mtune=native" --FOPTFLAGS="-O3 -mtune=native" --with-c2html=0 --with-debugging=0 --with-fortran-bindings=0 --with-shared-libraries=1 --with-strict-petscerrorcode --download-cmake --download-bison --download-fftw --download-mumps-avoid-mpi-in-place --with-hdf5-dir=/opt/homebrew --with-hwloc-dir=/opt/homebrew --download-metis --download-mumps --download-netcdf --download-pnetcdf --download-ptscotch --download-scalapack --download-suitesparse --download-superlu_dist --download-slepc --with-zlib --download-hpddm --download-libpng --download-ctetgen --download-tetgen --download-triangle --download-mmg --download-parmmg --download-p4est --download-eigen --download-hypre --download-pragmatic
[0]PETSC ERROR: #1 PetscLogNestedTreePrintLine() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:289
[0]PETSC ERROR: #2 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:383
[0]PETSC ERROR: #3 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #4 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #5 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #6 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #7 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #8 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #9 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #10 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #11 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #12 PetscLogNestedTreePrint() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384
[0]PETSC ERROR: #13 PetscLogNestedTreePrintTop() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:420
[0]PETSC ERROR: #14 PetscLogHandlerView_Nested_XML() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:443
[0]PETSC ERROR: #15 PetscLogHandlerView_Nested() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/lognested.c:405
[0]PETSC ERROR: #16 PetscLogHandlerView() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/interface/loghandler.c:342
[0]PETSC ERROR: #17 PetscLogView() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:2043
[0]PETSC ERROR: #18 PetscLogViewFromOptions() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:2084
[0]PETSC ERROR: #19 PetscFinalize() at /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/objects/pinit.c:1552
PetscFinalize() failed [error code: 98]
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 28986 on node 192.168.10.51 exiting
improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in the
job did. This can cause a job to hang indefinitely while it waits for
all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "prte_abort" and the mca
parameter prte_create_session_dirs is set to false. In this case, the
run-time cannot detect that the abort call was an abnormal
termination. Hence, the only error message you will receive is this
one.

This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).

You can avoid this message by specifying -quiet on the prterun command
line.
--------------------------------------------------------------------------

```
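
For reference, the run reduces to enabling the nested XML log viewer from the command line. A minimal sketch of the invocation (the attached script was scrubbed by the archive, so the exact flags here are a reconstruction, not the original):

```
# Assumed shape of the attached run script: two ranks, nested XML log output.
# -log_view :report_performance.xml:ascii_xml writes the nested log as XML
# when PetscFinalize is reached.
mpiexec -n 2 python test.py -log_view :report_performance.xml:ascii_xml
```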

Best wishes,
Zongze

From: petsc-users <petsc-users-bounces at mcs.anl.gov> on behalf of Klaij, Christiaan via petsc-users <petsc-users at mcs.anl.gov>
Date: Monday, July 14, 2025 at 15:58
To: Barry Smith <bsmith at petsc.dev>
Cc: PETSc users list <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] problem with nested logging, standalone example

@Junchao: yes, all with my ex2f.F90 variation on two or three cores

@Barry: it's really puzzling that you cannot reproduce. Can you try running it a dozen times in a row and look at the report_performance.xml file? When it hangs I see some NaNs, for instance here in the VecAXPY event:

```
<events>
    <event>
        <name>VecAXPY</name>
        <time>
            <avgvalue>0.00610203</avgvalue>
            <minvalue>0.</minvalue>
            <maxvalue>0.0122041</maxvalue>
            <minloc>1</minloc>
            <maxloc>0</maxloc>
        </time>
        <ncalls>
            <avgvalue>0.5</avgvalue>
            <minvalue>0.</minvalue>
            <maxvalue>1.</maxvalue>
            <minloc>1</minloc>
            <maxloc>0</maxloc>
        </ncalls>
    </event>
    <event>
        <name>self</name>
        <time>
            <value>-nan.</value>
        </time>
```

This is what I did in my latest attempt on the login node of our Rocky Linux 9 cluster:
1) download petsc-3.23.4.tar.gz from the petsc website
2) ./configure -prefix=~/petsc/install --with-cxx=0 --with-debugging=0 --with-mpi-dir=/cm/shared/apps/mpich/ge/gcc/64/3.4.2
3) adjust my example to this version of petsc (file is attached)
4) make ex2f-cklaij-dbg-v2
5) mpirun -n 2 ./ex2f-cklaij-dbg-v2

So the exact versions are: petsc-3.23.4, system mpich 3.4.2, system gcc 11.5.0
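
Since the hang is intermittent, a loop along these lines exposes it (a sketch: the count matches the "dozen times" above, and the 60-second timeout is an arbitrary choice to catch a hang):

```
# Repeat the run to expose the intermittent hang; GNU timeout kills
# any run stuck longer than 60 s so the loop keeps going.
for i in $(seq 1 12); do
  echo "run $i"
  timeout 60 mpirun -n 2 ./ex2f-cklaij-dbg-v2 || echo "run $i hung or failed"
done
```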

________________________________________
From: Barry Smith <bsmith at petsc.dev>
Sent: Friday, July 11, 2025 11:22 PM
To: Klaij, Christiaan
Cc: Junchao Zhang; PETSc users list
Subject: Re: [petsc-users] problem with nested logging, standalone example


  And yet we cannot reproduce.

  Please tell us the exact PETSc version and MPI implementation versions. And reattach your reproducing example. And exactly how you run it.


  Can you reproduce it on an "ordinary" machine, say a Mac or Linux laptop?

  Barry

  If I could reproduce the problem, here is how I would debug: I would use -start_in_debugger and then put breakpoints in places that seem problematic. Presumably I would end up with a hang with each MPI process in a "different place", and from that I may be able to determine how that happened.
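
A concrete sketch of that workflow (the breakpoint is only a guess at the failure site, taken from the reported stack):

```
# -start_in_debugger opens one gdb session (in an xterm) per MPI rank
mpirun -n 2 ./ex2f-cklaij-dbg-v2 -start_in_debugger
# then, in each gdb window:
#   (gdb) break PetscLogNestedTreePrintLine
#   (gdb) continue
```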



> On Jul 11, 2025, at 7:58 AM, Klaij, Christiaan <C.Klaij at marin.nl> wrote:
>
> In summary for future reference:
> - tested 3 different machines, two at Marin, one at the national HPC
> - tested 3 different MPI implementations (Intel MPI, Open MPI and MPICH)
> - tested Open MPI in both release and debug
> - tested 2 different compilers (Intel and GNU), both older and very recent versions
> - tested with the most basic config (./configure --with-cxx=0 --with-debugging=0 --download-mpich)
>
> All of these tests either segfault, hang, or error out at the call to PetscLogView.
>
> Chris
>
> ________________________________________
> From: Klaij, Christiaan <C.Klaij at marin.nl>
> Sent: Friday, July 11, 2025 10:10 AM
> To: Barry Smith; Junchao Zhang
> Cc: PETSc users list
> Subject: Re: [petsc-users] problem with nested logging, standalone example
>
> @Matt: no MPI errors indeed. I've tried with MPICH and I get the same hanging.
> @Barry: the two stack traces aren't exactly the same; see a sample with MPICH below.
>
> If it cannot be reproduced on your side, I'm afraid this is another dead end. Thanks anyway, I really appreciate all your help.
>
> Chris
>
> (gdb) bt
> #0  0x000015555033bc2e in MPIDI_POSIX_mpi_release_gather_gather.constprop.0 ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #1  0x000015555033db8a in MPIDI_POSIX_mpi_allreduce_release_gather ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #2  0x000015555033e70f in MPIR_Allreduce ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #3  0x000015555033f22e in PMPI_Allreduce ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #4  0x0000155553f85d69 in MPIU_Allreduce_Count (comm=-2080374782,
>    op=1476395020, dtype=1275072547, count=1, outbuf=0x7fffffffac70,
>    inbuf=0x7fffffffac60)
>    at /home/cklaij/petsc/petsc-3.23.4/src/sys/objects/pinit.c:1839
> #5  MPIU_Allreduce_Private (inbuf=inbuf@entry=0x7fffffffac60,
>    outbuf=outbuf@entry=0x7fffffffac70, count=count@entry=1,
>    dtype=dtype@entry=1275072547, op=op@entry=1476395020, comm=-2080374782)
>    at /home/cklaij/petsc/petsc-3.23.4/src/sys/objects/pinit.c:1869
> #6  0x0000155553f33dbe in PetscPrintXMLNestedLinePerfResults (
>    viewer=viewer@entry=0x458890, name=name@entry=0x155554ef6a0d 'mbps\000',
>    value=<optimized out>, minthreshold=minthreshold@entry=0,
>    maxthreshold=maxthreshold@entry=0.01,
>    minmaxtreshold=minmaxtreshold@entry=1.05)
>    at /home/cklaij/petsc/petsc-3.23.4/src/sys/logging/handler/impls/nested/xmlviewer.c:255
>
>
> (gdb) bt
> #0  0x000015554fed3b17 in clock_gettime@GLIBC_2.2.5 () from /lib64/libc.so.6
> #1  0x0000155550b0de71 in ofi_gettime_ns ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #2  0x0000155550b0dec9 in ofi_gettime_ms ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #3  0x0000155550b2fab5 in sock_cq_sreadfrom ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #4  0x00001555505ca6f7 in MPIDI_OFI_progress ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #5  0x0000155550591fe9 in progress_test ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #6  0x00001555505924a3 in MPID_Progress_wait ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #7  0x000015555043463e in MPIR_Wait_state ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #8  0x000015555052ec49 in MPIC_Wait ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #9  0x000015555053093e in MPIC_Sendrecv ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #10 0x00001555504bf674 in MPIR_Allreduce_intra_recursive_doubling ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
> #11 0x00001555505b61de in MPIDI_OFI_mpi_finalize_hook ()
>   from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12
>
> ________________________________________
> From: Barry Smith <bsmith at petsc.dev>
> Sent: Thursday, July 10, 2025 11:10 PM
> To: Junchao Zhang
> Cc: Klaij, Christiaan; PETSc users list
> Subject: Re: [petsc-users] problem with nested logging, standalone example
>
>
>  I cannot reproduce
>
> On Jul 10, 2025, at 3:46 PM, Junchao Zhang <junchao.zhang at gmail.com> wrote:
>
> Adding -mca coll_hcoll_enable 0 didn't change anything on my end. Strange.
>
> --Junchao Zhang
>
>
> On Thu, Jul 10, 2025 at 3:39 AM Klaij, Christiaan <C.Klaij at marin.nl> wrote:
> An additional clue perhaps: with the option OMPI_MCA_coll_hcoll_enable=0, the code does not hang but gives the error below.
>
> Chris
>
>
> $ mpirun -mca coll_hcoll_enable 0 -n 2 ./ex2f-cklaij-dbg -pc_type jacobi -ksp_monitor_short -ksp_gmres_cgs_refinement_type refine_always
> 0 KSP Residual norm 1.11803
> 1 KSP Residual norm 0.591608
> 2 KSP Residual norm 0.316228
> 3 KSP Residual norm < 1.e-11
> 0 KSP Residual norm 0.707107
> 1 KSP Residual norm 0.408248
> 2 KSP Residual norm < 1.e-11
> Norm of error < 1.e-12 iterations 3
> [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [1]PETSC ERROR: General MPI error
> [1]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer
> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [1]PETSC ERROR: Petsc Release Version 3.22.4, Mar 01, 2025
> [1]PETSC ERROR: ./ex2f-cklaij-dbg with 2 MPI process(es) and PETSC_ARCH on login1 by cklaij Thu Jul 10 10:33:33 2025
> [1]PETSC ERROR: Configure options: --prefix=/home/cklaij/ReFRESCO/trunk/install/extLibs --with-mpi-dir=/cm/shared/apps/openmpi/gcc/5.0.6-debug --with-x=0 --with-mpe=0 --with-debugging=0 --download-superlu_dist=https://updates.marin.nl/refresco/libs/superlu_dist-8.1.2.tar.gz --with-blaslapack-dir=/cm/shared/apps/oneapi/2024.2.1/mkl/2024.2 --download-parmetis=https://updates.marin.nl/refresco/libs/parmetis-4.0.3-p9.tar.gz --download-metis=https://updates.marin.nl/refresco/libs/metis-5.1.0-p11.tar.gz --with-packages-build-dir=/home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG " COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG " FCFLAGS="-Wall -funroll-all-loops -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 -DNDEBUG" F90FLAGS="-Wall -funroll-all-loops -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 -DNDEBUG" FOPTFLAGS="-Wall -funroll-all-loops -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 -DNDEBUG"
> [1]PETSC ERROR: #1 PetscLogNestedTreePrintLine() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:289
> [1]PETSC ERROR: #2 PetscLogNestedTreePrint() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:377
> [1]PETSC ERROR: #3 PetscLogNestedTreePrint() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:384
> [1]PETSC ERROR: #4 PetscLogNestedTreePrintTop() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:420
> [1]PETSC ERROR: #5 PetscLogHandlerView_Nested_XML() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:443
> [1]PETSC ERROR: #6 PetscLogHandlerView_Nested() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/lognested.c:405
> [1]PETSC ERROR: #7 PetscLogHandlerView() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/interface/loghandler.c:342
> [1]PETSC ERROR: #8 PetscLogView() at /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/plog.c:2040
> [1]PETSC ERROR: #9 ex2f-cklaij-dbg.F90:301
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF
> Proc: [[55228,1],1]
> Errorcode: 98
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> prterun has exited due to process rank 1 with PID 0 on node login1 calling
> "abort". This may have caused other processes in the application to be
> terminated by signals sent by prterun (as reported here).
> --------------------------------------------------------------------------
>
> ________________________________________
>
>
> From: Klaij, Christiaan <C.Klaij at marin.nl>
> Sent: Thursday, July 10, 2025 10:15 AM
> To: Junchao Zhang
> Cc: PETSc users list
> Subject: Re: [petsc-users] problem with nested logging, standalone example
>
> Hi Junchao,
>
> Thanks for testing. I've fixed the error, but unfortunately that doesn't change the behavior: the code still hangs as before, with the same stack trace...
>
> Chris
>
> ________________________________________
> From: Junchao Zhang <junchao.zhang at gmail.com>
> Sent: Tuesday, July 8, 2025 10:58 PM
> To: Klaij, Christiaan
> Cc: PETSc users list
> Subject: Re: [petsc-users] problem with nested logging, standalone example
>
> Hi, Chris,
> First, I had to fix an error in your test by adding "PetscCallA(MatSetFromOptions(AA,ierr))" at line 254:
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Object is in wrong state
> [0]PETSC ERROR: Mat object's type is not set: Argument # 1
> ...
> [0]PETSC ERROR: #1 MatSetValues() at /scratch/jczhang/petsc/src/mat/interface/matrix.c:1503
> [0]PETSC ERROR: #2 ex2f.F90:258
>
> Then I could run the test without problems:
> mpirun -n 2 ./ex2f -pc_type jacobi -ksp_monitor_short -ksp_gmres_cgs_refinement_type refine_always
> 0 KSP Residual norm 1.11803
> 1 KSP Residual norm 0.591608
> 2 KSP Residual norm 0.316228
> 3 KSP Residual norm < 1.e-11
> 0 KSP Residual norm 0.707107
> 1 KSP Residual norm 0.408248
> 2 KSP Residual norm < 1.e-11
> Norm of error < 1.e-12 iterations 3
>
> I used petsc-3.22.4, gcc-11.3, openmpi-5.0.6 and configured with
> ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-openmpi --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG " COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG " FCFLAGS="-Wall -funroll-all-loops -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 -DNDEBUG" F90FLAGS="-Wall -funroll-all-loops -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 -DNDEBUG" FOPTFLAGS="-Wall -funroll-all-loops -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 -DNDEBUG"
>
> Could you fix the error and retry?
>
> --Junchao Zhang
>
>
> On Sun, Jul 6, 2025 at 12:57 PM Klaij, Christiaan via petsc-users <petsc-users at mcs.anl.gov> wrote:
> Attached is a standalone example of the issue described in the
> earlier thread "problem with nested logging". The issue appeared
> somewhere between petsc 3.19.4 and 3.23.4.
>
> The example is a variation of ../ksp/tutorials/ex2f.F90, where
> I've added the nested log viewer with one event as well as the
> solution of a small system on rank zero.
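>
> For reference, the same XML report can be produced without the explicit
> PetscLogView call, via the option form (a sketch, assuming the standard
> nested-viewer option syntax):
>
>   mpirun -n 2 ./ex2f-cklaij-dbg -log_view :report_performance.xml:ascii_xml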
>
> When running on multiple procs the example hangs during
> PetscLogView with the backtrace below. The configure.log is also
> attached in the hope that you can replicate the issue.
>
> Chris
>
>
> #0 0x000015554c84ea9e in mca_pml_ucx_recv (buf=0x7fffffff9e30, count=1,
> datatype=0x15554c9ef900 <ompi_mpi_2dblprec>, src=1, tag=-12,
> comm=0x7f1e30, mpi_status=0x0) at pml_ucx.c:700
> #1 0x000015554c65baff in ompi_coll_base_allreduce_intra_recursivedoubling (
> sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1,
> dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
> op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630)
> at base/coll_base_allreduce.c:247
> #2 0x000015554c6a7e40 in ompi_coll_tuned_allreduce_intra_do_this (
> sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1,
> dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
> op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630,
> algorithm=3, faninout=0, segsize=0) at coll_tuned_allreduce_decision.c:142
> #3 0x000015554c6a054f in ompi_coll_tuned_allreduce_intra_dec_fixed (
> sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1,
> dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
> op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630)
> at coll_tuned_decision_fixed.c:216
> #4 0x000015554c68e160 in mca_coll_hcoll_allreduce (sbuf=0x7fffffff9e20,
> rbuf=0x7fffffff9e30, count=1, dtype=0x15554c9ef900 <ompi_mpi_2dblprec>,
> op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaecb80)
> at coll_hcoll_ops.c:217
> #5 0x000015554c59811a in PMPI_Allreduce (sendbuf=0x7fffffff9e20,
> recvbuf=0x7fffffff9e30, count=1, datatype=0x15554c9ef900 <ompi_mpi_2dblprec>, op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30) at allreduce.c:123
> #6 0x0000155553eabede in MPIU_Allreduce_Private () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #7 0x0000155553e50d08 in PetscPrintXMLNestedLinePerfResults () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #8 0x0000155553e5123e in PetscLogNestedTreePrintLine () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #9 0x0000155553e51f3a in PetscLogNestedTreePrint () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #10 0x0000155553e51e96 in PetscLogNestedTreePrint () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #11 0x0000155553e51e96 in PetscLogNestedTreePrint () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #12 0x0000155553e52142 in PetscLogNestedTreePrintTop () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #13 0x0000155553e5257b in PetscLogHandlerView_Nested_XML () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #14 0x0000155553e4e5a0 in PetscLogHandlerView_Nested () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #15 0x0000155553e56232 in PetscLogHandlerView () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #16 0x0000155553e588c3 in PetscLogView () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #17 0x0000155553e40eb5 in petsclogview_ () from /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22
> #18 0x0000000000402c8b in MAIN__ ()
> #19 0x00000000004023df in main ()
>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: code.tar.gz
Type: application/x-gzip
Size: 1297 bytes
Desc: code.tar.gz
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20250722/e0bd059d/attachment-0001.gz>

