[petsc-users] compilation error with latest petsc source

Barry Smith bsmith at petsc.dev
Fri Jul 19 18:58:47 CDT 2024


   It is unlikely, though of course possible, that the problem comes from the Fortran code. Is there any way to ./configure/build the code in the same way on another system that is easier to debug on? Or with fewer options on Frontier (for example, without the optimization flags and the extra -lxpmem etc.) and see if it still crashes in the same way? Frontier is very flaky.
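
   For instance (a sketch only, based on the configure line quoted below; the Cray wrappers and GPU-aware MPI may still require some of those link options, so adjust as needed):

   ./configure --with-debugging=1 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --download-kokkos --download-kokkos-kernels --download-hypre --download-suitesparse --download-cmake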

   Barry


> On Jul 19, 2024, at 3:37 PM, Vanella, Marcos (Fed) <marcos.vanella at nist.gov> wrote:
> 
> Hi Barry, with the changes in place for my Fortran calls, I'm now picking up the following error running the PC + gamg preconditioner with mpiaijkokkos matrices and kokkos vectors:
> 
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
> [0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> [0]PETSC ERROR: The line numbers in the error traceback are not always exact.
> [0]PETSC ERROR: #1 MPI function
> [0]PETSC ERROR: #2 PetscSFLinkFinishCommunication_Default() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/impls/basic/sfmpi.c:13
> [0]PETSC ERROR: #3 PetscSFLinkFinishCommunication() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/../src/vec/is/sf/impls/basic/sfpack.h:291
> [0]PETSC ERROR: #4 PetscSFBcastEnd_Basic() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/impls/basic/sfbasic.c:373
> [0]PETSC ERROR: #5 PetscSFBcastEnd() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/sf.c:1540
> [0]PETSC ERROR: #6 VecScatterEnd_Internal() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/vscat.c:95
> [0]PETSC ERROR: #7 VecScatterEnd() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/vscat.c:1352
> [0]PETSC ERROR: #8 MatDiagonalScale_MPIAIJ() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/impls/aij/mpi/mpiaij.c:1990
> [0]PETSC ERROR: #9 MatDiagonalScale() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/interface/matrix.c:5691
> [0]PETSC ERROR: #10 MatCreateGraph_Simple_AIJ() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/impls/aij/mpi/mpiaij.c:8026
> [0]PETSC ERROR: #11 MatCreateGraph() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/interface/matrix.c:11426
> [0]PETSC ERROR: #12 PCGAMGCreateGraph_AGG() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/agg.c:663
> [0]PETSC ERROR: #13 PCGAMGCreateGraph() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/gamg.c:2041
> [0]PETSC ERROR: #14 PCSetUp_GAMG() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/gamg.c:695
> [0]PETSC ERROR: #15 PCSetUp() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/interface/precon.c:1077
> [0]PETSC ERROR: #16 KSPSetUp() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:415
> [0]PETSC ERROR: #17 KSPSolve_Private() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:826
> [0]PETSC ERROR: #18 KSPSolve() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:1073
> MPICH ERROR [Rank 0] [job id 2109802.0] [Fri Jul 19 15:31:18 2024] [frontier03726] - Abort(59) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> 
> I set up PETSc with the gnu compilers like this on Frontier:
> ./configure COPTFLAGS="-O2" CXXOPTFLAGS="-O2" FOPTFLAGS="-O2" FCOPTFLAGS="-O2" HIPOPTFLAGS="-O2 --offload-arch=gfx90a" --with-debugging=1 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hip-arch=gfx908 --with-hipc=hipcc   --LIBS="-L${MPICH_DIR}/lib -lmpi ${CRAY_XPMEM_POST_LINK_OPTS} -lxpmem ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-hypre --download-suitesparse --download-cmake --force
>  
> Have you guys come across this before? Thank you for your time,
> Marcos
> 
> From: Vanella, Marcos (Fed) <marcos.vanella at nist.gov>
> Sent: Friday, July 19, 2024 12:54 PM
> To: Barry Smith <bsmith at petsc.dev>
> Cc: petsc-users at mcs.anl.gov; Patel, Saumil Sudhir <spatel at anl.gov>
> Subject: Re: [petsc-users] compilation error with latest petsc source
>  
> Thank you Barry! We'll make the changes accordingly.
> M
> From: Barry Smith <bsmith at petsc.dev>
> Sent: Friday, July 19, 2024 12:42 PM
> To: Vanella, Marcos (Fed) <marcos.vanella at nist.gov>
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] compilation error with latest petsc source
>  
> 
>    We made some superficial changes to the Fortran API to better support Fortran and its error checking. See the bottom of https://petsc.org/main/changes/dev/
>    Basically, you have to respect Fortran's pickiness about passing the correct dimension (or lack of dimension) of arguments. In the error below, you need to pass PETSC_NULL_INTEGER_ARRAY.
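> 
>    For example, the failing call in pres.f90 quoted below would become something like (a sketch; only the null argument changes):
> 
>    CALL MATCREATESEQAIJ(PETSC_COMM_SELF,ZM%NUNKH,ZM%NUNKH,NNZ_7PT_H,PETSC_NULL_INTEGER_ARRAY,ZM%PETSC_MZ%A_H,PETSC_IERR)
> 
>    i.e., PETSC_NULL_INTEGER_ARRAY in place of PETSC_NULL_INTEGER wherever the documented argument is an integer array rather than a single integer.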
> 
>> On Jul 19, 2024, at 12:20 PM, Vanella, Marcos (Fed) via petsc-users <petsc-users at mcs.anl.gov> wrote:
>> 
>> Hi, I updated and compiled PETSc on Frontier with the gnu compilers. When compiling my code with PETSc I see this new error pop up:
>> 
>> Building mpich_gnu_frontier
>> ftn -c -m64 -O2 -g  -std=f2018 -frecursive -ffpe-summary=none -fall-intrinsics -cpp -DGITHASH_PP=\"FDS-6.9.1-894-g0b77ae0-FireX\" -DGITDATE_PP=\""Thu Jul 11 16:05:44 2024 -0400\"" -DBUILDDATE_PP=\""Jul 19, 2024  12:13:39\""   -DWITH_PETSC -I"/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/" -I"/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc2/include"  -fopenmp ../../Source/pres.f90
>> ../../Source/pres.f90:2799:65:
>> 
>>  2799 | CALL MATCREATESEQAIJ(PETSC_COMM_SELF,ZM%NUNKH,ZM%NUNKH,NNZ_7PT_H,PETSC_NULL_INTEGER,ZM%PETSC_MZ%A_H,PETSC_IERR)
>>       |                                                                 1
>> Error: Rank mismatch in argument ‘e’ at (1) (rank-1 and scalar)
>> 
>> It seems the use of PETSC_NULL_INTEGER is causing an issue now. From the PETSc docs this argument is nnz, which can be an array or NULL. Has there been any change in the API for this routine?
>> 
>> Thanks,
>> Marcos
>> 
>> PS: I see some other errors of the same type in calls to other PETSc routines.
