[petsc-users] Error running src/snes/tutorials/ex19 on Nvidia Tesla K40m : CUDA ERROR (code = 101, invalid device ordinal)

Juan Pablo de Lima Costa Salazar jp.salazar at pm.me
Thu Jul 14 15:35:28 CDT 2022


Thank you Barry and Stefano,

Below is the output from the example, which I ran with an added option (-use_gpu_aware_mpi 0) since my MPI is not GPU-aware. I believe this may be responsible for the error. The reason I chose to compile with the option

>>> --download-hypre-configure-arguments=--enable-unified-memory \

is because it was in config/examples/arch-ci-linux-cuda-pkgs.py. There are several other examples, and I had no particular reason for choosing this one other than that it uses hypre. I didn’t think too much about it. After recompiling without this option, the example ran successfully. I will look into building Open MPI with CUDA support.
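
For anyone repeating this comparison, a quick way to tell the failing run from the successful one is to search the captured -info log for the CUDA error line. The following is only a minimal sketch: the function name check_cuda_errors is hypothetical (it is not part of PETSc), and it assumes the log was written to a file such as log.ex19, as in the commands shown further down.

```shell
#!/bin/sh
# Hypothetical helper (not part of PETSc): scan a captured -info log
# for CUDA error lines.
# Returns 0 if none are found, 1 if at least one is found (the matching
# lines are printed with their line numbers), 2 if the file is unreadable.
check_cuda_errors() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    if grep -n "CUDA ERROR" "$log_file"; then
        return 1
    fi
    echo "no CUDA errors found in $log_file"
    return 0
}
```

It would be called as `check_cuda_errors log.ex19` after each run; a return status of 1 indicates that the log contains at least one CUDA error line.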

Thanks!

For the sake of reference:

With the --download-hypre-configure-arguments=--enable-unified-memory option

$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -use_gpu_aware_mpi 0 -info > log.ex19 2>&1

[0] PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda supported, initializing
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not supported
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not supported
[0] PetscInitialize_Common(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] PetscInitialize_Common(): Running on machine: node021
[0] PetscCommDuplicate(): Duplicating a communicator 140679929097504 30289408 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 30289408
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 30289408
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 30289408
[0] PetscCommDuplicate(): Duplicating a communicator 140679929096992 30157712 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] DMGetDMSNES(): Creating new DMSNES
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] configure(): Configured device 0
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2500 X 2500; storage space: 0 unneeded,48400 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 20
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 2500) < 0.6. Do not use CompressedRow routines.
[0] DMGetDMKSP(): Creating new DMKSP
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscDeviceContextSetupGlobalContext_Private(): Initializing global PetscDeviceContext
0 SNES Function norm 0.0406612
[0] ISColoringCreate(): Number of colors 20
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] MatFDColoringSetUp_SeqXAIJ(): ncolors 20, brows 66 and bcols 15 are used.
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC for first time
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C (seqaijcusparse) -> 1
CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[51372,1],0]
Exit code: 1
--------------------------------------------------------------------------

Without the --download-hypre-configure-arguments=--enable-unified-memory option

$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -use_gpu_aware_mpi 0 -info > log.ex19 2>&1

[0] PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda supported, initializing
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not supported
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not supported
[0] PetscInitialize_Common(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] PetscInitialize_Common(): Running on machine: node021
[0] PetscCommDuplicate(): Duplicating a communicator 140322706697504 29662720 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 29662720
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 29662720
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 29662720
[0] PetscCommDuplicate(): Duplicating a communicator 140322706696992 29531024 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] DMGetDMSNES(): Creating new DMSNES
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] configure(): Configured device 0
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2500 X 2500; storage space: 0 unneeded,48400 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 20
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 2500) < 0.6. Do not use CompressedRow routines.
[0] DMGetDMKSP(): Creating new DMKSP
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscDeviceContextSetupGlobalContext_Private(): Initializing global PetscDeviceContext
0 SNES Function norm 0.0406612
[0] ISColoringCreate(): Number of colors 20
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] MatFDColoringSetUp_SeqXAIJ(): ncolors 20, brows 66 and bcols 15 are used.
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC for first time
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C (seqaijcusparse) -> 1
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 3.001654795047e-07 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 4.066115181565e-02 at iteration 33
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] SNESSolve_NEWTONLS(): iter=0, linear solve iterations=33
[0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 2.890238131751e+01 near zero implies inconsistent rhs
[0] PetscSplitReductionGet(): Putting reduction data in an MPI_Comm 29662720
[0] SNESLineSearchApply_BT(): Initial fnorm 4.066115181565e-02 gnorm 3.338338626166e-06
[0] SNESSolve_NEWTONLS(): fnorm=4.0661151815649638e-02, gnorm=3.3383386261659113e-06, ynorm=5.4373378910396353e-01, lssucceed=0
1 SNES Function norm 3.33834e-06
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC with same nonzero pattern
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C (seqaijcusparse) -> 1
[0] PetscCommGetComm(): Reusing a communicator 29662720 68829840
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 2.753325754967e-11 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 3.338338626166e-06 at iteration 29
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged
[0] SNESSolve_NEWTONLS(): iter=1, linear solve iterations=29
[0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 3.172675080131e+01 near zero implies inconsistent rhs
[0] SNESLineSearchApply_BT(): Initial fnorm 3.338338626166e-06 gnorm 2.754150439906e-11
[0] SNESSolve_NEWTONLS(): fnorm=3.3383386261659113e-06, gnorm=2.7541504399056686e-11, ynorm=1.6805315020558734e-05, lssucceed=0
2 SNES Function norm 2.754e-11
[0] SNESConvergedDefault(): Converged due to function norm 2.754150439906e-11 < 4.066115181565e-10 (relative tolerance)
Number of SNES iterations = 2
[0] Petsc_OuterComm_Attr_Delete_Fn(): Removing reference to PETSc communicator embedded in a user MPI_Comm 29531024
[0] Petsc_InnerComm_Attr_Delete_Fn(): User MPI_Comm 140322706696992 is being unlinked from inner PETSc comm 29531024
[0] PetscCommDestroy(): Deleting PETSc MPI_Comm 29531024
[0] Petsc_Counter_Attr_Delete_Fn(): Deleting counter data in an MPI_Comm 29531024
[0] PetscFinalize(): PetscFinalize() called
[0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 29662720
[0] Petsc_OuterComm_Attr_Delete_Fn(): Removing reference to PETSc communicator embedded in a user MPI_Comm 29662720
[0] Petsc_InnerComm_Attr_Delete_Fn(): User MPI_Comm 140322706697504 is being unlinked from inner PETSc comm 29662720
[0] PetscCommDestroy(): Deleting PETSc MPI_Comm 29662720
[0] Petsc_DelReduction(): Deleting reduction data in an MPI_Comm 29662720
[0] Petsc_Counter_Attr_Delete_Fn(): Deleting counter data in an MPI_Comm 29662720

> On Jul 14, 2022, at 1:56 PM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
>
> You don't need unified memory for boomeramg to work.
>
> On Thu, Jul 14, 2022, 18:55 Barry Smith <bsmith at petsc.dev> wrote:
>
>> So the PETSc tests all run, including the test that uses a GPU.
>>
>> The hypre test is failing. It is impossible to tell from the output why.
>>
>> You can run it manually, cd src/snes/tutorials
>>
>> make ex19
>> mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -info > somefile
>>
>> then take a look at the output in somefile and send it to us.
>>
>> Barry
>>
>>> On Jul 14, 2022, at 12:32 PM, Juan Pablo de Lima Costa Salazar via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>
>>> Hello,
>>>
>>> I was hoping to get help regarding a runtime error I am encountering on a cluster node with 4 Tesla K40m GPUs after configuring PETSc with the following command:
>>>
>>> $ ./configure --force \
>>> --with-precision=double \
>>> --with-debugging=0 \
>>> --COPTFLAGS=-O3 \
>>> --CXXOPTFLAGS=-O3 \
>>> --FOPTFLAGS=-O3 \
>>> PETSC_ARCH=linux64GccDPInt32-spack \
>>> --download-fblaslapack \
>>> --download-openblas \
>>> --download-hypre \
>>> --download-hypre-configure-arguments=--enable-unified-memory \
>>> --with-mpi-dir=/opt/ohpc/pub/mpi/openmpi4-gnu9/4.0.4 \
>>> --with-cuda=1 \
>>> --download-suitesparse \
>>> --download-dir=downloads \
>>> --with-cudac=/opt/ohpc/admin/spack/0.15.0/opt/spack/linux-centos8-ivybridge/gcc-9.3.0/cuda-11.7.0-hel25vgwc7fixnvfl5ipvnh34fnskw3m/bin/nvcc \
>>> --with-packages-download-dir=downloads \
>>> --download-sowing=downloads/v1.1.26-p4.tar.gz \
>>> --with-cuda-arch=35
>>>
>>> When I run
>>>
>>> $ make PETSC_DIR=/home/juan/OpenFOAM/juan-v2206/petsc-cuda PETSC_ARCH=linux64GccDPInt32-spack check
>>> Running check examples to verify correct installation
>>> Using PETSC_DIR=/home/juan/OpenFOAM/juan-v2206/petsc-cuda and PETSC_ARCH=linux64GccDPInt32-spack
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
>>> 3,5c3,15
>>> < 1 SNES Function norm 4.12227e-06
>>> < 2 SNES Function norm 6.098e-11
>>> < Number of SNES iterations = 2
>>> ---
>>>> CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
>>>> CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
>>>> --------------------------------------------------------------------------
>>>> Primary job terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> --------------------------------------------------------------------------
>>>> --------------------------------------------------------------------------
>>>> mpiexec detected that one or more processes exited with non-zero status, thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>> Process name: [[52712,1],0]
>>>> Exit code: 1
>>>> --------------------------------------------------------------------------
>>> /home/juan/OpenFOAM/juan-v2206/petsc-cuda/src/snes/tutorials
>>> Possible problem with ex19 running with hypre, diffs above
>>> =========================================
>>> C/C++ example src/snes/tutorials/ex19 run successfully with cuda
>>> C/C++ example src/snes/tutorials/ex19 run successfully with suitesparse
>>> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
>>> Completed test examples
>>>
>>> I have compiled the code on the head node (without GPUs) and on the compute node where there are 4 GPUs.
>>>
>>> $ nvidia-debugdump -l
>>> Found 4 NVIDIA devices
>>> Device ID: 0
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032250
>>>
>>> Device ID: 1
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717031968
>>>
>>> Device ID: 2
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032246
>>>
>>> Device ID: 3
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032235
>>>
>>> Attached are the log files from configure and make.
>>>
>>> Any pointers are highly appreciated. My intention is to use PETSc as a linear solver for OpenFOAM, leveraging the availability of GPUs at the same time. Currently I can run PETSc without GPU support.
>>>
>>> Cheers,
>>> Juan S.
>>>
>>> <configure.log.tar.gz><make.log.tar.gz>