[petsc-users] cuda gpu eager initialization error cudaErrorNotSupported

Mark Lohry mlohry at gmail.com
Thu Jan 5 14:42:34 CST 2023


I'm trying to build PETSc using the CUDA example configuration

./config/examples/arch-ci-linux-cuda-double-64idx.py
--with-cudac=/usr/local/cuda-11.5/bin/nvcc

and running make test passes the lazy variant of the test:

ok diff-sys_objects_device_tests-ex1_host_with_device+nsize-1device_enable-lazy

but the eager variant fails; its output is pasted below.

I get a similar error when running my client (solver) code, pasted after the
test output. There, running with -info shows that some lazy initialization
happens first, and I also call VecCreateSeqCUDA, which seems to have no issue.

Any ideas? This is on an sm_35 (compute capability 3.5) device, if that
matters; otherwise it's a recent CUDA compiler and driver.
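
To compare the two initialization modes outside the test harness, the test binary can be run directly with the options that appear in the option table of the failing run (-default_device_type host, -device_enable eager), plus -info. This is a minimal sketch, not the harness's own invocation: it assumes the binary is built as ./ex1 in the current directory and that the lazy variant differs only in the -device_enable value. EX1 and device_cmd are hypothetical names used for illustration.

```shell
#!/bin/sh
# Path to the PETSc device test binary; ./ex1 is an assumption, adjust as needed.
EX1="${EX1:-./ex1}"

# Build the command line for a given -device_enable mode (lazy or eager).
device_cmd() {
  printf '%s -default_device_type host -device_enable %s -info\n' "$EX1" "$1"
}

for mode in lazy eager; do
  cmd=$(device_cmd "$mode")
  echo "== $cmd"
  if [ -x "$EX1" ]; then
    # Run the binary; report a failure but keep going so both modes are shown.
    $cmd || echo "-device_enable $mode exited with status $?"
  else
    echo "(skipped: $EX1 not found or not executable)"
  fi
done
```

If the eager run fails inside PetscInitialize while the lazy run succeeds, the failure is tied to device initialization itself rather than to anything the test does afterwards.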


petsc test code output:



not ok sys_objects_device_tests-ex1_host_with_device+nsize-1device_enable-eager # Error code: 97
# [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
# [0]PETSC ERROR: GPU error
# [0]PETSC ERROR: cuda error 801 (cudaErrorNotSupported) : operation not supported
# [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
# [0]PETSC ERROR: Petsc Release Version 3.18.3, Dec 28, 2022
# [0]PETSC ERROR: ../ex1 on a  named lancer by mlohry Thu Jan  5 15:22:33 2023
# [0]PETSC ERROR: Configure options --package-prefix-hash=/home/mlohry/petsc-hash-pkgs --with-make-test-np=2 --download-openmpi=1 --download-hypre=1 --download-hwloc=1 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" CXXOPTFLAGS="-g -O" --with-64-bit-indices=1 --with-cuda=1 --with-precision=double --with-clanguage=c --with-cudac=/usr/local/cuda-11.5/bin/nvcc PETSC_ARCH=arch-ci-linux-cuda-double-64idx
# [0]PETSC ERROR: #1 CUPMAwareMPI_() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:194
# [0]PETSC ERROR: #2 initialize() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:71
# [0]PETSC ERROR: #3 init_device_id_() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:290
# [0]PETSC ERROR: #4 getDevice() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/../impls/host/../impldevicebase.hpp:99
# [0]PETSC ERROR: #5 PetscDeviceCreate() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:104
# [0]PETSC ERROR: #6 PetscDeviceInitializeDefaultDevice_Internal() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:375
# [0]PETSC ERROR: #7 PetscDeviceInitializeTypeFromOptions_Private() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:499
# [0]PETSC ERROR: #8 PetscDeviceInitializeFromOptions_Internal() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:634
# [0]PETSC ERROR: #9 PetscInitialize_Common() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/pinit.c:1001
# [0]PETSC ERROR: #10 PetscInitialize() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/pinit.c:1267
# [0]PETSC ERROR: #11 main() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/tests/ex1.c:12
# [0]PETSC ERROR: PETSc Option Table entries:
# [0]PETSC ERROR: -default_device_type host
# [0]PETSC ERROR: -device_enable eager
# [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------





solver code output:



[0] <sys> PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType host available, initializing
[0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDevice host initialized, default device id 0, view FALSE, init type lazy
[0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda available, initializing
[0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDevice cuda initialized, default device id 0, view FALSE, init type lazy
[0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not available
[0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not available
[0] <sys> PetscInitialize_Common(): PETSc successfully started: number of processors = 1
[0] <sys> PetscGetHostName(): Rejecting domainname, likely is NIS lancer.(none)
[0] <sys> PetscInitialize_Common(): Running on machine: lancer
# [Info] Petsc initialization complete.
# [Trace] Timing: Starting solver...
# [Info] RNG initial conditions have mean 0.000004, renormalizing.
# [Trace] Timing: PetscTimeIntegrator initialization...
# [Trace] Timing: Allocating Petsc CUDA arrays...
[0] <sys> PetscCommDuplicate(): Duplicating a communicator 2 3 max tags = 100000000
[0] <sys> configure(): Configured device 0
[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 2 3
# [Trace] Timing: Allocating Petsc CUDA arrays finished in 0.015439 seconds.
[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 2 3
[0] <sys> PetscCommDuplicate(): Duplicating a communicator 1 4 max tags = 100000000
[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
[0] <dm> DMGetDMTS(): Creating new DMTS
[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
[0] <dm> DMGetDMSNES(): Creating new DMSNES
[0] <dm> DMGetDMSNESWrite(): Copying DMSNES due to write
# [Info] Initializing petsc with ode23 integrator
# [Trace] Timing: PetscTimeIntegrator initialization finished in 0.016754 seconds.

[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
[0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
[0] <device> PetscDeviceContextSetupGlobalContext_Private(): Initializing global PetscDeviceContext with device type cuda
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: GPU error
[0]PETSC ERROR: cuda error 801 (cudaErrorNotSupported) : operation not supported
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.18.3, Dec 28, 2022
[0]PETSC ERROR: maDG on a arch-linux2-c-opt named lancer by mlohry Thu Jan  5 15:39:14 2023
[0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/usr/bin/cc --with-cxx=/usr/bin/c++ --with-fc=0 --with-pic=1 --with-cxx-dialect=C++11 MAKEFLAGS=$MAKEFLAGS COPTFLAGS="-O3 -march=native" CXXOPTFLAGS="-O3 -march=native" --with-mpi=0 --with-debugging=no --with-cudac=/usr/local/cuda-11.5/bin/nvcc --with-cuda-arch=35 --with-cuda --with-cuda-dir=/usr/local/cuda-11.5/ --download-hwloc=1
[0]PETSC ERROR: #1 initialize() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cuda/../cupmcontext.hpp:255
[0]PETSC ERROR: #2 PetscDeviceContextCreate_CUDA() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cuda/cupmcontext.cu:10
[0]PETSC ERROR: #3 PetscDeviceContextSetDevice_Private() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/dcontext.cxx:244
[0]PETSC ERROR: #4 PetscDeviceContextSetDefaultDeviceForType_Internal() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/dcontext.cxx:259
[0]PETSC ERROR: #5 PetscDeviceContextSetupGlobalContext_Private() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/global_dcontext.cxx:52
[0]PETSC ERROR: #6 PetscDeviceContextGetCurrentContext() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/global_dcontext.cxx:84
[0]PETSC ERROR: #7 PetscDeviceContextGetCurrentContextAssertType_Internal() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/include/petsc/private/deviceimpl.h:371
[0]PETSC ERROR: #8 PetscCUBLASGetHandle() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cuda/cupmcontext.cu:23
[0]PETSC ERROR: #9 VecMAXPY_SeqCUDA() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu:261
[0]PETSC ERROR: #10 VecMAXPY() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/vec/vec/interface/rvector.c:1221
[0]PETSC ERROR: #11 TSStep_RK() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/ts/impls/explicit/rk/rk.c:814
[0]PETSC ERROR: #12 TSStep() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/ts/interface/ts.c:3424
[0]PETSC ERROR: #13 TSSolve() at /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/ts/interface/ts.c:3814
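
Both traces report the same CUDA error but from different call sites, which is easier to see with the long paths stripped. Below is a minimal sketch of a filter that keeps only the CUDA error line and the numbered stack-frame lines of a PETSc error log read from standard input. It assumes the log format shown in this message; petsc_trace_summary is a hypothetical helper name, not a PETSc tool.

```shell
#!/bin/sh
# Reduce a PETSc error trace (stdin) to the CUDA error line and frame names.
petsc_trace_summary() {
  # Keep "cuda error ..." and "#N function() at ..." lines only.
  grep -E 'PETSC ERROR: (cuda error|#[0-9]+ )' |
    # Drop the "[rank]PETSC ERROR: " prefix (with the optional "# " added by
    # the test harness), then drop the trailing " at <path>" if present.
    sed -E -e 's/^(# )?\[[0-9]+\]PETSC ERROR: //' -e 's/ at( .*)?$//'
}
```

With the output saved to a file (for example a hypothetical solver.log), `petsc_trace_summary < solver.log` prints the error line followed by one function name per frame, innermost first.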