[petsc-dev] cuda-memcheck finds an error on Summit

Mark Adams mfadams at lbl.gov
Sun Sep 26 12:37:52 CDT 2021


FYI, I am getting the error below when running under cuda-memcheck on Summit with CUDA 11.0.3:

jsrun -n 48 -a 6 -c 6 -g 1 -r 6 --smpiargs -gpu cuda-memcheck ../ex13-cu
-dm_plex_box_faces 4,6,12 -petscpartitioner_simple_node_grid 2,2,2
-dm_plex_box_upper 2,3,6 -petscpartitioner_simple_process_grid 2,3,6
-dm_refine 3 -dm_mat_type aijcusparse -dm_vec_type cuda -dm_view

This job runs with Kokkos, and it also runs on these 8 nodes with -dm_refine 2
instead of 3. The 1-node version of this test runs as well.
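
The rank layout implied by the command line above can be cross-checked with a little arithmetic. This is only a sketch. It assumes the usual jsrun meanings of -n (resource sets), -a (tasks per resource set), -g (GPUs per resource set) and -r (resource sets per node). It also assumes that the node grid counts nodes and the process grid counts ranks per node, which is a reading of the option names rather than something stated above. The variable names are made up for illustration.

```python
from math import prod

# Values copied from the jsrun command line in this report.
nrs = 48           # -n: number of resource sets (assumed meaning)
tasks_per_rs = 6   # -a: MPI tasks per resource set (assumed meaning)
gpus_per_rs = 1    # -g: GPUs per resource set (assumed meaning)
rs_per_host = 6    # -r: resource sets per node (assumed meaning)

node_grid = (2, 2, 2)      # -petscpartitioner_simple_node_grid
process_grid = (2, 3, 6)   # -petscpartitioner_simple_process_grid
box_faces = (4, 6, 12)     # -dm_plex_box_faces

nodes = nrs // rs_per_host                # nodes used by the job
ranks = nrs * tasks_per_rs                # total MPI ranks
ranks_per_node = ranks // nodes
ranks_per_gpu = tasks_per_rs // gpus_per_rs

# Assumption: node grid = nodes, process grid = ranks per node.
grids_match_nodes = nodes == prod(node_grid)
grids_match_ranks = ranks_per_node == prod(process_grid)

# Assumption: per direction, node grid times process grid gives the box faces.
faces_from_grids = tuple(n * p for n, p in zip(node_grid, process_grid))
grids_match_faces = faces_from_grids == box_faces

print(nodes, ranks, ranks_per_node, ranks_per_gpu)
print(grids_match_nodes, grids_match_ranks, grids_match_faces)
```

Under those assumptions the run uses 288 ranks on 8 nodes, with 6 ranks sharing each GPU, and the two partitioner grids multiply out to the box faces in every direction.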

Thanks,
Mark

[8]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[8]PETSC ERROR: GPU error
[8]PETSC ERROR: cuda error 715 (cudaErrorIllegalInstruction) : an illegal
instruction was encountered
[8]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[8]PETSC ERROR: Petsc Development GIT revision: v3.15.4-943-g83e1f11c26
 GIT Date: 2021-09-25 18:33:40 -0400
[8]PETSC ERROR: ../ex13-cu on a arch-summit-opt-gnu-cuda named f10n17 by
Unknown Sun Sep 26 13:29:03 2021
[8]PETSC ERROR: Configure options --with-fc=mpifort --with-cc=mpicc
--with-cxx=mpiCC --CFLAGS="-fPIC -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10
-DLANDAU_MAX_Q=4" --CXXFLAGS="-fPIC -g -DLANDAU_DIM=2
-DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4" --CUDAFLAGS="-g -Xcompiler
-DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4" --FCFLAGS="-fPIC
-g" --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --with-ssl=0 --with-batch=0
--with-mpiexec="jsrun -g1" --with-cuda=1 --with-cudac=nvcc
--with-cuda-arch=70 --download-p4est=1 --download-zlib
--with-blaslapack-lib="-L/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.1.0/netlib-lapack-3.9.1-e6vxode53ghsjrop2dfwlq77s3dvkr7t/lib64
-lblas -llapack" --with-x=0 --with-64-bit-indices=0 --with-debugging=0
PETSC_ARCH=arch-summit-opt-gnu-cuda
[8]PETSC ERROR: #1 PetscSFLinkSyncStream_CUDA() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/vec/is/sf/impls/basic/cuda/sfcuda.cu:872
[8]PETSC ERROR: #2 PetscSFLinkSyncStreamBeforeCallMPI() at
/gpfs/alpine/csc314/scratch/adams/petsc/include/../src/vec/is/sf/impls/basic/sfpack.h:344
[8]PETSC ERROR: #3 PetscSFLinkStartRequests_MPI() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/vec/is/sf/impls/basic/sfmpi.c:40
[8]PETSC ERROR: #4 PetscSFLinkStartCommunication() at
/gpfs/alpine/csc314/scratch/adams/petsc/include/../src/vec/is/sf/impls/basic/sfpack.h:267
[8]PETSC ERROR: #5 PetscSFBcastBegin_Basic() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/vec/is/sf/impls/basic/sfbasic.c:191
[8]PETSC ERROR: #6 PetscSFBcastWithMemTypeBegin() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/vec/is/sf/interface/sf.c:1493
[8]PETSC ERROR: #7 DMGlobalToLocalBegin() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/dm/interface/dm.c:2613
[8]PETSC ERROR: #8 SNESComputeJacobian_DMLocal() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/utils/dmlocalsnes.c:119
[8]PETSC ERROR: #9 SNESComputeJacobian() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/interface/snes.c:2824
[8]PETSC ERROR: #10 SNESSolve_KSPONLY() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/impls/ksponly/ksponly.c:43
[8]PETSC ERROR: #11 SNESSolve() at
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/interface/snes.c:4769
[8]PETSC ERROR: #12 main() at ex13.c:169