[petsc-users] configure error

Barry Smith bsmith at petsc.dev
Sun May 16 22:14:55 CDT 2021


Could still be a gencode arch issue. Is it possible that Kokkos was built with the 80 arch and when you reran configure with 70 it did not rebuild Kokkos because it didn't know it needed to?

Sorry, but this may require another rm -rf arch* and running ./configure again.

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg3f51e3575c2178246db0a94a430e0038b6af535e7e53d3f21e2437e8977b8c2e <https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg3f51e3575c2178246db0a94a430e0038b6af535e7e53d3f21e2437e8977b8c2e>


cudaErrorInvalidDeviceFunction = 98
The requested device function does not exist or is not compiled for the proper device architecture.



> On May 16, 2021, at 9:09 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> I now get this error. A blas error from VecAXPBYPCZ ...
> Any ideas?
> 
> 
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  cudaFuncGetAttributes(&attr_tmp, base_t::get_kernel_func()) error( cudaErrorInvalidDeviceFunction): invalid device function /global/u2/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/include/Cuda/Kokkos_Cuda_KernelLaunch.hpp:654
> Traceback functionality not available
> 
> [cgpu16:55192] *** Process received signal ***
> [cgpu16:55192] Signal: Aborted (6)
> [cgpu16:55192] Signal code:  (-6)
> [cgpu16:55192] [ 0] /lib64/libpthread.so.0(+0x12360)[0x2aab12445360]
> [cgpu16:55192] [ 1] /lib64/libc.so.6(gsignal+0x110)[0x2aab12687160]
> [cgpu16:55192] [ 2] /lib64/libc.so.6(abort+0x151)[0x2aab12688741]
> [cgpu16:55192] [ 3] /usr/common/software/sles15_cgpu/gcc/8.3.0/lib64/libstdc++.so.6(+0x93e83)[0x2aab10cb0e83]
> [cgpu16:55192] [ 4] /usr/common/software/sles15_cgpu/gcc/8.3.0/lib64/libstdc++.so.6(+0x99de6)[0x2aab10cb6de6]
> [cgpu16:55192] [ 5] /usr/common/software/sles15_cgpu/gcc/8.3.0/lib64/libstdc++.so.6(+0x99e21)[0x2aab10cb6e21]
> [cgpu16:55192] [ 6] /usr/common/software/sles15_cgpu/gcc/8.3.0/lib64/libstdc++.so.6(+0x9a053)[0x2aab10cb7053]
> [cgpu16:55192] [ 7] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libkokkoscore.so.3.4(+0x26a7f)[0x2aaabbcb3a7f]
> [cgpu16:55192] [ 8] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libkokkoscore.so.3.4(_ZN6Kokkos4Impl25cuda_internal_error_throwE9cudaErrorPKcS3_i+0x29d)[0x2aaabbcdab9d]
> [cgpu16:55192] [ 9] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libkokkoskernels.so(_ZN10KokkosBlas4Impl16V_Update_GenericIN6Kokkos4ViewIPKdJNS2_10LayoutLeftENS2_6DeviceINS2_4CudaENS2_9CudaSpaceEEENS2_12MemoryTraitsILj1EEEEEESD_NS3_IPdJS6_SA_SC_EEEiEEvRKNT_20non_const_value_typeERKSG_RKNT0_20non_const_value_typeERKSM_RKNT1_20non_const_value_typeERKSS_iii+0x3357)[0x2aaaae7108a7]
> [cgpu16:55192] [10] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libkokkoskernels.so(_ZN10KokkosBlas4Impl6UpdateIN6Kokkos4ViewIPKdJNS2_10LayoutLeftENS2_6DeviceINS2_4CudaENS2_9CudaSpaceEEENS2_12MemoryTraitsILj1EEEEEESD_NS3_IPdJS6_SA_SC_EEELi1ELb0ELb1EE6updateERS4_RKSD_SH_SJ_SH_RKSF_+0xc1)[0x2aaaae7171a1]
> [cgpu16:55192] [11] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(_ZN10KokkosBlas6updateIN6Kokkos4ViewIPKdJNS1_9CudaSpaceEEEES6_NS2_IPdJS5_EEEEEvRKNT_20non_const_value_typeERKS9_RKNT0_20non_const_value_typeERKSF_RKNT1_20non_const_value_typeERKSL_+0x271)[0x2aaaab76d781]
> [cgpu16:55192] [12] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(+0xa9333b)[0x2aaaab76633b]
> [cgpu16:55192] [13] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(VecAXPBYPCZ+0x261)[0x2aaaab0b03c1]
> [cgpu16:55192] [14] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(+0x155144e)[0x2aaaac22444e]
> [cgpu16:55192] [15] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(SNESTSFormFunction+0xa)[0x2aaaac1c9c1a]
> [cgpu16:55192] [16] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(SNESComputeFunction+0xf5)[0x2aaaac138675]
> [cgpu16:55192] [17] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(+0x14ac85e)[0x2aaaac17f85e]
> [cgpu16:55192] [18] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(SNESSolve+0x821)[0x2aaaac146651]
> [cgpu16:55192] [19] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(+0x155526c)[0x2aaaac22826c]
> [cgpu16:55192] [20] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(TSStep+0x1f5)[0x2aaaac1d6a05]
> [cgpu16:55192] [21] /global/homes/m/madams/petsc/arch-cori-gpu-opt-kokkos-gcc/lib/libpetsc.so.3.015(TSSolve+0x6a5)[0x2aaaac1dc455]
> [cgpu16:55192] [22] ../ex2-kok[0x4033eb]
> [cgpu16:55192] [23] /lib64/libc.so.6(__libc_start_main+0xea)[0x2aab12671f8a]
> [cgpu16:55192] [24] ../ex2-kok[0x404aaa]
> [cgpu16:55192] *** End of error message ***
> /global/homes/m/madams/mps-wrapper.sh: line 30: 55192 Aborted                 "$@"
> 0 stopping nvidia-cuda-mps-control on cgpu16

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20210516/96d6fd1a/attachment.html>


More information about the petsc-users mailing list