[petsc-dev] kokkos test fail in branch

Barry Smith bsmith at petsc.dev
Sat Jun 5 21:42:33 CDT 2021


  Looks like the MPI libraries are not being passed to the NVCC internal compiler (gcc). This would normally be set up in MPI.py; please send configure.log.
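
  One rough way to confirm this from the quoted link line is to list its -l flags and see whether any MPI library is among them. The Python sketch below is illustrative only: the helper names are hypothetical, the sample string is the tail of the link line quoted below, and the library name mpi_ibm is taken from the libmpi_ibm.so.3 path in the linker error. It is an assumption about what a complete link line might contain, not what configure actually emits.

```python
import shlex

# nvcc options that begin with "-l" but are not library requests.
NON_LIBRARY_FLAGS = {"-lineinfo"}


def linked_libraries(link_command):
    """Return the library names requested with -l flags, in order."""
    libs = []
    for token in shlex.split(link_command):
        if token in NON_LIBRARY_FLAGS:
            continue
        if token.startswith("-l") and len(token) > 2:
            libs.append(token[2:])
    return libs


def has_mpi_library(link_command):
    """True if any -l flag names a library starting with 'mpi'."""
    return any(name.startswith("mpi") for name in linked_libraries(link_command))


# Tail of the failing link line from the quoted message.
FAILING_TAIL = (
    "-lpetsc -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc "
    "-lblas -llapack -ltriangle -lm -lz -lcudart -lcufft -lcublas "
    "-lcusparse -lcusolver -lcurand -lstdc++ -ldl -o ex3k"
)

if __name__ == "__main__":
    print("MPI library on link line:", has_mpi_library(FAILING_TAIL))
```

  If no MPI library shows up, that would be consistent with the "DSO missing from command line" error: ex3k.kokkos.o references ompi_mpi_comm_self directly, so the library that defines it has to be named explicitly on the link line.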

  Barry
 

> On Jun 5, 2021, at 1:18 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> Ah, there was an old ex3k there. I removed it and now I cannot compile ex3k. I thought check always rebuilt executables. Now I get a link error:
> 
> 14:12 barry/2020-11-11/cleanup-matsetvaluesdevice *= /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials$ make PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc PETSC_ARCH=arch-summit-opt-gnu-kokkos-notpl-cuda10 ex3k
> PATH=/sw/sources/lsf-tools/2.0/summit/bin:/sw/summit/xalt/1.2.1/bin:/sw/summit/forge/20.0.1/bin:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/bin:/sw/summit/gcc/6.4.0/bin:/sw/summit/cuda/10.1.243/bin:/autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.20.2-24ualfzy6em6ws5sbiu7rlgcuionodrm/bin:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/darshan-runtime-3.1.7-cnvxicgf5j4ap64qi6v5gxp67hmrjz43/bin:/sw/sources/hpss/bin:/opt/ibm/spectrumcomputing/lsf/10.1.0.9/linux3.10-glibc2.17-ppc64le-csm/etc:/opt/ibm/spectrumcomputing/lsf/10.1.0.9/linux3.10-glibc2.17-ppc64le-csm/bin:/opt/ibm/csm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibm/flightlog/bin:/opt/ibutils/bin:/opt/ibm/spectrum_mpi/jsm_pmix/bin:/opt/puppetlabs/bin:/usr/lpp/mmfs/bin:`dirname nvcc` NVCC_WRAPPER_DEFAULT_COMPILER=gcc /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-notpl-cuda10/bin/nvcc_wrapper --expt-extended-lambda -Xcompiler -rdynamic -lineinfo -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DPETSC_HAVE_CUDA_ATOMIC -DLANDAU_MAX_Q=4 -Xcompiler -fPIC -O3  -gencode arch=compute_70,code=sm_70  -I/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/include -Wno-deprecated-gpu-targets  -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-notpl-cuda10/include -I/sw/summit/cuda/10.1.243/include -I/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/include   -fPIC -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DPETSC_HAVE_CUDA_ATOMIC -DLANDAU_MAX_Q=4 -Werror=maybe-uninitialized -O0   ex3k.kokkos.cxx  -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-notpl-cuda10/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-notpl-cuda10/lib -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-notpl-cuda10/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-notpl-cuda10/lib -L/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/netlib-lapack-3.8.0-wcabdyqhdi5rooxbkqa6x5d7hxyxwdkm/lib64 -Wl,-rpath,/sw/summit/cuda/10.1.243/lib64 -L/sw/summit/cuda/10.1.243/lib64 -lpetsc -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lblas -llapack -ltriangle -lm -lz -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lstdc++ -ldl -o ex3k
> nvcc_wrapper - *warning* you have set multiple optimization flags (-O*), only the last is used because nvcc can only accept a single optimization setting.
> /usr/bin/ld: /tmp/tmpxft_0000e0c2_00000000-10_ex3k.kokkos.o: undefined reference to symbol 'ompi_mpi_comm_self'
> /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libmpi_ibm.so.3: error adding symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
> make: *** [ex3k] Error 1
> 
> On Sat, Jun 5, 2021 at 12:16 PM Junchao Zhang <junchao.zhang at gmail.com> wrote:
> $ rm ex3k
> $ make ex3k
> and run again?
> 
> --Junchao Zhang
> 
> 
> On Sat, Jun 5, 2021 at 10:25 AM Mark Adams <mfadams at lbl.gov> wrote:
> This is also posted in Barry's MR: I get this error with Kokkos-CUDA on Summit. It fails to open a shared library.
> Thoughts?
> Mark
> 
> 11:15 barry/2020-11-11/cleanup-matsetvaluesdevice= /gpfs/alpine/csc314/scratch/adams/petsc$ make PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc PETSC_ARCH=arch-summit-opt-gnu-kokkos-notpl-cuda10 check
> Running check examples to verify correct installation
> Using PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc and PETSC_ARCH=arch-summit-opt-gnu-kokkos-notpl-cuda10
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> C/C++ example src/snes/tutorials/ex19 run successfully with cuda
> gmake[3]: [runex3k_kokkos] Error 127 (ignored)
> 1,25c1,2
> < atol=1e-50, rtol=1e-08, stol=1e-08, maxit=50, maxf=10000
> < Vec Object: Exact Solution 2 MPI processes
> <   type: mpikokkos
> < Process [0]
> < 0.
> < 0.015625
> < 0.125
> < Process [1]
> < 0.421875
> < 1.
> < Vec Object: Forcing function 2 MPI processes
> <   type: mpikokkos
> < Process [0]
> < 1e-72
> < 1.50024
> < 3.01563
> < Process [1]
> < 4.67798
> < 7.
> <   0 SNES Function norm 5.414682427127e+00
> <   1 SNES Function norm 2.952582418265e-01
> <   2 SNES Function norm 4.502293658739e-04
> <   3 SNES Function norm 1.389665806646e-09
> < Number of SNES iterations = 3
> < Norm of error 1.49752e-10 Iterations 3
> ---
> > ./ex3k: error while loading shared libraries: libpetsc.so.3.015: cannot open shared object file: No such file or directory
> > ./ex3k: error while loading shared libraries: libpetsc.so.3.015: cannot open shared object file: No such file or directory
> /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials
