<div dir="ltr"><div>Ah, OK 'check' will test SuperLU. Semi worked:</div><div><br></div>s20:13 mark/feature-xgc-interface-rebase *= ~/petsc$ make PETSC_DIR=/ccs/home/adams/petsc PETSC_ARCH=arch-summit-dbg-gnu-cuda-omp check<br>Running check examples to verify correct installation<br>Using PETSC_DIR=/ccs/home/adams/petsc and PETSC_ARCH=arch-summit-dbg-gnu-cuda-omp<br>C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process<br>C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes<br>2c2,38<br>< Number of SNES iterations = 2<br>---<br>> CUDA version: v 10010<br>> CUDA Devices: <br>> <br>> 0 : Tesla V100-SXM2-16GB 7 0<br>> Global memory: 16128 mb <br>> Shared memory: 48 kb <br>> Constant memory: 64 kb <br>> Block registers: 65536 <br>> <br>> ex19: cudahook.cc:762: CUresult host_free_callback(void*): Assertion `cacheNode != __null' failed.<br>> [h16n07:78357] *** Process received signal ***<br>> [h16n07:78357] Signal: Aborted (6)<br>> [h16n07:78357] Signal code: (1704218624)<br>> [h16n07:78357] [ 0] [0x2000000504d8]<br>> [h16n07:78357] [ 1] /lib64/libc.so.6(abort+0x2b4)[0x200023992094]<br>> [h16n07:78357] [ 2] /lib64/libc.so.6(+0x356d4)[0x2000239856d4]<br>> [h16n07:78357] [ 3] /lib64/libc.so.6(__assert_fail+0x64)[0x2000239857c4]<br>> [h16n07:78357] [ 4] /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/container/../lib/libpami_cudahook.so(_Z18host_free_callbackPv+0x2d8)[0x2000000cd2c8]<br>> [h16n07:78357] [ 5] /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/container/../lib/libpami_cudahook.so(cuMemFreeHost+0xb0)[0x2000000c3cc0]<br>> [h16n07:78357] [ 6] /sw/summit/cuda/10.1.243/lib64/libcudart.so.10.1(+0x42f50)[0x200010aa2f50]<br>> [h16n07:78357] [ 7] 
/sw/summit/cuda/10.1.243/lib64/libcudart.so.10.1(+0x11db8)[0x200010a71db8]<br>> [h16n07:78357] [ 8] /sw/summit/cuda/10.1.243/lib64/libcudart.so.10.1(cudaFreeHost+0x74)[0x200010ab2ea4]<br>> [h16n07:78357] [ 9] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libsuperlu_dist.so.6(dDestroy_LU+0x150)[0x200003188058]<br>> [h16n07:78357] [10] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(+0x12ebc6c)[0x2000013dbc6c]<br>> [h16n07:78357] [11] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(MatLUFactorNumeric+0x934)[0x200000d2fae4]<br>> [h16n07:78357] [12] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(+0x1cca7a4)[0x200001dba7a4]<br>> [h16n07:78357] [13] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(PCSetUp+0xde0)[0x200001f3f990]<br>> [h16n07:78357] [14] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(KSPSetUp+0x1848)[0x200001fc5594]<br>> [h16n07:78357] [15] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(+0x1ed9908)[0x200001fc9908]<br>> [h16n07:78357] [16] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(KSPSolve+0x5d0)[0x200001fcc690]<br>> [h16n07:78357] [17] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(+0x21e16ac)[0x2000022d16ac]<br>> [h16n07:78357] [18] /ccs/home/adams/petsc/arch-summit-dbg-gnu-cuda-omp/lib/libpetsc.so.3.013(SNESSolve+0x23f4)[0x2000022255c0]<br>> [h16n07:78357] [19] ./ex19[0x10002ac8]<br>> [h16n07:78357] [20] /lib64/libc.so.6(+0x25200)[0x200023975200]<br>> [h16n07:78357] [21] /lib64/libc.so.6(__libc_start_main+0xc4)[0x2000239753f4]<br>> [h16n07:78357] *** End of error message ***<br>> ERROR: One or more process (first noticed rank 0) terminated with signal 6<br>/ccs/home/adams/petsc/src/snes/tutorials<br>Possible problem with ex19 running with superlu_dist, diffs above<br>=========================================<br></div><br><div 
class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 15, 2020 at 5:58 PM Satish Balay <<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Please send configure.log<br>
<br>
This is what I get on my linux build:<br>
<br>
[balay@p1 petsc]$ ./configure --with-mpi-dir=/home/petsc/soft/openmpi-4.0.2-cuda --with-cuda=1 --with-openmp=1 --download-superlu-dist=1 && make && make check<br>
<snip><br>
Running check examples to verify correct installation<br>
Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug<br>
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process<br>
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes<br>
1a2,19<br>
> CUDA version: v 10020<br>
> CUDA Devices: <br>
> <br>
> 0 : Quadro T2000 7 5<br>
> Global memory: 3911 mb <br>
> Shared memory: 48 kb <br>
> Constant memory: 64 kb <br>
> Block registers: 65536 <br>
> <br>
> CUDA version: v 10020<br>
> CUDA Devices: <br>
> <br>
> 0 : Quadro T2000 7 5<br>
> Global memory: 3911 mb <br>
> Shared memory: 48 kb <br>
> Constant memory: 64 kb <br>
> Block registers: 65536 <br>
> <br>
/home/balay/petsc/src/snes/tutorials<br>
Possible problem with ex19 running with superlu_dist, diffs above<br>
=========================================<br>
Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process<br>
Completed test examples<br>
<br>
<br>
On Wed, 15 Apr 2020, Mark Adams wrote:<br>
<br>
> On Wed, Apr 15, 2020 at 5:17 PM Satish Balay <<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>> wrote:<br>
> <br>
> > The build should work. It should give some verbose info [at runtime]<br>
> > regarding GPUs - from the following code.<br>
> ><br>
> ><br>
> I don't see that and I am running GPUs in my code and have gotten cusparse<br>
> LU to run. Should I use '-info :sys:' ?<br>
> <br>
> <br>
> > >>>>> SRC/cublas_utils.c >>>>>>>>>>><br>
> > void DisplayHeader()<br>
> > {<br>
> > const int kb = 1024;<br>
> > const int mb = kb * kb;<br>
> > // cout << "NBody.GPU" << endl << "=========" << endl << endl;<br>
> ><br>
> > printf("CUDA version: v %d\n",CUDART_VERSION);<br>
> > //cout << "Thrust version: v" << THRUST_MAJOR_VERSION << "." << THRUST_MINOR_VERSION << endl << endl;<br>
> ><br>
> > int devCount;<br>
> > cudaGetDeviceCount(&devCount);<br>
> > printf( "CUDA Devices: \n \n");<br>
> > <snip><br>
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<br>
> ><br>
> > Satish<br>
> ><br>
> > On Wed, 15 Apr 2020, Junchao Zhang wrote:<br>
> ><br>
> > > I remember Barry said superlu gpu support is broken.<br>
> > > --Junchao Zhang<br>
> > ><br>
> > ><br>
> > > On Wed, Apr 15, 2020 at 3:47 PM Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> wrote:<br>
> > ><br>
> > > > How does one use SuperLU with GPUs? I don't seem to get any GPU<br>
> > > > performance data so I assume GPUs are not getting turned on. Am I wrong<br>
> > > > about that?<br>
> > > ><br>
> > > > I configure with:<br>
> > > > configure options: --with-fc=0 --COPTFLAGS="-g -O2 -fPIC -fopenmp"<br>
> > > > --CXXOPTFLAGS="-g -O2 -fPIC -fopenmp" --FOPTFLAGS="-g -O2 -fPIC -fopenmp"<br>
> > > > --CUDAOPTFLAGS="-O2 -g" --with-ssl=0 --with-batch=0 --with-cxx=mpicxx<br>
> > > > --with-mpiexec="jsrun -g1" --with-cuda=1 --with-cudac=nvcc<br>
> > > > --download-p4est=1 --download-zlib --download-hdf5=1 --download-metis<br>
> > > > --download-superlu --download-superlu_dist --with-make-np=16<br>
> > > > --download-parmetis --download-triangle<br>
> > > > --with-blaslapack-lib="-L/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/netlib-lapack-3.8.0-wcabdyqhdi5rooxbkqa6x5d7hxyxwdkm/lib64<br>
> > > > -lblas -llapack" --with-cc=mpicc --with-shared-libraries=1 --with-x=0<br>
> > > > --with-64-bit-indices=0 --with-debugging=0<br>
> > > > PETSC_ARCH=arch-summit-opt-gnu-cuda-omp --with-openmp=1<br>
> > > > --with-threadsaftey=1 --with-log=1<br>
> > > ><br>
> > > > Thanks,<br>
> > > > Mark<br>
> > > ><br>
> > ><br>
> ><br>
> ><br>
> <br>
<br>
</blockquote></div>
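<div>For anyone comparing the two 'make check' diffs in this thread: the Linux diff appears to contain only the GPU banner printed by DisplayHeader(), while the Summit diff also contains an assertion failure and a signal trace. A minimal sketch of telling the two cases apart is below. The helpers is_banner_line() and count_unexpected() are hypothetical names made up for illustration and are not part of PETSc or SuperLU_DIST; the banner prefixes are taken from the output quoted above and are assumed to be stable.</div>

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not PETSc or SuperLU_DIST API): returns true when a
 * diff line is an added line ("> ...") that looks like part of the GPU
 * banner shown in this thread. */
bool is_banner_line(const char *line)
{
    static const char *const prefixes[] = {
        "CUDA version:",  "CUDA Devices:",    "Global memory:",
        "Shared memory:", "Constant memory:", "Block registers:",
    };

    assert(line != NULL);

    /* Only lines added relative to the reference output can be banner lines. */
    if (line[0] != '>')
        return false;
    line++;
    while (*line == ' ' || *line == '\t')
        line++;

    /* The banner contains blank separator lines. */
    if (*line == '\0' || *line == '\n' || *line == '\r')
        return true;

    for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        if (strncmp(line, prefixes[i], strlen(prefixes[i])) == 0)
            return true;
    }

    /* Device line, e.g. "0 : Quadro T2000 7 5": an index followed by " : ". */
    if (isdigit((unsigned char)*line)) {
        while (isdigit((unsigned char)*line))
            line++;
        if (strncmp(line, " : ", 3) == 0)
            return true;
    }
    return false;
}

/* Counts changed diff lines ("<" or ">") that are not banner lines.
 * Hunk headers such as "2c2,38" and "---" separators are ignored. */
size_t count_unexpected(const char *const lines[], size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        const char *l = lines[i];
        bool is_change = (l[0] == '>' || l[0] == '<');
        if (is_change && !is_banner_line(l))
            count++;
    }
    return count;
}
```

<div>Under these assumptions, a count of zero suggests the diff is only the banner, and a nonzero count suggests a genuine difference such as the missing "Number of SNES iterations" line or the abort trace in the Summit run.</div>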