[petsc-users] GPUs, cuda, complex

Randall Mackie rlmackie862 at gmail.com
Fri Feb 22 16:48:28 CST 2019


Below is more systematic information about my attempts to run PETSc in complex mode on a GPU. So far I have been unsuccessful, and any pointers/suggestions are appreciated. Next week I will try another machine with a different GPU.

In the meantime, here is what I get:


We have a machine with one GPU:

GPU 0: GeForce GTX 1050 Ti 

We have installed the latest version of CUDA:

Cuda compilation tools, release 10.0, V10.0.130


I have compiled a gfortran version of PETSc with complex scalars:

./configure \
  --with-clean=1 \
  --with-scalar-type=complex \
  --with-debugging=1 \
  --with-fortran=1 \
  --with-cuda=1 \
  --with-cudac=/usr/local/cuda-10.0/bin/nvcc \
  --download-mpich=./mpich-3.3b1.tar.gz \
  --download-fblaslapack=fblaslapack-3.4.2.tar.gz \
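
(For reference, the only difference in the build used for the snes ex19 test near the end of this message is that I also passed the card's compute capability explicitly, since the GTX 1050 Ti is sm_61:

./configure \
  --with-clean=1 \
  --with-scalar-type=complex \
  --with-debugging=1 \
  --with-fortran=1 \
  --with-cuda=1 \
  --CUDAFLAGS=-arch=sm_61 \
  --with-cudac=/usr/local/cuda-10.0/bin/nvcc \
  --download-mpich=./mpich-3.3b1.tar.gz \
  --download-fblaslapack=fblaslapack-3.4.2.tar.gz

Both builds show the same kind of failures described below.)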


To test, I compiled ksp/ksp/examples/tests/ex32.c with a real version of PETSc.
Running on the CPU with the options below, I get the following output:

mpirun -np 1 ./ex32 \
  -dm_mat_type sbaij \
  -pc_type jacobi \
  -ksp_monitor_short

# ./cmd_ex32
  0 KSP Residual norm 0.076964 
  1 KSP Residual norm 0.0461195 
  2 KSP Residual norm 0.0281244 
  3 KSP Residual norm 0.0182566 
  4 KSP Residual norm 0.00902356 
  5 KSP Residual norm 0.00360423 
  6 KSP Residual norm 0.0010161 
  7 KSP Residual norm 0.000379953 
  8 KSP Residual norm 0.000168001 
  9 KSP Residual norm 0.000100787 
 10 KSP Residual norm 6.30212e-05 
 11 KSP Residual norm 3.74954e-05 
 12 KSP Residual norm 1.30554e-05 
 13 KSP Residual norm 3.7475e-06 
 14 KSP Residual norm 1.7272e-06 
 15 KSP Residual norm 6.5817e-07 


When I run on the GPU, I get the following:

${PETSC_DIR}/lib/petsc/bin/petscmpiexec -n 1 ./ex32_real \
  -dm_mat_type aijcusparse \
  -dm_vec_type cuda \
  -pc_type jacobi \
  -ksp_monitor_short \


# ./cmd_ex32_cuda 
  0 KSP Residual norm 0.076964 
  1 KSP Residual norm 0.0456494 
  2 KSP Residual norm 0.0277184 
  3 KSP Residual norm 0.0156403 
  4 KSP Residual norm 0.0077786 
  5 KSP Residual norm 0.00211375 
  6 KSP Residual norm 0.000615725 
  7 KSP Residual norm 0.000169699 
  8 KSP Residual norm 6.17985e-05 
  9 KSP Residual norm 1.57243e-05 
 10 KSP Residual norm 8.98356e-07 
 11 KSP Residual norm 1.51097e-07 
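
(One check I still plan to do on this real CUDA run, to confirm the GPU types are actually being used, is to repeat it with -ksp_view and -log_view added, e.g.:

${PETSC_DIR}/lib/petsc/bin/petscmpiexec -n 1 ./ex32_real \
  -dm_mat_type aijcusparse \
  -dm_vec_type cuda \
  -pc_type jacobi \
  -ksp_view \
  -log_view

If I understand correctly, -ksp_view should report the matrix type as seqaijcusparse when the GPU path is taken; I have not run this yet, so treat it as a planned check rather than a result.)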

When run with the complex debug version I get this:

# ./cmd_ex32_cmplx 
  0 KSP Residual norm < 1.e-11

(obviously this is not correct, even if it does not bomb out).
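
If it helps narrow things down, my next step is a stripped-down test that exercises only the complex CUDA vector path (VecSet/VecAYPX/VecNorm on a VECCUDA vector), outside the DM/KSP machinery. This is only a sketch I put together and have not yet compiled or run, so the details may need fixing:

#include <petscvec.h>

int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  Vec            x,y;
  PetscReal      norm;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
  ierr = VecSetSizes(x,PETSC_DECIDE,100);CHKERRQ(ierr);
  ierr = VecSetType(x,VECCUDA);CHKERRQ(ierr);            /* GPU vector */
  ierr = VecDuplicate(x,&y);CHKERRQ(ierr);
  ierr = VecSet(x,1.0+2.0*PETSC_i);CHKERRQ(ierr);        /* complex entries */
  ierr = VecSet(y,1.0);CHKERRQ(ierr);
  ierr = VecAYPX(y,2.0,x);CHKERRQ(ierr);                 /* y = x + 2*y, the op that faults in ex39 below */
  ierr = VecNorm(y,NORM_2,&norm);CHKERRQ(ierr);          /* each entry should be 3+2i, so ||y|| ~ sqrt(1300) ~ 36.06 */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"||y|| = %g\n",(double)norm);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

I would build it against the complex PETSC_ARCH and run it on one process; if even this segfaults or gives a wrong norm, the problem is in the basic complex CUDA vector operations rather than in the examples.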

Running this example under valgrind gives tons of unhelpful messages like this:

0 KSP Residual norm < 1.e-11
==3767== 4 bytes in 1 blocks are possibly lost in loss record 9 of 1,618
==3767==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==3767==    by 0x1A6EE927: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A7A0BE4: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A7ACD4F: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A7AD94A: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A81F045: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A81F301: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A81FAD7: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A82033D: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A69E8EB: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A6A092E: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A5D530B: ??? (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x1A70C5AA: cuDevicePrimaryCtxRetain (in /usr/lib64/libcuda.so.410.48)
==3767==    by 0x12376D2F: ??? (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767==    by 0x12376E9B: ??? (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767==    by 0x1237787E: ??? (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767==    by 0x123782E7: ??? (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767==    by 0x1236B43D: ??? (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767==    by 0x1235ADE7: ??? (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767==    by 0x1238C23B: cudaMalloc (in /usr/local/cuda-10.0/lib64/libcudart.so.10.0.130)
==3767== 
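
(Almost all of the valgrind output is "possibly lost" blocks allocated inside libcuda itself, like the one above. My plan is to hide those with a suppressions file so that any PETSc-side errors stand out; something along these lines, which I have not tested yet (the file name is just my placeholder):

{
   cuda-driver-possibly-lost
   Memcheck:Leak
   match-leak-kinds: possible
   fun:malloc
   ...
   obj:*libcuda.so*
}

and then pass --suppressions=cuda.supp to valgrind.)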



When I try ex39.c in the same directory (ksp/ksp/examples/tests), under valgrind, I get this:

# ./cmd_ex39
==3684== Warning: noted but unhandled ioctl 0x30000001 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x27 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x7ff with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x25 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x37 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x17 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x19 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x21 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x1b with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Warning: noted but unhandled ioctl 0x42 with no size/direction hints.
==3684==    This could cause spurious value errors to appear.
==3684==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3684== Invalid read of size 8
==3684==    at 0x50F7C6C: VecCUDAGetArrayRead (veccuda2.cu:1284)
==3684==    by 0x50EB8C3: VecAYPX_SeqCUDA (veccuda2.cu:188)
==3684==    by 0x52F741C: VecAYPX (rvector.c:750)
==3684==    by 0x6701CF5: KSPBuildResidualDefault (iterativ.c:891)
==3684==    by 0x66EA04E: KSPBuildResidual (itfunc.c:2142)
==3684==    by 0x66FB812: KSPMonitorTrueResidualNorm (iterativ.c:259)
==3684==    by 0x66E7272: KSPMonitor (itfunc.c:1716)
==3684==    by 0x6673AB4: KSPSolve_BCGS (bcgs.c:64)
==3684==    by 0x66DCA48: KSPSolve (itfunc.c:780)
==3684==    by 0x402601: main (ex39.c:111)
==3684==  Address 0xa0 is not stack'd, malloc'd or (recently) free'd
==3684== 
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function
[0]PETSC ERROR:       is given.
[0]PETSC ERROR: [0] VecCUDAGetArrayRead line 1283 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
[0]PETSC ERROR: [0] VecAYPX_SeqCUDA line 185 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
[0]PETSC ERROR: [0] VecAYPX line 739 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: [0] KSPBuildResidualDefault line 886 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/iterativ.c
[0]PETSC ERROR: [0] KSPBuildResidual line 2132 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: [0] KSPMonitorTrueResidualNorm line 252 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/iterativ.c
[0]PETSC ERROR: [0] KSPMonitor line 1714 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: [0] KSPSolve_BCGS line 33 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/impls/bcgs/bcgs.c
[0]PETSC ERROR: [0] KSPSolve line 678 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Signal received
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 
[0]PETSC ERROR: ./ex39_cmplx on a linux-gfortran-complex-debug named GPU by root Fri Feb 22 14:43:35 2019
[0]PETSC ERROR: Configure options --with-clean=1 --with-scalar-type=complex --with-debugging=1 --with-fortran=1 --with-cuda=1 --with-cudac=/usr/local/cuda-10.0/bin/nvcc --download-mpich=./mpich-3.3b1.tar.gz --download-fblaslapack=fblaslapack-3.4.2.tar.gz
[0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
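
One more thing I intend to try on ex39 before moving on (not done yet, so this is just the plan) is running it under cuda-memcheck, and also using the -start_in_debugger option suggested in the error message to get the exact failing line. Roughly, for a single process (the exact invocation may need adjusting since I normally launch through petscmpiexec):

cuda-memcheck ./ex39_cmplx \
  -ksp_type bcgs \
  -ksp_rtol 1.e-6 \
  -pc_type jacobi \
  -ksp_monitor_true_residual \
  -mat_type aijcusparse \
  -vec_type cuda \
  -n1 32 -n2 32 -n3 32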


I also tried another example, snes/examples/tutorials/ex19.c, which gives similar messages:

# ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type none -ksp_type fgmres -snes_monitor_short -snes_rtol 1.e-5
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
  0 SNES Function norm 0.239155 
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function
[0]PETSC ERROR:       is given.
[0]PETSC ERROR: [0] VecCUDAGetArrayRead line 1283 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
[0]PETSC ERROR: [0] VecAXPY_SeqCUDA line 215 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
[0]PETSC ERROR: [0] VecAXPY line 597 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: [0] MatFDColoringApply_AIJ line 176 /home/everderio/DEV/petsc-3.10.3/src/mat/impls/aij/mpi/fdmpiaij.c
[0]PETSC ERROR: [0] MatFDColoringApply line 614 /home/everderio/DEV/petsc-3.10.3/src/mat/matfd/fdmatrix.c
[0]PETSC ERROR: [0] SNESComputeJacobian_DMDA line 153 /home/everderio/DEV/petsc-3.10.3/src/snes/utils/dmdasnes.c
[0]PETSC ERROR: [0] SNES user Jacobian function line 2555 /home/everderio/DEV/petsc-3.10.3/src/snes/interface/snes.c
[0]PETSC ERROR: [0] SNESComputeJacobian line 2514 /home/everderio/DEV/petsc-3.10.3/src/snes/interface/snes.c
[0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 144 /home/everderio/DEV/petsc-3.10.3/src/snes/impls/ls/ls.c
[0]PETSC ERROR: [0] SNESSolve line 4282 /home/everderio/DEV/petsc-3.10.3/src/snes/interface/snes.c
[0]PETSC ERROR: [0] main line 108 /home/everderio/DEV/petsc-3.10.3/src/snes/examples/tutorials/ex19.c
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Signal received
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 
[0]PETSC ERROR: ./ex19 on a linux-gfortran-complex-debug named GPU by root Fri Feb 22 17:24:34 2019
[0]PETSC ERROR: Configure options --with-clean=1 --with-scalar-type=complex --with-debugging=1 --with-fortran=1 --with-cuda=1 --CUDAFLAGS=-arch=sm_61 --with-cudac=/usr/local/cuda-10.0/bin/nvcc --download-mpich=./mpich-3.3b1.tar.gz --download-fblaslapack=fblaslapack-3.4.2.tar.gz
[0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=59
:
system msg for write_line failure : Bad file descriptor


Thanks for any help or suggestions about getting complex to work with GPUs.

Randy M.


> On Feb 21, 2019, at 9:53 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> 
> 
>   Hmm, ex32 suddenly becomes ex39 (and there is no ex39 in the src/ksp/ksp/examples/tutorials/ directory?). I tried ex32 with those options and it runs, though the -n1/-n2/-n3 options aren't used.
> 
>   Barry
> 
> 
>> On Feb 21, 2019, at 6:20 PM, Randall Mackie <rlmackie862 at gmail.com> wrote:
>> 
>> Hi Barry and Satish,
>> 
>> Yes, sorry, I meant -dm_mat_type aijcusparse…
>> 
>> Here is an attempt to run ex39 under complex:
>> 
>> [0]PETSC ERROR: ------------------------------------------------------------------------
>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> [0]PETSC ERROR: likely location of problem given in stack below
>> [0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
>> [0]PETSC ERROR:       is given.
>> [0]PETSC ERROR: [0] VecCUDAGetArrayRead line 1283 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>> [0]PETSC ERROR: [0] VecAYPX_SeqCUDA line 185 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>> [0]PETSC ERROR: [0] VecAYPX line 739 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/interface/rvector.c
>> [0]PETSC ERROR: [0] KSPBuildResidualDefault line 886 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/iterativ.c
>> [0]PETSC ERROR: [0] KSPBuildResidual line 2132 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
>> [0]PETSC ERROR: [0] KSPMonitorTrueResidualNorm line 252 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/iterativ.c
>> [0]PETSC ERROR: [0] KSPMonitor line 1714 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
>> [0]PETSC ERROR: [0] KSPSolve_BCGS line 33 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/impls/bcgs/bcgs.c
>> [0]PETSC ERROR: [0] KSPSolve line 678 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Signal received
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 
>> [0]PETSC ERROR: ./ex39_cmplx on a linux-gfortran-complex-debug named GPU by root Thu Feb 21 19:03:37 2019
>> [0]PETSC ERROR: Configure options --with-clean=1 --with-scalar-type=complex --with-debugging=1 --with-fortran=1 --with-cuda=1 --with-cudac=/usr/local/cuda-10.0/bin/nvcc --download-mpich=./mpich-3.3b1.tar.gz --download-fblaslapack=fblaslapack-3.4.2.tar.gz
>> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> 
>> 
>> I used these options:
>> 
>> #!/bin/bash
>> 
>> export PETSC_ARCH=linux-gfortran-complex-debug
>> 
>> ${PETSC_DIR}/lib/petsc/bin/petscmpiexec -n 1 ./ex39_cmplx \
>> -ksp_type bcgs \
>> -ksp_rtol 1.e-6 \
>> -pc_type jacobi \
>> -ksp_monitor_true_residual \
>> -ksp_converged_reason \
>> -mat_type aijcusparse \
>> -vec_type cuda \
>> -n1 32 \
>> -n2 32 \
>> -n3 32 \
>> 
>> 
>> My next step was going to try valgrind and see if that turned something up.
>> 
>> Thanks, Randy
>> 
>> 
>>> On Feb 21, 2019, at 2:51 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>> 
>>> 
>>> Randy,
>>> 
>>>  Could you please cut and paste the entire error message you get. It worked for me. 
>>> 
>>>  I assume you mean -dm_mat_type aijcusparse  not aijcuda (which doesn't exist).
>>> 
>>> Satish,
>>> 
>>>    It does appear we do not have a nightly test for cuda and complex; could that test be added to the nightly sweeps?
>>> 
>>> Thanks
>>> 
>>> Barry
>>> 
>>> 
>>>> On Feb 14, 2019, at 7:33 PM, Randall Mackie via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>> 
>>>> We are testing whether or not we can benefit from porting our PETSc code to use GPUS.
>>>> We have installed PETSc following the instructions here:
>>>> https://www.mcs.anl.gov/petsc/features/gpus.html
>>>> 
>>>> Using KSP example 32 (ex32.c) to test, and using -dm_vec_type cuda and -dm_mat_type aijcuda 
>>>> then ex32 runs fine when compiled with a REAL version of PETSc but bombs out when using a COMPLEX version.
>>>> 
>>>> Is it possible to run PETSc on GPUs in complex mode?
>>>> 
>>>> 
>>>> Thanks, Randy M.
>>> 
>> 
> 


