[petsc-users] GPUs, cuda, complex

Smith, Barry F. bsmith at mcs.anl.gov
Fri Feb 22 22:53:12 CST 2019


  I am getting some crashes on our machine frog with complex CUDA on those examples, though maybe not at the exact same line numbers.

  We need more help from our CUDA experts.

   Barry


> On Feb 22, 2019, at 8:33 AM, Randall Mackie <rlmackie862 at gmail.com> wrote:
> 
> Sorry, I’ve tried both ex32 and ex39 in the src/ksp/ksp/examples/tests directory, and both give the same error.
> I’ll try with valgrind or try another computer with a different GPU.
> 
> Thanks for confirming that complex *should* work on GPUs.
> 
> Randy M.
> 
> 
> 
>> On Feb 21, 2019, at 9:53 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>> 
>> 
>>  Hmm, ex32 suddenly becomes ex39 (and there is no ex39 in the src/ksp/ksp/examples/tutorials/ directory?). I tried ex32 with those options and it runs, though the -n1/-n2/-n3 options aren't used.
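>> 
>>  If it is useful, running with -options_left shows which options the example actually consumes; a possible invocation (purely illustrative, reusing the solver options from your script) would be:
>> 
>>     ./ex32 -ksp_type bcgs -pc_type jacobi -dm_mat_type aijcusparse -dm_vec_type cuda -n1 32 -options_left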
>> 
>>  Barry
>> 
>> 
>>> On Feb 21, 2019, at 6:20 PM, Randall Mackie <rlmackie862 at gmail.com> wrote:
>>> 
>>> Hi Barry and Satish,
>>> 
>>> Yes, sorry, I meant -dm_mat_type aijcusparse…
>>> 
>>> Here is an attempt to run ex39 under complex:
>>> 
>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>> [0]PETSC ERROR: likely location of problem given in stack below
>>> [0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
>>> [0]PETSC ERROR:       is given.
>>> [0]PETSC ERROR: [0] VecCUDAGetArrayRead line 1283 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>> [0]PETSC ERROR: [0] VecAYPX_SeqCUDA line 185 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>> [0]PETSC ERROR: [0] VecAYPX line 739 /home/everderio/DEV/petsc-3.10.3/src/vec/vec/interface/rvector.c
>>> [0]PETSC ERROR: [0] KSPBuildResidualDefault line 886 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/iterativ.c
>>> [0]PETSC ERROR: [0] KSPBuildResidual line 2132 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: [0] KSPMonitorTrueResidualNorm line 252 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/iterativ.c
>>> [0]PETSC ERROR: [0] KSPMonitor line 1714 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: [0] KSPSolve_BCGS line 33 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/impls/bcgs/bcgs.c
>>> [0]PETSC ERROR: [0] KSPSolve line 678 /home/everderio/DEV/petsc-3.10.3/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>> [0]PETSC ERROR: Signal received
>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 
>>> [0]PETSC ERROR: ./ex39_cmplx on a linux-gfortran-complex-debug named GPU by root Thu Feb 21 19:03:37 2019
>>> [0]PETSC ERROR: Configure options --with-clean=1 --with-scalar-type=complex --with-debugging=1 --with-fortran=1 --with-cuda=1 --with-cudac=/usr/local/cuda-10.0/bin/nvcc --download-mpich=./mpich-3.3b1.tar.gz --download-fblaslapack=fblaslapack-3.4.2.tar.gz
>>> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>> 
>>> 
>>> I used these options:
>>> 
>>> #!/bin/bash
>>> 
>>> export PETSC_ARCH=linux-gfortran-complex-debug
>>> 
>>> ${PETSC_DIR}/lib/petsc/bin/petscmpiexec -n 1 ./ex39_cmplx \
>>> -ksp_type bcgs \
>>> -ksp_rtol 1.e-6 \
>>> -pc_type jacobi \
>>> -ksp_monitor_true_residual \
>>> -ksp_converged_reason \
>>> -mat_type aijcusparse \
>>> -vec_type cuda \
>>> -n1 32 \
>>> -n2 32 \
>>> -n3 32
>>> 
>>> 
>>> My next step was going to try valgrind and see if that turned something up.
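>>> 
>>> A valgrind run along the lines the PETSc FAQ suggests might look like the sketch below (the memcheck flags are the generic ones from the FAQ, not anything specific to this example, and mpiexec is assumed to be the one installed by --download-mpich):
>>> 
>>> ${PETSC_DIR}/${PETSC_ARCH}/bin/mpiexec -n 1 \
>>>   valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p \
>>>   ./ex39_cmplx -ksp_type bcgs -pc_type jacobi -mat_type aijcusparse -vec_type cuda
>>> 
>>> Note that the CUDA driver itself tends to produce a fair amount of valgrind noise, so errors pointing into PETSc frames are the ones worth looking at.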
>>> 
>>> Thanks, Randy
>>> 
>>> 
>>>> On Feb 21, 2019, at 2:51 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>>> 
>>>> 
>>>> Randy,
>>>> 
>>>> Could you please cut and paste the entire error message you get. It worked for me. 
>>>> 
>>>> I assume you mean -dm_mat_type aijcusparse, not aijcuda (which doesn't exist).
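>>>> 
>>>> That is, something along these lines (the monitor option is just for illustration):
>>>> 
>>>>     ./ex32 -dm_vec_type cuda -dm_mat_type aijcusparse -ksp_monitor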
>>>> 
>>>> Satish,
>>>> 
>>>>   It does appear we do not have a nightly test for CUDA and complex; could that test be added to the nightly sweeps?
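>>>> 
>>>>   A minimal sketch of what such a test could look like in an example's test block (the suffix and args here are only illustrative, and assume the example builds with complex):
>>>> 
>>>>   /*TEST
>>>>      test:
>>>>         suffix: cuda_complex
>>>>         requires: cuda complex
>>>>         args: -dm_vec_type cuda -dm_mat_type aijcusparse
>>>>   TEST*/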
>>>> 
>>>> Thanks
>>>> 
>>>> Barry
>>>> 
>>>> 
>>>>> On Feb 14, 2019, at 7:33 PM, Randall Mackie via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>> 
>>>>> We are testing whether or not we can benefit from porting our PETSc code to use GPUs.
>>>>> We have installed PETSc following the instructions here:
>>>>> https://www.mcs.anl.gov/petsc/features/gpus.html
>>>>> 
>>>>> Using KSP example 32 (ex32.c) to test, with -dm_vec_type cuda and -dm_mat_type aijcuda,
>>>>> ex32 runs fine when compiled with a REAL version of PETSc but bombs out when using a COMPLEX version.
>>>>> 
>>>>> Is it possible to run PETSc on GPUs in complex mode?
>>>>> 
>>>>> 
>>>>> Thanks, Randy M.
>>>> 
>>> 
>> 
> 


