[petsc-users] Tweaking my code for CUDA
Manuel Valera
mvalera-w at mail.sdsu.edu
Wed Mar 14 12:27:36 CDT 2018
Ok, thanks Matt. I made a smaller case with only the linear solver and a
25x25 matrix; the error I get in this case is:
[valera at node50 alone]$ mpirun -n 1 ./linsolve -vec_type cusp -mat_type
aijcusparse
laplacian.petsc !
TrivSoln loaded, size: 125 / 125
RHS loaded, size: 125 / 125
[0]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[0]PETSC ERROR: Null argument, when expecting valid pointer
[0]PETSC ERROR: Null Pointer: Parameter # 4
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.8.3-1817-g96b6f8a GIT
Date: 2018-02-28 10:19:08 -0600
[0]PETSC ERROR: ./linsolve on a cuda named node50 by valera Wed Mar 14
10:24:35 2018
[0]PETSC ERROR: Configure options PETSC_ARCH=cuda --with-cc=mpicc
--with-cxx=mpic++ --with-fc=mpifort --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3
--FOPTFLAGS=-O3 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
--with-cuda-arch=sm_60 --with-cusp=1 --with-cusp-dir=/home/valera/cusp
--with-vienacl=1 --download-fblaslapack=1 --download-hypre
[0]PETSC ERROR: #1 VecSetValues() line 851 in
/home/valera/petsc/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Object (seq) is not seqcusp or mpicusp
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.8.3-1817-g96b6f8a GIT
Date: 2018-02-28 10:19:08 -0600
[0]PETSC ERROR: ./linsolve on a cuda named node50 by valera Wed Mar 14
10:24:35 2018
[0]PETSC ERROR: Configure options PETSC_ARCH=cuda --with-cc=mpicc
--with-cxx=mpic++ --with-fc=mpifort --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3
--FOPTFLAGS=-O3 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
--with-cuda-arch=sm_60 --with-cusp=1 --with-cusp-dir=/home/valera/cusp
--with-vienacl=1 --download-fblaslapack=1 --download-hypre
[0]PETSC ERROR: #2 VecCUSPGetArrayRead() line 1792 in
/home/valera/petsc/src/vec/vec/impls/seq/seqcusp/veccusp2.cu
[0]PETSC ERROR: #3 VecAXPY_SeqCUSP() line 314 in
/home/valera/petsc/src/vec/vec/impls/seq/seqcusp/veccusp2.cu
[0]PETSC ERROR: #4 VecAXPY() line 612 in
/home/valera/petsc/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: #5 KSPSolve_GCR_cycle() line 60 in
/home/valera/petsc/src/ksp/ksp/impls/gcr/gcr.c
[0]PETSC ERROR: #6 KSPSolve_GCR() line 114 in
/home/valera/petsc/src/ksp/ksp/impls/gcr/gcr.c
[0]PETSC ERROR: #7 KSPSolve() line 669 in
/home/valera/petsc/src/ksp/ksp/interface/itfunc.c
soln maxval: 0.0000000000000000
soln minval: 0.0000000000000000
Norm: 11.180339887498949
Its: 0
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
Option left: name:-mat_type value: aijcusparse
[valera at node50 alone]$
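
For what it's worth, the "Null Pointer: Parameter # 4" from VecSetValues seems
to point at the array of values being NULL when the call is made. Below is a
minimal sketch of how the RHS entries would normally be set (names and entries
are illustrative, not the actual linsolve code):

#include <petscvec.h>

/* Illustrative sketch only: VecSetValues needs non-NULL index (argument 3)
 * and value (argument 4) arrays, and the vector must be assembled afterwards. */
static PetscErrorCode FillRHS(Vec rhs)
{
  PetscInt       rows[3] = {0, 1, 2};        /* made-up entries */
  PetscScalar    vals[3] = {1.0, 2.0, 3.0};
  PetscErrorCode ierr;

  ierr = VecSetValues(rhs, 3, rows, vals, INSERT_VALUES); CHKERRQ(ierr);
  ierr = VecAssemblyBegin(rhs); CHKERRQ(ierr);
  ierr = VecAssemblyEnd(rhs); CHKERRQ(ierr);
  return 0;
}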
I also see the configure options are not correct, so I guess it is still
linking against a different PETSc installation, but maybe we can try to make
it work as it is. I will let you know if I am able to link the correct PETSc
installation here.
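
On the "Option left: name:-mat_type value: aijcusparse" warning above: as far
as I understand, -vec_type and -mat_type are only consumed if the objects go
through VecSetFromOptions()/MatSetFromOptions(). A minimal sketch of such a
creation sequence (the size and names are illustrative, not from linsolve):

#include <petscksp.h>

/* Sketch, assuming the vectors and matrix are created through the options
 * database; without the *SetFromOptions() calls, -vec_type / -mat_type are
 * never queried and end up in the "Option left" warning. */
int main(int argc, char **argv)
{
  Vec            b, x;
  Mat            A;
  PetscInt       n = 125;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  ierr = VecCreate(PETSC_COMM_WORLD, &b); CHKERRQ(ierr);
  ierr = VecSetSizes(b, PETSC_DECIDE, n); CHKERRQ(ierr);
  ierr = VecSetFromOptions(b); CHKERRQ(ierr);   /* picks up -vec_type cusp/cuda/viennacl */
  ierr = VecDuplicate(b, &x); CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n); CHKERRQ(ierr);
  ierr = MatSetFromOptions(A); CHKERRQ(ierr);   /* picks up -mat_type aijcusparse/aijviennacl */
  ierr = MatSetUp(A); CHKERRQ(ierr);

  /* ... assemble (or MatLoad) the Laplacian, fill b, KSPSolve, ... */

  ierr = MatDestroy(&A); CHKERRQ(ierr);
  ierr = VecDestroy(&x); CHKERRQ(ierr);
  ierr = VecDestroy(&b); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

With that in place the source stays type-agnostic and the command line decides
where the Vec and Mat live.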
Best,
On Sun, Mar 11, 2018 at 9:00 AM, Matthew Knepley <knepley at gmail.com> wrote:
> On Fri, Mar 9, 2018 at 3:05 AM, Manuel Valera <mvalera-w at mail.sdsu.edu>
> wrote:
>
>> Hello all,
>>
>> I am working on porting a linear solver to GPUs for timing purposes. So
>> far I've been able to compile and run the CUSP libraries and compile PETSc
>> to be used with CUSP and ViennaCL. After the initial runs I noticed some
>> errors; they differ depending on the flags, and I would appreciate any
>> help interpreting them.
>>
>> The only elements in this program that use PETSc are the Laplacian matrix
>> (sparse), the RHS and X vectors, and a scatter PETSc object, so I would say
>> it's safe to pass the command-line arguments for the Mat/VecSetType()s
>> instead of changing the source code.
>>
>> If I use *-vec_type cuda -mat_type aijcusparse* or *-vec_type viennacl
>> -mat_type aijviennacl* I get the following:
>>
>
> These systems do not properly propagate errors. My only advice is to run a
> smaller problem and see.
>
>
>> [0]PETSC ERROR: ------------------------------
>> ------------------------------------------
>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/d
>> ocumentation/faq.html#valgrind
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
>> X to find memory corruption errors
>> [0]PETSC ERROR: likely location of problem given in stack below
>> [0]PETSC ERROR: --------------------- Stack Frames
>> ------------------------------------
>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
>> available,
>> [0]PETSC ERROR: INSTEAD the line number of the start of the function
>> [0]PETSC ERROR: is given.
>> [0]PETSC ERROR: [0] VecSetValues line 847 /home/valera/petsc/src/vec/vec
>> /interface/rvector.c
>> [0]PETSC ERROR: [0] VecSetType line 36 /home/valera/petsc/src/vec/vec
>> /interface/vecreg.c
>> [0]PETSC ERROR: [0] VecSetTypeFromOptions_Private line 1230
>> /home/valera/petsc/src/vec/vec/interface/vector.c
>> [0]PETSC ERROR: [0] VecSetFromOptions line 1271
>> /home/valera/petsc/src/vec/vec/interface/vector.c
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR: Signal received
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.8.3-1817-g96b6f8a GIT
>> Date: 2018-02-28 10:19:08 -0600
>> [0]PETSC ERROR: ./gcmSeamount on a cuda named node50 by valera Thu Mar 8
>> 09:50:51 2018
>> [0]PETSC ERROR: Configure options PETSC_ARCH=cuda --with-cc=mpicc
>> --with-cxx=mpic++ --with-fc=mpifort --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3
>> --FOPTFLAGS=-O3 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
>> --with-cuda-arch=sm_60 --with-cusp=1 --with-cusp-dir=/home/valera/cusp
>> --with-vienacl=1 --download-fblaslapack=1 --download-hypre
>> [0]PETSC ERROR: #5 User provided function() line 0 in unknown file
>> ------------------------------------------------------------
>> --------------
>>
>> This seems to be a memory access out of range; maybe my vector is too big
>> for my CUDA system? How do I assess that?
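
One way to assess that, assuming a CUDA-enabled build: compare the vector's
footprint against the free device memory reported by cudaMemGetInfo(). A rough
sketch (the helper name is made up, not part of the actual code):

#include <stdio.h>
#include <cuda_runtime.h>
#include <petscvec.h>

/* Rough sketch: estimate whether one copy of the vector's values fits in the
 * free memory on the GPU; CUSP/cuSPARSE add their own storage on top. */
static PetscErrorCode CheckVecFitsOnDevice(Vec v)
{
  PetscInt       n;
  size_t         freeMem, totalMem, need;
  PetscErrorCode ierr;

  ierr = VecGetSize(v, &n); CHKERRQ(ierr);
  need = (size_t)n * sizeof(PetscScalar);
  if (cudaMemGetInfo(&freeMem, &totalMem) != cudaSuccess)
    SETERRQ(PETSC_COMM_SELF, PETSC_ERR_LIB, "cudaMemGetInfo failed");
  printf("vector needs ~%lu bytes; device has %lu bytes free of %lu total\n",
         (unsigned long)need, (unsigned long)freeMem, (unsigned long)totalMem);
  return 0;
}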
>>
>>
>> Next, if I use *-vec_type cusp -mat_type aijcusparse* I get something
>> different and more interesting:
>>
>
> We need to see the entire error message, since it has the stack.
>
> This seems like a logic error, but could definitely be on our end. Here is
> how I think about these:
>
> 1) We have nightly test solves, so at least some solver configuration
> works
>
> 2) Some vector is marked read-only (this happens for input to solvers), but
> someone is trying to update it. The stack will tell me where this is
> happening.
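
A minimal illustration of that read-only lock (just a sketch, assuming the
VecLockPush()/VecLockPop() calls available in this PETSc version; the names
are made up):

#include <petscvec.h>

/* Sketch only: the solver marks its input vector read-only; asking for write
 * access to it afterwards fails with "Vec is locked read only". */
static PetscErrorCode DemoReadOnlyLock(Vec b)
{
  PetscScalar    *a;
  PetscErrorCode ierr;

  ierr = VecLockPush(b); CHKERRQ(ierr);         /* mark b read-only, as KSPSolve does */
  ierr = VecGetArray(b, &a); CHKERRQ(ierr);     /* write access now triggers the error */
  ierr = VecRestoreArray(b, &a); CHKERRQ(ierr);
  ierr = VecLockPop(b); CHKERRQ(ierr);
  return 0;
}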
>
> Thanks,
>
> Matt
>
>
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR: Object is in wrong state
>> [0]PETSC ERROR: Vec is locked read only, argument # 3
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.8.3-1817-g96b6f8a GIT
>> Date: 2018-02-28 10:19:08 -0600
>> [0]PETSC ERROR: ./gcmSeamount on a cuda named node50 by valera Thu Mar 8
>> 10:02:19 2018
>> [0]PETSC ERROR: Configure options PETSC_ARCH=cuda --with-cc=mpicc
>> --with-cxx=mpic++ --with-fc=mpifort --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3
>> --FOPTFLAGS=-O3 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
>> --with-cuda-arch=sm_60 --with-cusp=1 --with-cusp-dir=/home/valera/cusp
>> --with-vienacl=1 --download-fblaslapack=1 --download-hypre
>> [0]PETSC ERROR: #48 KSPSolve() line 615 in /home/valera/petsc/src/ksp/ksp
>> /interface/itfunc.c
>> PETSC_SOLVER_ONLY 6.8672990892082453E-005 s
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR: Invalid argument
>> [0]PETSC ERROR: Object (seq) is not seqcusp or mpicusp
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.8.3-1817-g96b6f8a GIT
>> Date: 2018-02-28 10:19:08 -0600
>> [0]PETSC ERROR: ./gcmSeamount on a cuda named node50 by valera Thu Mar 8
>> 10:02:19 2018
>> [0]PETSC ERROR: Configure options PETSC_ARCH=cuda --with-cc=mpicc
>> --with-cxx=mpic++ --with-fc=mpifort --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3
>> --FOPTFLAGS=-O3 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
>> --with-cuda-arch=sm_60 --with-cusp=1 --with-cusp-dir=/home/valera/cusp
>> --with-vienacl=1 --download-fblaslapack=1 --download-hypre
>> [0]PETSC ERROR: #49 VecCUSPGetArrayReadWrite() line 1718 in
>> /home/valera/petsc/src/vec/vec/impls/seq/seqcusp/veccusp2.cu
>> [0]PETSC ERROR: #50 VecScatterCUSP_StoS() line 269 in
>> /home/valera/petsc/src/vec/vec/impls/seq/seqcusp/vecscattercusp.cu
>>
>>
>>
>>
>>
>> And it yields a "solution" to the system and also a log at the end:
>>
>>
>>
>>
>>
>> ./gcmSeamount on a cuda named node50 with 1 processor, by valera Thu Mar
>> 8 10:02:24 2018
>> Using Petsc Development GIT revision: v3.8.3-1817-g96b6f8a GIT Date:
>> 2018-02-28 10:19:08 -0600
>>
>> Max Max/Min Avg Total
>> Time (sec): 4.573e+00 1.00000 4.573e+00
>> Objects: 8.100e+01 1.00000 8.100e+01
>> Flop: 3.492e+07 1.00000 3.492e+07 3.492e+07
>> Flop/sec: 7.637e+06 1.00000 7.637e+06 7.637e+06
>> Memory: 2.157e+08 1.00000 2.157e+08
>> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
>> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
>> MPI Reductions: 0.000e+00 0.00000
>>
>> Flop counting convention: 1 flop = 1 real number operation of type
>> (multiply/divide/add/subtract)
>> e.g., VecAXPY() for real vectors of length N
>> --> 2N flop
>> and VecAXPY() for complex vectors of length N
>> --> 8N flop
>>
>> Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages
>> --- -- Message Lengths -- -- Reductions --
>> Avg %Total Avg %Total counts
>> %Total Avg %Total counts %Total
>> 0: Main Stage: 4.5729e+00 100.0% 3.4924e+07 100.0% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>>
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>> See the 'Profiling' chapter of the users' manual for details on
>> interpreting output.
>> Phase summary info:
>> Count: number of times phase was executed
>> Time and Flop: Max - maximum over all processors
>> Ratio - ratio of maximum to minimum over all processors
>> Mess: number of messages sent
>> Avg. len: average message length (bytes)
>> Reduct: number of global reductions
>> Global: entire computation
>> Stage: stages of a computation. Set stages with PetscLogStagePush()
>> and PetscLogStagePop().
>> %T - percent time in this phase %F - percent flop in this
>> phase
>> %M - percent messages in this phase %L - percent message
>> lengths in this phase
>> %R - percent reductions in this phase
>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time
>> over all processors)
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>>
>>
>> ##########################################################
>> # #
>> # WARNING!!! #
>> # #
>> # This code was compiled with a debugging option, #
>> # To get timing results run ./configure #
>> # using --with-debugging=no, the performance will #
>> # be generally two or three times faster. #
>> # #
>> ##########################################################
>>
>>
>> Event Count Time (sec) Flop
>> --- Global --- --- Stage --- Total
>> Max Ratio Max Ratio Max Ratio Mess Avg len
>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>>
>> --- Event Stage 0: Main Stage
>>
>> MatLUFactorNum 1 1.0 4.9502e-02 1.0 3.49e+07 1.0 0.0e+00 0.0e+00
>> 0.0e+00 1100 0 0 0 1100 0 0 0 706
>> MatILUFactorSym 1 1.0 1.9642e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAssemblyBegin 2 1.0 6.9141e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAssemblyEnd 2 1.0 2.6612e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 6 0 0 0 0 6 0 0 0 0 0
>> MatGetRowIJ 1 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatGetOrdering 1 1.0 1.7186e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatLoad 1 1.0 1.1575e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
>> MatView 1 1.0 8.0877e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
>> MatCUSPCopyTo 1 1.0 2.4664e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
>> VecSet 68 1.0 5.1665e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
>> VecAssemblyBegin 17 1.0 5.2691e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAssemblyEnd 17 1.0 4.3631e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecScatterBegin 15 1.0 1.5345e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecCUSPCopyFrom 1 1.0 1.1199e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> KSPSetUp 1 1.0 5.1929e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
>> PCSetUp 2 1.0 8.6590e-02 1.0 3.49e+07 1.0 0.0e+00 0.0e+00
>> 0.0e+00 2100 0 0 0 2100 0 0 0 403
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>>
>> Memory usage is given in bytes:
>>
>> Object Type Creations Destructions Memory Descendants'
>> Mem.
>> Reports information only for process 0.
>>
>> --- Event Stage 0: Main Stage
>>
>> Matrix 3 1 52856972 0.
>> Matrix Null Space 1 1 608 0.
>> Vector 66 3 3414600 0.
>> Vector Scatter 1 1 680 0.
>> Viewer 3 2 1680 0.
>> Krylov Solver 1 0 0 0.
>> Preconditioner 2 1 864 0.
>> Index Set 4 1 800 0.
>> ============================================================
>> ============================================================
>> Average time to get PetscTime(): 9.53674e-08
>> #PETSc Option Table entries:
>> -ksp_view
>> -log_view
>> -mat_type aijcusparse
>> -matload_block_size 1
>> -vec_type cusp
>> #End of PETSc Option Table entries
>> Compiled without FORTRAN kernels
>> Compiled with full precision matrices (default)
>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
>> Configure options: PETSC_ARCH=cuda --with-cc=mpicc --with-cxx=mpic++
>> --with-fc=mpifort --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3
>> --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
>> --with-cuda-arch=sm_60 --with-cusp=1 --with-cusp-dir=/home/valera/cusp
>> --with-vienacl=1 --download-fblaslapack=1 --download-hypre
>> -----------------------------------------
>> Libraries compiled on Mon Mar 5 16:37:18 2018 on node50
>> Machine characteristics: Linux-3.10.0-693.17.1.el7.x86_
>> 64-x86_64-with-centos-7.2.1511-Core
>> Using PETSc directory: /home/valera/petsc
>> Using PETSc arch: cuda
>> -----------------------------------------
>>
>> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings
>> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector
>> -fvisibility=hidden -O3
>> Using Fortran compiler: mpifort -fPIC -Wall -ffree-line-length-0
>> -Wno-unused-dummy-argument -O3
>> -----------------------------------------
>>
>> Using include paths: -I/home/valera/petsc/cuda/include
>> -I/home/valera/petsc/include -I/home/valera/petsc/include
>> -I/home/valera/petsc/cuda/include -I/home/valera/cusp/
>> -I/usr/local/cuda/include
>> -----------------------------------------
>>
>> Using C linker: mpicc
>> Using Fortran linker: mpifort
>> Using libraries: -Wl,-rpath,/home/valera/petsc/cuda/lib
>> -L/home/valera/petsc/cuda/lib -lpetsc -Wl,-rpath,/home/valera/petsc/cuda/lib
>> -L/home/valera/petsc/cuda/lib -Wl,-rpath,/usr/local/cuda/lib64
>> -L/usr/local/cuda/lib64 -Wl,-rpath,/usr/lib64/openmpi/lib
>> -L/usr/lib64/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5
>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lHYPRE -lflapack -lfblas -lm
>> -lcufft -lcublas -lcudart -lcusparse -lX11 -lstdc++ -ldl -lmpi_usempi
>> -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath
>> -lpthread -lstdc++ -ldl
>> -----------------------------------------
>>
>>
>>
>> Thanks for your help,
>>
>> Manuel
>>
>>
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>
>