[petsc-users] Using PETSc with GPU supported SuperLU_Dist
Abhyankar, Shrirang G
shrirang.abhyankar at pnnl.gov
Sun Feb 23 15:33:11 CST 2020
I was using CUDA v10.2. Switching to 9.2 gives a clean make test.
Thanks,
Shri
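The error "CUDA driver version is insufficient for CUDA runtime version" means the installed GPU driver supports an older CUDA release than the toolkit PETSc was built against, which is consistent with the 9.2 toolkit working where 10.2 did not. CUDA reports both versions as an integer encoded as 1000*major + 10*minor (the values returned by cudaDriverGetVersion and cudaRuntimeGetVersion). Below is a minimal Python sketch of that comparison. The function names are hypothetical, the driver value is an illustrative assumption (the thread does not state the driver version on newell), and the check ignores CUDA compatibility packages.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split CUDA's integer version (1000*major + 10*minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


def driver_supports_runtime(driver_version: int, runtime_version: int) -> bool:
    """Return True when the driver's supported CUDA version is at least the runtime's.

    Simplified rule: compatibility packages are deliberately not modeled here.
    """
    return driver_version >= runtime_version


if __name__ == "__main__":
    runtime = 10020        # CUDA 10.2, as in --with-cuda-dir=/share/apps/cuda/10.2
    assumed_driver = 9020  # ASSUMPTION: a driver that supports up to CUDA 9.2
    print("runtime:", decode_cuda_version(runtime))
    print("driver :", decode_cuda_version(assumed_driver))
    print("compatible:", driver_supports_runtime(assumed_driver, runtime))
```

On a real machine, the driver-side value can usually be read from the "CUDA Version" field that nvidia-smi prints, and the runtime side from the toolkit directory passed to --with-cuda-dir.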
From: petsc-users <petsc-users-bounces at mcs.anl.gov> on behalf of "Abhyankar, Shrirang G via petsc-users" <petsc-users at mcs.anl.gov>
Reply-To: "Abhyankar, Shrirang G" <shrirang.abhyankar at pnnl.gov>
Date: Sunday, February 23, 2020 at 3:10 PM
To: petsc-users <petsc-users at mcs.anl.gov>, Junchao Zhang <jczhang at mcs.anl.gov>
Subject: Re: [petsc-users] Using PETSc with GPU supported SuperLU_Dist
I am now getting a CUDA driver version error. Any suggestions?
petsc:maint$ make test
Running test examples to verify correct installation
Using PETSC_DIR=/people/abhy245/software/petsc and PETSC_ARCH=debug-mode-newell
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Error in system call
[0]PETSC ERROR: error in cudaSetDevice CUDA driver version is insufficient for CUDA runtime version
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.12.4, unknown
[0]PETSC ERROR: ./ex19 on a debug-mode-newell named newell01.pnl.gov by abhy245 Sun Feb 23 12:49:55 2020
[0]PETSC ERROR: Configure options --download-fblaslapack --download-make --download-metis --download-parmetis --download-scalapack --download-suitesparse --download-superlu_dist-gpu=1 --download-superlu_dist=1 --with-cc=mpicc --with-clanguage=c++ --with-cuda-dir=/share/apps/cuda/10.2 --with-cuda=1 --with-cxx-dialect=C++11 --with-cxx=mpicxx --with-fc=mpif77 --with-openmp=1 PETSC_ARCH=debug-mode-newell
[0]PETSC ERROR: #1 PetscCUDAInitialize() line 261 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[0]PETSC ERROR: #2 PetscOptionsCheckInitial_Private() line 652 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[0]PETSC ERROR: #3 PetscInitialize() line 1010 in /qfs/people/abhy245/software/petsc/src/sys/objects/pinit.c
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[46518,1],0]
Exit code: 88
--------------------------------------------------------------------------
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[1]PETSC ERROR: Error in system call
[1]PETSC ERROR: [0]PETSC ERROR: Error in system call
[0]PETSC ERROR: error in cudaGetDeviceCount CUDA driver version is insufficient for CUDA runtime version
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
error in cudaGetDeviceCount CUDA driver version is insufficient for CUDA runtime version
[1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[1]PETSC ERROR: [0]PETSC ERROR: Petsc Release Version 3.12.4, unknown
[0]PETSC ERROR: ./ex19 on a debug-mode-newell named newell01.pnl.gov by abhy245 Sun Feb 23 12:49:57 2020
[0]PETSC ERROR: Configure options --download-fblaslapack --download-make --download-metis --download-parmetis --download-scalapack --download-suitesparse --download-superlu_dist-gpu=1 --download-superlu_dist=1 --with-cc=mpicc --with-clanguage=c++ --with-cuda-dir=/share/apps/cuda/10.2 --with-cuda=1 --with-cxx-dialect=C++11 --with-cxx=mpicxx --with-fc=mpif77 --with-openmp=1 PETSC_ARCH=debug-mode-newell
[0]PETSC ERROR: #1 PetscCUDAInitialize() line 254 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[0]PETSC ERROR: #2 PetscOptionsCheckInitial_Private() line 652 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[0]PETSC ERROR: #3 PetscInitialize() line 1010 in /qfs/people/abhy245/software/petsc/src/sys/objects/pinit.c
Petsc Release Version 3.12.4, unknown
[1]PETSC ERROR: ./ex19 on a debug-mode-newell named newell01.pnl.gov by abhy245 Sun Feb 23 12:49:57 2020
[1]PETSC ERROR: Configure options --download-fblaslapack --download-make --download-metis --download-parmetis --download-scalapack --download-suitesparse --download-superlu_dist-gpu=1 --download-superlu_dist=1 --with-cc=mpicc --with-clanguage=c++ --with-cuda-dir=/share/apps/cuda/10.2 --with-cuda=1 --with-cxx-dialect=C++11 --with-cxx=mpicxx --with-fc=mpif77 --with-openmp=1 PETSC_ARCH=debug-mode-newell
[1]PETSC ERROR: #1 PetscCUDAInitialize() line 254 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[1]PETSC ERROR: #2 PetscOptionsCheckInitial_Private() line 652 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[1]PETSC ERROR: #3 PetscInitialize() line 1010 in /qfs/people/abhy245/software/petsc/src/sys/objects/pinit.c
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[46522,1],0]
Exit code: 88
--------------------------------------------------------------------------
1,2c1,21
< lid velocity = 0.0025, prandtl # = 1., grashof # = 1.
< Number of SNES iterations = 2
---
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Error in system call
> [0]PETSC ERROR: error in cudaSetDevice CUDA driver version is insufficient for CUDA runtime version
> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.12.4, unknown
> [0]PETSC ERROR: ./ex19 on a debug-mode-newell named newell01.pnl.gov by abhy245 Sun Feb 23 12:50:00 2020
> [0]PETSC ERROR: Configure options --download-fblaslapack --download-make --download-metis --download-parmetis --download-scalapack --download-suitesparse --download-superlu_dist-gpu=1 --download-superlu_dist=1 --with-cc=mpicc --with-clanguage=c++ --with-cuda-dir=/share/apps/cuda/10.2 --with-cuda=1 --with-cxx-dialect=C++11 --with-cxx=mpicxx --with-fc=mpif77 --with-openmp=1 PETSC_ARCH=debug-mode-newell
> [0]PETSC ERROR: #1 PetscCUDAInitialize() line 261 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
> [0]PETSC ERROR: #2 PetscOptionsCheckInitial_Private() line 652 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
> [0]PETSC ERROR: #3 PetscInitialize() line 1010 in /qfs/people/abhy245/software/petsc/src/sys/objects/pinit.c
> --------------------------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec detected that one or more processes exited with non-zero status, thus causing
> the job to be terminated. The first process to do so was:
>
> Process name: [[46545,1],0]
> Exit code: 88
> --------------------------------------------------------------------------
/people/abhy245/software/petsc/src/snes/examples/tutorials
Possible problem with ex19 running with superlu_dist, diffs above
=========================================
Possible error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Error in system call
[0]PETSC ERROR: error in cudaSetDevice CUDA driver version is insufficient for CUDA runtime version
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.12.4, unknown
[0]PETSC ERROR: ./ex5f on a debug-mode-newell named newell01.pnl.gov by abhy245 Sun Feb 23 12:50:04 2020
[0]PETSC ERROR: Configure options --download-fblaslapack --download-make --download-metis --download-parmetis --download-scalapack --download-suitesparse --download-superlu_dist-gpu=1 --download-superlu_dist=1 --with-cc=mpicc --with-clanguage=c++ --with-cuda-dir=/share/apps/cuda/10.2 --with-cuda=1 --with-cxx-dialect=C++11 --with-cxx=mpicxx --with-fc=mpif77 --with-openmp=1 PETSC_ARCH=debug-mode-newell
[0]PETSC ERROR: #1 PetscCUDAInitialize() line 261 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[0]PETSC ERROR: #2 PetscOptionsCheckInitial_Private() line 652 in /qfs/people/abhy245/software/petsc/src/sys/objects/init.c
[0]PETSC ERROR: PetscInitialize:Checking initial options
Unable to initialize PETSc
--------------------------------------------------------------------------
mpiexec has exited due to process rank 0 with PID 0 on
node newell01 exiting improperly. There are three reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.
This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
You can avoid this message by specifying -quiet on the mpiexec command line.
--------------------------------------------------------------------------
Completed test examples
From: Satish Balay <balay at mcs.anl.gov>
Reply-To: petsc-users <petsc-users at mcs.anl.gov>
Date: Saturday, February 22, 2020 at 9:00 PM
To: Junchao Zhang <jczhang at mcs.anl.gov>
Cc: "Abhyankar, Shrirang G" <shrirang.abhyankar at pnnl.gov>, petsc-users <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Using PETSc with GPU supported SuperLU_Dist
The fix is now in both maint and master
https://gitlab.com/petsc/petsc/-/merge_requests/2555
Satish
On Sat, 22 Feb 2020, Junchao Zhang via petsc-users wrote:
We have hit this error before and know the cause. We will fix it soon.
--Junchao Zhang
On Sat, Feb 22, 2020 at 11:43 AM Abhyankar, Shrirang G via petsc-users <petsc-users at mcs.anl.gov> wrote:
> Thanks, Satish. Configure and make go through fine, but I am getting an
> undefined reference error for VecGetArrayWrite_SeqCUDA.
>
>
>
> Shri
>
> From: Satish Balay <balay at mcs.anl.gov>
> Reply-To: petsc-users <petsc-users at mcs.anl.gov>
> Date: Saturday, February 22, 2020 at 8:25 AM
> To: "Abhyankar, Shrirang G" <shrirang.abhyankar at pnnl.gov>
> Cc: petsc-users <petsc-users at mcs.anl.gov>
> Subject: Re: [petsc-users] Using PETSc with GPU supported SuperLU_Dist
>
>
>
> On Sat, 22 Feb 2020, Abhyankar, Shrirang G via petsc-users wrote:
>
>
>
> Hi,
>
> I want to install PETSc with GPU supported SuperLU_Dist. What are the
> configure options I should be using?
>
>
>
>
>
> Shri,
>
>
>
>
>
> if self.framework.argDB['download-superlu_dist-gpu']:
>   self.cuda   = framework.require('config.packages.cuda',self)
>   self.openmp = framework.require('config.packages.openmp',self)
>   self.deps   = [self.mpi,self.blasLapack,self.cuda,self.openmp]
>
> <<<<<
>
>
>
> So try:
>
>
>
> --with-cuda=1 --download-superlu_dist=1 --download-superlu_dist-gpu=1
> --with-openmp=1 [plus the usual MPI and BLAS/LAPACK options]
>
>
>
> Satish
>
>
>
>
>
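The configure options suggested above can be collected in a small script so that the GPU-related flags stay together. This is a minimal sketch, not PETSc's own tooling: the helper names superlu_dist_gpu_options and configure_command are hypothetical, and the script only prints the command line rather than running configure. The four base flags and the example CUDA path are the ones that appear in this thread.

```python
import shlex


def superlu_dist_gpu_options(cuda_dir=None):
    """Return the configure flags from this thread for GPU-enabled SuperLU_Dist."""
    options = [
        "--with-cuda=1",
        "--download-superlu_dist=1",
        "--download-superlu_dist-gpu=1",
        "--with-openmp=1",
    ]
    if cuda_dir is not None:
        # Optional: point configure at a specific CUDA toolkit installation.
        options.append("--with-cuda-dir=" + cuda_dir)
    return options


def configure_command(options):
    """Join the flags into a shell-safe ./configure command line."""
    return " ".join(["./configure"] + [shlex.quote(opt) for opt in options])


if __name__ == "__main__":
    # Path taken from the configure line reported in this thread.
    print(configure_command(superlu_dist_gpu_options("/share/apps/cuda/10.2")))
```

The MPI compiler wrappers and BLAS/LAPACK options still need to be appended for a complete build, as noted in the reply.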