[petsc-dev] cusparse error

Junchao Zhang junchao.zhang at gmail.com
Wed Dec 9 20:01:00 CST 2020


Could be GPU resource competition. Note this test uses nsize=8.
--Junchao Zhang
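
To put a number on the resource-competition guess, here is a minimal sketch of how 8 ranks could pile up on a small number of GPUs. It assumes a simple round-robin mapping (rank % number of devices), which is an illustration only and not necessarily how PETSc or the CI machine assigns devices. The device counts are hypothetical, and `ranks_per_device` is a made-up helper name.

```python
from collections import Counter


def ranks_per_device(num_ranks, num_devices):
    """Count ranks per GPU under an ASSUMED round-robin mapping (rank % num_devices)."""
    if num_devices < 1:
        raise ValueError("need at least one device")
    return Counter(rank % num_devices for rank in range(num_ranks))


if __name__ == "__main__":
    # nsize=8 comes from the test; the device counts below are hypothetical.
    for num_devices in (1, 2, 8):
        load = ranks_per_device(8, num_devices)
        print(f"{num_devices} device(s): ranks per device = {dict(load)}")
```

Under this assumption, every rank on a shared device creates its own cuSPARSE handle and context, so the fewer devices there are, the more handles compete for the same GPU memory.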


On Wed, Dec 9, 2020 at 7:15 PM Mark Adams <mfadams at lbl.gov> wrote:

> And this is a CUDA 11 complex build:
> https://gitlab.com/petsc/petsc/-/jobs/901108135
>
> On Wed, Dec 9, 2020 at 8:11 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>> My MR is generating an error. The error message says cusparse has not
>> been initialized, so I added a cusparse init, but I still get the error
>> (appended, *adams/landau-gpu-assembly
>> <https://gitlab.com/petsc/petsc/-/tree/adams/landau-gpu-assembly>*).
>> Any ideas would be appreciated.
>>
>> I am trying to reproduce this on Summit and it fails with a timeout limit
>> of 60s, but it only runs for a few seconds (see timers). Any ideas?
>>
>> 19:58 adams/landau-gpu-assembly= ~/petsc$ make -f gmakefile test
>> search='ksp_ksp_tutorials-ex71_bddc_cusparse'
>> PETSC_ARCH=arch-summit-opt-gnu-cuda
>> Using MAKEFLAGS: PETSC_ARCH=arch-summit-opt-gnu-cuda
>> search=ksp_ksp_tutorials-ex71_bddc_cusparse
>>         TEST
>> arch-summit-opt-gnu-cuda/tests/counts/ksp_ksp_tutorials-ex71_bddc_cusparse.counts
>> not ok ksp_ksp_tutorials-ex71_bddc_cusparse # Exceeded timeout limit of
>> 60 s
>>  ok ksp_ksp_tutorials-ex71_bddc_cusparse # SKIP Command failed so no diff
>>
>> # -------------
>> #   Summary
>> # -------------
>> # FAILED ksp_ksp_tutorials-ex71_bddc_cusparse
>> # success 0/1 tests (0.0%)
>> # failed 1/1 tests (100.0%)
>> # todo 0/1 tests (0.0%)
>> # skip 0/1 tests (0.0%)
>> #
>> # Wall clock time for tests: 3 sec
>> # Approximate CPU time (not incl. build time): 3.14 sec
>>
>>
>>
>>
>>
>> not ok ksp_ksp_tutorials-ex71_bddc_cusparse # Error code: 201
>> # [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> # [1]PETSC ERROR: GPU error
>> # [1]PETSC ERROR: cuSPARSE error 1 (CUSPARSE_STATUS_NOT_INITIALIZED) : initialization error
>> # [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> # [1]PETSC ERROR: Petsc Development GIT revision: v3.14.2-85-gd60087d GIT Date: 2020-12-09 17:49:59 -0500
>> # [1]PETSC ERROR: ../ex71 on a named frog by petsc Wed Dec 9 18:41:10 2020
>> # [1]PETSC ERROR: Configure options --package-prefix-hash=/home/petsc/petsc-hash-pkgs --with-make-test-np=2 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" CXXOPTFLAGS="-g -O" --with-scalar-type=complex --with-precision=single --with-cuda-dir=/usr/local/cuda-11.0 PETSC_ARCH=arch-ci-linux-cuda11-complex
>> # [1]PETSC ERROR: #1 MatConvert_SeqAIJ_SeqAIJCUSPARSE() line 2708 in /home/petsc/builds/KFnbdjNX/0/petsc/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
>> # [1]PETSC ERROR: #2 MatCreate_SeqAIJCUSPARSE() line 2739 in /home/petsc/builds/KFnbdjNX/0/petsc/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
>>
>