[petsc-dev] Floating point exception in ex47cu.cu

Евгений Козлов neoveneficus at gmail.com
Mon Apr 25 08:24:20 CDT 2011


Yes, this was the problem. Now it works.

Thank you.
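
For anyone hitting the same error: the fix suggested below amounts to re-running configure with the original options plus an explicit CUDA architecture. A sketch, assuming the command is run from the PETSc source directory and that sm_13 (the value Victor uses) matches the GPU; the correct value depends on the card's compute capability:

```shell
# Same options as in the error log, plus --with-cuda-arch.
# sm_13 is an assumption taken from Victor's example; choose the value for your GPU.
./configure --prefix=/home/kukushkinav \
  --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl \
  --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin \
  --with-cuda=1 --with-cusp=1 --with-thrust=1 \
  --with-thrust-dir=/home/kukushkinav/include \
  --with-cusp-dir=/home/kukushkinav/include \
  --with-cuda-arch=sm_13
```

After reconfiguring, PETSc and the example need to be rebuilt so that the device code is compiled for the new architecture.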

2011/4/25 Victor Minden <victorminden at gmail.com>:
> Eugene,
> Based on
>  Configure options --prefix=/home/kukushkinav
> --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl
> --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin --with-cuda=1
> --with-cusp=1 --with-thrust=1
> --with-thrust-dir=/home/kukushkinav/include
> --with-cusp-dir=/home/kukushkinav/include
> It looks like you may not be setting the configure flag
> --with-cuda-arch=XXX.  Without it, PETSc uses the nvcc default, which when
> I last looked was sm_10.  The problem is that sm_10 doesn't
> support double precision.  Maybe this is your problem?  For example, I use
> --with-cuda-arch=sm_13, which corresponds to NVIDIA compute capability 1.3.
> Cheers,
> Victor
> ---
> Victor L. Minden
>
> Tufts University
> School of Engineering
> Class of 2012
>
>
> On Fri, Apr 22, 2011 at 12:22 PM, Евгений Козлов <neoveneficus at gmail.com>
> wrote:
>>
>> Hello,
>>
>> I am interested in using PETSc to solve sparse linear systems
>> iteratively on multi-GPU systems.
>>
>> First of all, I compiled PETSc-dev and tried to run some of the
>> examples that run on the GPU.
>>
>> I found src/snes/examples/tutorials/ex47cu.cu. It compiled and
>> ran.
>>
>> Output of the original program src/snes/examples/tutorials/ex47cu.cu:
>>
>> [0]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [0]PETSC ERROR: Floating point exception!
>> [0]PETSC ERROR: User provided compute function generated a Not-a-Number!
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: Petsc Development HG revision:
>> d3e10315d68b1dd5481adb2889c7d354880da362  HG Date: Wed Apr 20 21:03:56
>> 2011 -0500
>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> [0]PETSC ERROR: See docs/index.html for manual pages.
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: ex47cu on a arch-linu named cn03 by kukushkinav Fri
>> Apr 22 18:33:32 2011
>> [0]PETSC ERROR: Libraries linked from /home/kukushkinav/lib
>> [0]PETSC ERROR: Configure run at Thu Apr 21 19:18:22 2011
>> [0]PETSC ERROR: Configure options --prefix=/home/kukushkinav
>> --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl
>> --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin --with-cuda=1
>> --with-cusp=1 --with-thrust=1
>> --with-thrust-dir=/home/kukushkinav/include
>> --with-cusp-dir=/home/kukushkinav/include
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: SNESSolve_LS() line 167 in src/snes/impls/ls/ls.c
>> [0]PETSC ERROR: SNESSolve() line 2407 in src/snes/interface/snes.c
>> [0]PETSC ERROR: main() line 38 in src/snes/examples/tutorials/ex47cu.cu
>> application called MPI_Abort(MPI_COMM_WORLD, 72) - process 0
>>
>> Then, to find the place in the source that causes the problem, I
>> changed the function in struct ApplyStencil to
>>
>> void operator()(Tuple t) { thrust::get<0>(t) = 1; }
>>
>> Result:
>> [0]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [0]PETSC ERROR: Floating point exception!
>> [0]PETSC ERROR: Infinite or not-a-number generated in mdot, entry 0!
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: Petsc Development HG revision:
>> d3e10315d68b1dd5481adb2889c7d354880da362  HG Date: Wed Apr 20 21:03:56
>> 2011 -0500
>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> [0]PETSC ERROR: See docs/index.html for manual pages.
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: ex47cu on a arch-linu named cn11 by kukushkinav Fri
>> Apr 22 18:58:04 2011
>> [0]PETSC ERROR: Libraries linked from /home/kukushkinav/lib
>> [0]PETSC ERROR: Configure run at Thu Apr 21 19:18:22 2011
>> [0]PETSC ERROR: Configure options --prefix=/home/kukushkinav
>> --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl
>> --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin --with-cuda=1
>> --with-cusp=1 --with-thrust=1
>> --with-thrust-dir=/home/kukushkinav/include
>> --with-cusp-dir=/home/kukushkinav/include
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: VecMDot() line 1146 in src/vec/vec/interface/rvector.c
>> [0]PETSC ERROR: KSPGMRESClassicalGramSchmidtOrthogonalization() line
>> 66 in src/ksp/ksp/impls/gmres/borthog2.c
>> [0]PETSC ERROR: GMREScycle() line 161 in src/ksp/ksp/impls/gmres/gmres.c
>> [0]PETSC ERROR: KSPSolve_GMRES() line 244 in
>> src/ksp/ksp/impls/gmres/gmres.c
>> [0]PETSC ERROR: KSPSolve() line 426 in src/ksp/ksp/interface/itfunc.c
>> [0]PETSC ERROR: SNES_KSPSolve() line 3107 in src/snes/interface/snes.c
>> [0]PETSC ERROR: SNESSolve_LS() line 190 in src/snes/impls/ls/ls.c
>> [0]PETSC ERROR: SNESSolve() line 2407 in src/snes/interface/snes.c
>> [0]PETSC ERROR: main() line 38 in src/snes/examples/tutorials/ex47cu.cu
>>
>> Red Hat 5.5, CUDA 3.2
>>
>> Question 1: Is this a problem on my side or a bug in the algorithm?
>>
>> Question 2: Where can I find a simple document or example that describes
>> how to solve sparse linear systems on multi-GPU systems using PETSc?
>>
>>
>> --
>> Best regards,
>> Eugene
>
>



-- 
Best regards,
Евгений


