[petsc-dev] Floating point exception in ex47cu.cu

Eugene Kozlov neoveneficus at gmail.com
Mon Apr 25 09:22:45 CDT 2011


I've checked. Without this option, 'arch' is set to 'sm_13' in the makefiles.
But I don't remember what I did before it started working.
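
As a rough illustration of the rule discussed in this thread (sm_10 has no
double-precision support, sm_13 does), here is a minimal C++ sketch that checks
an nvcc-style architecture string. The helper names parse_sm_arch and
supports_double are hypothetical and are not part of PETSc or CUDA; the only
fact relied on is that double precision requires compute capability 1.3 or
higher.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a PETSc or CUDA API): parse an nvcc-style
// architecture name such as "sm_13" into its numeric part (13).
// Returns -1 if the string is not of the form "sm_<digits>".
int parse_sm_arch(const std::string& arch) {
    const std::string prefix = "sm_";
    if (arch.size() <= prefix.size() ||
        arch.compare(0, prefix.size(), prefix) != 0) {
        return -1;
    }
    int value = 0;
    for (std::size_t i = prefix.size(); i < arch.size(); ++i) {
        const char c = arch[i];
        if (c < '0' || c > '9') {
            return -1;  // reject anything that is not a decimal digit
        }
        value = value * 10 + (c - '0');
    }
    return value;
}

// Double-precision arithmetic needs compute capability 1.3 (sm_13) or higher.
// Unparseable strings yield -1 above and are therefore reported as unsupported.
bool supports_double(const std::string& arch) {
    return parse_sm_arch(arch) >= 13;
}
```

With these helpers, supports_double("sm_10") is false and
supports_double("sm_13") is true, which is why --with-cuda-arch=sm_13 was
suggested for a double-precision build.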

2011/4/25 Satish Balay <balay at mcs.anl.gov>:
> That would be weird. Currently configure defaults to sm_10 for single
> precision, and sm_13 for double precision. [I've verified this again]
>
> Can you send configure.log [from a run without the cuda-arch option] to
> petsc-maint?
>
>
> Satish
>
>
> On Mon, 25 Apr 2011, Евгений Козлов wrote:
>
>> Yes, this was the problem. Now it works.
>>
>> Thank you.
>>
>> 2011/4/25 Victor Minden <victorminden at gmail.com>:
>> > Eugene,
>> > Based off of
>> >  Configure options --prefix=/home/kukushkinav
>> > --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl
>> > --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin --with-cuda=1
>> > --with-cusp=1 --with-thrust=1
>> > --with-thrust-dir=/home/kukushkinav/include
>> > --with-cusp-dir=/home/kukushkinav/include
>> > It looks like maybe you're not setting the configure flag
>> > --with-cuda-arch=XXX.  Without this, PETSc uses the nvcc default, which when
>> > I last looked was sm_10.  The problem with this is that sm_10 doesn't
>> > support double-precision.  Maybe this is your problem?  For example, I use
>> > --with-cuda-arch=sm_13, which corresponds to NVIDIA compute capability 1.3.
>> > Cheers,
>> > Victor
>> > ---
>> > Victor L. Minden
>> >
>> > Tufts University
>> > School of Engineering
>> > Class of 2012
>> >
>> >
>> > On Fri, Apr 22, 2011 at 12:22 PM, Евгений Козлов <neoveneficus at gmail.com>
>> > wrote:
>> >>
>> >> Hello,
>> >>
>> >> I am interested in using PETSc to iteratively solve sparse linear
>> >> systems on multi-GPU systems.
>> >>
>> >> First of all, I compiled PETSc-dev and tried to run some of the
>> >> examples that run on the GPU.
>> >>
>> >> I found src/snes/examples/tutorials/ex47cu.cu, then compiled and
>> >> ran it.
>> >>
>> >> Output of the original program src/snes/examples/tutorials/ex47cu.cu:
>> >>
>> >> [0]PETSC ERROR: --------------------- Error Message
>> >> ------------------------------------
>> >> [0]PETSC ERROR: Floating point exception!
>> >> [0]PETSC ERROR: User provided compute function generated a Not-a-Number!
>> >> [0]PETSC ERROR:
>> >> ------------------------------------------------------------------------
>> >> [0]PETSC ERROR: Petsc Development HG revision:
>> >> d3e10315d68b1dd5481adb2889c7d354880da362  HG Date: Wed Apr 20 21:03:56
>> >> 2011 -0500
>> >> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>> >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> >> [0]PETSC ERROR: See docs/index.html for manual pages.
>> >> [0]PETSC ERROR:
>> >> ------------------------------------------------------------------------
>> >> [0]PETSC ERROR: ex47cu on a arch-linu named cn03 by kukushkinav Fri
>> >> Apr 22 18:33:32 2011
>> >> [0]PETSC ERROR: Libraries linked from /home/kukushkinav/lib
>> >> [0]PETSC ERROR: Configure run at Thu Apr 21 19:18:22 2011
>> >> [0]PETSC ERROR: Configure options --prefix=/home/kukushkinav
>> >> --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl
>> >> --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin --with-cuda=1
>> >> --with-cusp=1 --with-thrust=1
>> >> --with-thrust-dir=/home/kukushkinav/include
>> >> --with-cusp-dir=/home/kukushkinav/include
>> >> [0]PETSC ERROR:
>> >> ------------------------------------------------------------------------
>> >> [0]PETSC ERROR: SNESSolve_LS() line 167 in src/snes/impls/ls/ls.c
>> >> [0]PETSC ERROR: SNESSolve() line 2407 in src/snes/interface/snes.c
>> >> [0]PETSC ERROR: main() line 38 in src/snes/examples/tutorials/ex47cu.cu
>> >> application called MPI_Abort(MPI_COMM_WORLD, 72) - process 0
>> >>
>> >> Then I tried to find the place in the source that causes the problem.
>> >> I changed the function in struct ApplyStencil to
>> >>
>> >> void operator()(Tuple t) { thrust::get<0>(t) = 1; }
>> >>
>> >> Result:
>> >> [0]PETSC ERROR: --------------------- Error Message
>> >> ------------------------------------
>> >> [0]PETSC ERROR: Floating point exception!
>> >> [0]PETSC ERROR: Infinite or not-a-number generated in mdot, entry 0!
>> >> [0]PETSC ERROR:
>> >> ------------------------------------------------------------------------
>> >> [0]PETSC ERROR: Petsc Development HG revision:
>> >> d3e10315d68b1dd5481adb2889c7d354880da362  HG Date: Wed Apr 20 21:03:56
>> >> 2011 -0500
>> >> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>> >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> >> [0]PETSC ERROR: See docs/index.html for manual pages.
>> >> [0]PETSC ERROR:
>> >> ------------------------------------------------------------------------
>> >> [0]PETSC ERROR: ex47cu on a arch-linu named cn11 by kukushkinav Fri
>> >> Apr 22 18:58:04 2011
>> >> [0]PETSC ERROR: Libraries linked from /home/kukushkinav/lib
>> >> [0]PETSC ERROR: Configure run at Thu Apr 21 19:18:22 2011
>> >> [0]PETSC ERROR: Configure options --prefix=/home/kukushkinav
>> >> --with-blas-lapack-dir=/opt/intel/composerxe-2011.0.084/mkl
>> >> --with-mpi-dir=/opt/intel/impi/4.0.1.007/intel64/bin --with-cuda=1
>> >> --with-cusp=1 --with-thrust=1
>> >> --with-thrust-dir=/home/kukushkinav/include
>> >> --with-cusp-dir=/home/kukushkinav/include
>> >> [0]PETSC ERROR:
>> >> ------------------------------------------------------------------------
>> >> [0]PETSC ERROR: VecMDot() line 1146 in src/vec/vec/interface/rvector.c
>> >> [0]PETSC ERROR: KSPGMRESClassicalGramSchmidtOrthogonalization() line
>> >> 66 in src/ksp/ksp/impls/gmres/borthog2.c
>> >> [0]PETSC ERROR: GMREScycle() line 161 in src/ksp/ksp/impls/gmres/gmres.c
>> >> [0]PETSC ERROR: KSPSolve_GMRES() line 244 in
>> >> src/ksp/ksp/impls/gmres/gmres.c
>> >> [0]PETSC ERROR: KSPSolve() line 426 in src/ksp/ksp/interface/itfunc.c
>> >> [0]PETSC ERROR: SNES_KSPSolve() line 3107 in src/snes/interface/snes.c
>> >> [0]PETSC ERROR: SNESSolve_LS() line 190 in src/snes/impls/ls/ls.c
>> >> [0]PETSC ERROR: SNESSolve() line 2407 in src/snes/interface/snes.c
>> >> [0]PETSC ERROR: main() line 38 in src/snes/examples/tutorials/ex47cu.cu
>> >>
>> >> RedHat 5.5, Cuda 3.2
>> >>
>> >> Question 1: Is this a problem on my side or a bug in the algorithm?
>> >>
>> >> Question 2: Where can I find simple documentation or an example that
>> >> describes how to solve sparse linear systems on multi-GPU systems
>> >> using PETSc?
>> >>
>> >>
>> >> --
>> >> Best regards,
>> >> Eugene
>> >
>> >
>>
>>
>>
>>
>



-- 
Best regards,
Eugene


