[petsc-users] ex52_integrateElement.cu

Matthew Knepley knepley at gmail.com
Wed Mar 28 13:23:37 CDT 2012


On Wed, Mar 28, 2012 at 1:14 PM, David Fuentes <fuentesdt at gmail.com> wrote:

> thanks! its running, but I seem to be getting different answer for cpu/gpu
> ?
> i had some floating point problems on this Tesla M2070 gpu before, but
> adding the '-arch=sm_20' option seemed to fix it last time.
>
>
> is the assembly in single precision ? my 'const PetscReal jacobianInverse'
> being passed in are doubles
>

Yep, that is the problem. I have not tested anything in double. I have not
decided exactly how to handle it. Can you
make another ARCH --with-precision=single and make sure it works, and then
we can fix the double issue?

  Thanks,

     Matt


> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch -gpu
> GPU layout grid(1,2,1) block(3,1,1) with 1 batches
>  N_t: 3, N_cb: 1
> Residual:
> Vector Object: 1 MPI processes
>   type: seq
> 0
> 755712
> 0
> -58720
> -2953.13
> 0.375
> 1.50323e+07
> 0.875
> 0
> SCRGP2$
> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch
> Residual:
> Vector Object: 1 MPI processes
>   type: seq
> -0.25
> -0.5
> 0.25
> -0.5
> -1
> 0.5
> 0.25
> 0.5
> 0.75
>
>
>
>
>
> On Wed, Mar 28, 2012 at 11:55 AM, Matthew Knepley <knepley at gmail.com>wrote:
>
>> On Wed, Mar 28, 2012 at 11:45 AM, David Fuentes <fuentesdt at gmail.com>wrote:
>>
>>> The example seems to be running on cpu with '-batch' but i'm getting
>>> errors in line 323 with the '-gpu' option
>>>
>>> [0]PETSC ERROR: IntegrateElementBatchGPU() line 323 in
>>> src/snes/examples/tutorials/ex52_integrateElement.cu
>>>
>>> should this possibly be PetscScalar ?
>>>
>>
>> No.
>>
>>
>>> -  ierr = cudaMalloc((void**) &d_coefficients,         Ne*N_bt *
>>> sizeof(float));CHKERRQ(ierr);
>>> +  ierr = cudaMalloc((void**) &d_coefficients,         Ne*N_bt *
>>> sizeof(PetscScalar));CHKERRQ(ierr);
>>>
>>>
>>> SCRGP2$ python
>>> $PETSC_DIR/bin/pythonscripts/PetscGenerateFEMQuadrature.py 2 1 1 1
>>> laplacian ex52.h
>>> ['/opt/apps/PETSC/petsc-dev/bin/pythonscripts/PetscGenerateFEMQuadrature.py',
>>> '2', '1', '1', '1', 'laplacian', 'ex52.h']
>>> 2 1 1 1 laplacian
>>> [{(-1.0, -1.0): [(1.0, ())]}, {(1.0, -1.0): [(1.0, ())]}, {(-1.0, 1.0):
>>> [(1.0, ())]}]
>>> {0: {0: [0], 1: [1], 2: [2]}, 1: {0: [], 1: [], 2: []}, 2: {0: []}}
>>> Perm: [0, 1, 2]
>>> Creating /home/fuentes/snestutorial/ex52.h
>>> Creating /home/fuentes/snestutorial/ex52_gpu.h
>>> [{(-1.0, -1.0): [(1.0, ())]}, {(1.0, -1.0): [(1.0, ())]}, {(-1.0, 1.0):
>>> [(1.0, ())]}]
>>> {0: {0: [0], 1: [1], 2: [2]}, 1: {0: [], 1: [], 2: []}, 2: {0: []}}
>>> Perm: [0, 1, 2]
>>> Creating /home/fuentes/snestutorial/ex52_gpu_inline.h
>>>
>>> SCRGP2$ make ex52
>>> /usr/bin/mpicxx -o ex52.o -c -O0 -g -fPIC
>>> -I/opt/apps/PETSC/petsc-dev/include
>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/include
>>> -I/opt/apps/cuda/4.1/cuda/include -I/opt/apps/PETSC/petsc-dev/include/sieve
>>> -I/opt/MATLAB/R2011a/extern/include -I/usr/include
>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/cbind/include
>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/forbind/include
>>> -I/usr/include/mpich2 -D__INSDIR__=src/snes/examples/tutorials/ ex52.c
>>> nvcc -O0 -g -arch=sm_20  -c --compiler-options="-O0 -g   -fPIC
>>>  -I/opt/apps/PETSC/petsc-dev/include
>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/include
>>> -I/opt/apps/cuda/4.1/cuda/include -I/opt/apps/PETSC/petsc-dev/include/sieve
>>> -I/opt/MATLAB/R2011a/extern/include -I/usr/include
>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/cbind/include
>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/forbind/include
>>> -I/usr/include/mpich2    -D__INSDIR__=src/snes/examples/tutorials/"
>>>  ex52_integrateElement.cu
>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(13): warning: variable "weights_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(28): warning: variable "BasisDerivatives_0" was
>>> declared but never referenced
>>>
>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(13): warning: variable "weights_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but
>>> never referenced
>>>
>>> ex52_gpu_inline.h(28): warning: variable "BasisDerivatives_0" was
>>> declared but never referenced
>>>
>>> /usr/bin/mpicxx -O0 -g   -o ex52 ex52.o ex52_integrateElement.o
>>>  -Wl,-rpath,/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/lib
>>> -L/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/lib  -lpetsc
>>> -Wl,-rpath,/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/lib
>>> -ltriangle -lX11 -lpthread -lsuperlu_dist_3.0 -lcmumps -ldmumps -lsmumps
>>> -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lscalapack -lblacs
>>> -Wl,-rpath,/opt/apps/cuda/4.1/cuda/lib64 -L/opt/apps/cuda/4.1/cuda/lib64
>>> -lcufft -lcublas -lcudart -lcusparse
>>> -Wl,-rpath,/opt/MATLAB/R2011a/sys/os/glnxa64:/opt/MATLAB/R2011a/bin/glnxa64:/opt/MATLAB/R2011a/extern/lib/glnxa64
>>> -L/opt/MATLAB/R2011a/bin/glnxa64 -L/opt/MATLAB/R2011a/extern/lib/glnxa64
>>> -leng -lmex -lmx -lmat -lut -licudata -licui18n -licuuc
>>> -Wl,-rpath,/opt/epd-7.1-2-rh5-x86_64/lib -L/opt/epd-7.1-2-rh5-x86_64/lib
>>> -lmkl_rt -lmkl_intel_thread -lmkl_core -liomp5 -lexoIIv2for -lexodus
>>> -lnetcdf_c++ -lnetcdf -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.4.3
>>> -L/usr/lib/gcc/x86_64-linux-gnu/4.4.3 -ldl -lmpich -lopa -lpthread -lrt
>>> -lgcc_s -lmpichf90 -lgfortran -lm -lm -lmpichcxx -lstdc++ -lmpichcxx
>>> -lstdc++ -ldl -lmpich -lopa -lpthread -lrt -lgcc_s -ldl
>>> /bin/rm -f ex52.o ex52_integrateElement.o
>>>
>>>
>>> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch
>>> Residual:
>>> Vector Object: 1 MPI processes
>>>   type: seq
>>> -0.25
>>> -0.5
>>> 0.25
>>> -0.5
>>> -1
>>> 0.5
>>> 0.25
>>> 0.5
>>> 0.75
>>> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch -gpu
>>> [0]PETSC ERROR: IntegrateElementBatchGPU() line 323 in
>>> src/snes/examples/tutorials/ex52_integrateElement.cu
>>> [0]PETSC ERROR: FormFunctionLocalBatch() line 679 in
>>> src/snes/examples/tutorials/ex52.c
>>> [0]PETSC ERROR: SNESDMComplexComputeFunction() line 431 in
>>> src/snes/utils/damgsnes.c
>>> [0]PETSC ERROR: main() line 1021 in src/snes/examples/tutorials/ex52.c
>>> application called MPI_Abort(MPI_COMM_WORLD, 35) - process 0
>>>
>>
>> This is failing on cudaMalloc(), which means your card is not available
>> for running. Are you trying to run on your laptop?
>> If so, applications like Preview can lock up the GPU. I know of no way to
>> test this in CUDA while running. I just close
>> apps until it runs.
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>>
>>> On Tue, Mar 27, 2012 at 8:37 PM, Matthew Knepley <knepley at gmail.com>wrote:
>>>
>>>> On Tue, Mar 27, 2012 at 2:10 PM, Blaise Bourdin <bourdin at lsu.edu>wrote:
>>>>
>>>>>
>>>>> On Mar 27, 2012, at 1:23 PM, Matthew Knepley wrote:
>>>>>
>>>>> On Tue, Mar 27, 2012 at 12:58 PM, David Fuentes <fuentesdt at gmail.com>wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I had a question about the status of example 52.
>>>>>>
>>>>>>
>>>>>> http://petsc.cs.iit.edu/petsc/petsc-dev/file/a8e2f2c19319/src/snes/examples/tutorials/ex52.c
>>>>>>
>>>>>> http://petsc.cs.iit.edu/petsc/petsc-dev/file/a8e2f2c19319/src/snes/examples/tutorials/ex52_integrateElement.cu
>>>>>>
>>>>>>
>>>>>> Can this example be used with a DM object created from an
>>>>>> unstructured exodusII mesh, DMMeshCreateExodus, And the FEM assembly done
>>>>>> on GPU ?
>>>>>>
>>>>>
>>>>> 1) I have pushed many more tests for it now. They can be run using the
>>>>> Python build system
>>>>>
>>>>>   ./config/builder2.py check src/snes/examples/tutorials/ex52.c
>>>>>
>>>>>   in fact, you can build any set of files this way.
>>>>>
>>>>> 2) The Exodus creation has to be converted to DMComplex from DMMesh.
>>>>> That should not take me very long. Blaise maintains that
>>>>>      so maybe there will be help :) You will just replace
>>>>> DMComplexCreateBoxMesh() with DMComplexCreateExodus(). If you request
>>>>>      it, I will bump it up the list.
>>>>>
>>>>>
>>>>> DMMeshCreateExodusNG is much more flexible than DMMeshCreateExodus in
>>>>> that it can read meshes with multiple element types and should have a much
>>>>> lower memory footprint. The code should be fairly easy to read. you can
>>>>> email me directly if you have specific questions. I had looked at creating
>>>>> a DMComplex and it did not look too difficult, as long as interpolation is
>>>>> not needed. I have plans to write DMComplexCreateExodus, but haven't had
>>>>> time too so far. Updating the Vec viewers and readers may be a bit more
>>>>> involved. In perfect world, one would write an EXODUS viewer following the
>>>>> lines of the VTK and HDF5 ones.
>>>>>
>>>>
>>>> David and Blaise,  I have converted this function, now
>>>> DMComplexCreateExodus(). Its not tested, but I think
>>>> Blaise has some stuff we can use to test it.
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
>>>>
>>>>
>>>>> Blaise
>>>>>
>>>>>
>>>>>
>>>>> Let me know if you can run the tests.
>>>>>
>>>>>   Thanks
>>>>>
>>>>>      Matt
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>>
>>>>>  --
>>>>> Department of Mathematics and Center for Computation & Technology
>>>>> Louisiana State University, Baton Rouge, LA 70803, USA
>>>>> Tel. +1 (225) 578 1612, Fax  +1 (225) 578 4276
>>>>> http://www.math.lsu.edu/~bourdin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120328/a1ddcbe3/attachment-0001.htm>


More information about the petsc-users mailing list