[petsc-users] Error using GPU in Fortran code

Ramoni Z. Sedano Azevedo ramoni.zsedano at gmail.com
Thu Aug 31 12:21:03 CDT 2023


Thank you all for the answers.

I've just joined a group whose code has been running on the CPU for some
time, and we have started trying to run it on the GPU to see whether it
gives a speedup.
I'm going to discuss the points you've raised with the group here.

Thank you very much!


On Thu, Aug 31, 2023 at 00:02, Barry Smith <bsmith at petsc.dev>
wrote:

>
>   Yikes, sorry I missed that the first run was CPU and the second GPU.
>
>    The run on the CPU is indicative of a very bad preconditioner. It
> doesn't really converge. When the true residual norm jumps by a factor of
> 10^3 at the first iteration, this means the ILU preconditioner is just not
> appropriate or reasonable. The "convergence" of the preconditioned residual
> norm is meaningless.
>
> >   0 KSP preconditioned resid norm 1.236208833927e-08 true resid norm 1.413045088306e-03 ||r(i)||/||b|| 3.745397377137e+01
> >   1 KSP preconditioned resid norm 1.664973208594e-10 true resid norm 3.463939828700e+00 ||r(i)||/||b|| 9.181470043910e+04
> >   2 KSP preconditioned resid norm 8.366983092820e-14 true resid norm 9.171051852915e-02 ||r(i)||/||b|| 2.430866066466e+03
> >   3 KSP preconditioned resid norm 1.386354386207e-14 true resid norm 1.905770367881e-02 ||r(i)||/||b|| 5.051408052270e+02
> >   4 KSP preconditioned resid norm 4.635883581096e-15 true resid norm 7.285180695640e-03 ||r(i)||/||b|| 1.930999717931e+02
> >   5 KSP preconditioned resid norm 1.974093227402e-15 true resid norm 2.953370060898e-03 ||r(i)||/||b|| 7.828161020018e+01
> >   6 KSP preconditioned resid norm 1.182781787023e-15 true resid norm 2.288756945462e-03 ||r(i)||/||b|| 6.066546871987e+01
> >   7 KSP preconditioned resid norm 6.221244366707e-16 true resid norm 1.263339414861e-03 ||r(i)||/||b|| 3.348589631014e+01
>
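
As a quick check of the numbers in the quoted log, here is a small shell
sketch (it uses awk for the floating-point division; the variable names are
only illustrative). It recomputes the growth of the true residual between
iterations 0 and 1, and the ||r(i)||/||b|| column at iteration 0 from the
bnorm value printed by the code:

```shell
# Values copied from the CPU log in this thread.
r0=1.413045088306e-03        # true resid norm at iteration 0
r1=3.463939828700e+00        # true resid norm at iteration 1
bnorm=3.7727507818834821e-05 # bnorm printed before the solve

# Factor by which the true residual grows in the first iteration.
jump=$(awk -v a="$r1" -v b="$r0" 'BEGIN { printf "%.3e", a / b }')
# Relative true residual at iteration 0, i.e. ||r(0)||/||b||.
rel0=$(awk -v a="$r0" -v b="$bnorm" 'BEGIN { printf "%.4e", a / b }')

echo "true residual growth, iteration 0 -> 1: $jump"
echo "||r(0)||/||b||: $rel0"
```

The growth comes out at roughly 2.45e+03, which matches the "factor of 10^3"
remark, while the preconditioned norm in the same step drops from about
1.24e-08 to 1.66e-10.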
>
>    I wouldn't worry about the GPU behavior; it is just due to slightly
> different numerical computations on the GPU and is not surprising.
>
>    You need to use a different preconditioner, even on the CPU.
>
>
> On Aug 30, 2023, at 9:51 PM, Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>
>
>
> On Wed, Aug 30, 2023 at 8:46 PM Barry Smith <bsmith at petsc.dev> wrote:
>
>>
>>    What convergence do you get without the GPU matrix and vector
>> operations?
>
> Barry, that was in the original email
>
>>
>>
>>    Can you try the GPU run with -ksp_type gmres -ksp_pc_side right ?
>>
>>    For certain problems, ILU can produce catastrophically bad
>> preconditioners.
>>    Barry
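
For this code, the solver options in the original message carry the em_
prefix, so the suggestion above would presumably be passed as
-em_ksp_type gmres -em_ksp_pc_side right. The sketch below is a dry run that
only prints the resulting GPU command line. The em_ prefix on the two changed
options is an assumption based on the other options, the default values for
ntasks and executable are placeholders, and everything else is copied from
the original GPU run:

```shell
# Dry run: build the option list and print the command instead of running it.
ntasks=${ntasks:-4}               # placeholder, not from the thread
executable=${executable:-solver}  # placeholder, not from the thread

# Original GPU options, with bcgs replaced by right-preconditioned GMRES.
set -- \
  -A_mat_type aijcusparse \
  -P_mat_type aijcusparse \
  -vec_type cuda \
  -use_gpu_aware_mpi 0 \
  -em_ksp_monitor_true_residual \
  -em_ksp_type gmres \
  -em_ksp_pc_side right \
  -em_pc_type bjacobi \
  -em_sub_pc_type ilu \
  -em_sub_pc_factor_levels 3 \
  -em_sub_pc_factor_fill 6

cmd="mpirun -np $ntasks ./$executable $*"
echo "$cmd < ./Parameters.inp"
```

With right preconditioning, the norm GMRES monitors is the true residual
norm, so the monitor output no longer depends on how well the
preconditioner is scaled.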
>>
>>
>>
>> > On Aug 30, 2023, at 4:41 PM, Ramoni Z. Sedano Azevedo <ramoni.zsedano at gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I'm running a Fortran code that uses PETSc with MPI on the CPU, and I would like to run it on the GPU.
>> > PETSc is configured as follows:
>> > ./configure \
>> >  --prefix=${PWD}/installdir \
>> >  --with-fortran \
>> >  --with-fortran-kernels=true \
>> >  --with-cuda \
>> >  --download-fblaslapack \
>> >  --with-scalar-type=complex \
>> >  --with-precision=double \
>> >  --with-debugging=0 \
>> >  --with-x=0 \
>> >  --with-gnu-compilers=1 \
>> >  --with-cc=mpicc \
>> >  --with-cxx=mpicxx \
>> >  --with-fc=mpif90 \
>> >  --with-make-exec=make
>> >
>> > The parameters for using MPI on CPU are:
>> > mpirun -np $ntasks ./${executable} \
>> >  -A_mat_type mpiaij \
>> >  -P_mat_type mpiaij \
>> >  -em_ksp_monitor_true_residual \
>> >  -em_ksp_type bcgs \
>> >  -em_pc_type bjacobi \
>> >  -em_sub_pc_type ilu \
>> >  -em_sub_pc_factor_levels 3 \
>> >  -em_sub_pc_factor_fill 6 \
>> >  < ./Parameters.inp
>> >
>> > Code output:
>> > Solving for Hz fields
>> >  bnorm   3.7727507818834821E-005
>> >  xnorm   2.3407405211699372E-016
>> >   Residual norms for em_ solve.
>> >   0 KSP preconditioned resid norm 1.236208833927e-08 true resid norm 1.413045088306e-03 ||r(i)||/||b|| 3.745397377137e+01
>> >   1 KSP preconditioned resid norm 1.664973208594e-10 true resid norm 3.463939828700e+00 ||r(i)||/||b|| 9.181470043910e+04
>> >   2 KSP preconditioned resid norm 8.366983092820e-14 true resid norm 9.171051852915e-02 ||r(i)||/||b|| 2.430866066466e+03
>> >   3 KSP preconditioned resid norm 1.386354386207e-14 true resid norm 1.905770367881e-02 ||r(i)||/||b|| 5.051408052270e+02
>> >   4 KSP preconditioned resid norm 4.635883581096e-15 true resid norm 7.285180695640e-03 ||r(i)||/||b|| 1.930999717931e+02
>> >   5 KSP preconditioned resid norm 1.974093227402e-15 true resid norm 2.953370060898e-03 ||r(i)||/||b|| 7.828161020018e+01
>> >   6 KSP preconditioned resid norm 1.182781787023e-15 true resid norm 2.288756945462e-03 ||r(i)||/||b|| 6.066546871987e+01
>> >   7 KSP preconditioned resid norm 6.221244366707e-16 true resid norm 1.263339414861e-03 ||r(i)||/||b|| 3.348589631014e+01
>> >   8 KSP preconditioned resid norm 3.800488678870e-16 true resid norm 9.015738978063e-04 ||r(i)||/||b|| 2.389699054959e+01
>> >   9 KSP preconditioned resid norm 2.498733213989e-16 true resid norm 7.194509577987e-04 ||r(i)||/||b|| 1.906966559396e+01
>> >  10 KSP preconditioned resid norm 1.563017112250e-16 true resid norm 5.055208317846e-04 ||r(i)||/||b|| 1.339926385310e+01
>> >  11 KSP preconditioned resid norm 8.733803057628e-17 true resid norm 3.171941303660e-04 ||r(i)||/||b|| 8.407502872682e+00
>> >  12 KSP preconditioned resid norm 4.907010803529e-17 true resid norm 1.868311755294e-04 ||r(i)||/||b|| 4.952120782177e+00
>> >  13 KSP preconditioned resid norm 2.214070343700e-17 true resid norm 8.760421740830e-05 ||r(i)||/||b|| 2.322025028236e+00
>> >  14 KSP preconditioned resid norm 1.333171674446e-17 true resid norm 5.984548368534e-05 ||r(i)||/||b|| 1.586255948119e+00
>> >  15 KSP preconditioned resid norm 7.696778066646e-18 true resid norm 3.786809196913e-05 ||r(i)||/||b|| 1.003726303656e+00
>> >  16 KSP preconditioned resid norm 3.863008301366e-18 true resid norm 1.284864871601e-05 ||r(i)||/||b|| 3.405644702988e-01
>> >  17 KSP preconditioned resid norm 2.061402843494e-18 true resid norm 1.054741071688e-05 ||r(i)||/||b|| 2.795681805311e-01
>> >  18 KSP preconditioned resid norm 1.062033155108e-18 true resid norm 3.992776343462e-06 ||r(i)||/||b|| 1.058319664960e-01
>> >  converged reason            2
>> >  total number of relaxations           18
>> >  ========================================
>> >
>> > The parameters for GPU usage are:
>> > mpirun -np $ntasks ./${executable} \
>> >  -A_mat_type aijcusparse \
>> >  -P_mat_type aijcusparse \
>> >  -vec_type cuda \
>> >  -use_gpu_aware_mpi 0 \
>> >  -em_ksp_monitor_true_residual \
>> >  -em_ksp_type bcgs \
>> >  -em_pc_type bjacobi \
>> >  -em_sub_pc_type ilu \
>> >  -em_sub_pc_factor_levels 3 \
>> >  -em_sub_pc_factor_fill 6 \
>> >  < ./Parameters.inp
>> >
>> > Code output:
>> > Solving for Hz fields
>> >  bnorm   3.7727507818834821E-005
>> >  xnorm   2.3407405211699372E-016
>> >   Residual norms for em_ solve.
>> >   0 KSP preconditioned resid norm 1.236220954395e-08 true resid norm 3.772750781883e-05 ||r(i)||/||b|| 1.000000000000e+00
>> >   1 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 3.772750781883e-05 ||r(i)||/||b|| 1.000000000000e+00
>> >  converged reason            3
>> >  total number of relaxations            1
>> >  ========================================
>> >
>> > Clearly the code running on GPU is not converging correctly.
>> > Has anyone experienced this problem?
>> >
>> > Sincerely,
>> > Ramoni Z. S. Azevedo
>> >
>>
>>
>