[petsc-dev] ASM for each field solve on GPUs
Mark Adams
mfadams at lbl.gov
Thu Dec 31 15:26:44 CST 2020
Oh, and I use:
-fieldsplit_e_pc_type lu
-fieldsplit_i1_pc_type lu
The unprefixed form
-fieldsplit_pc_type lu
does not seem to work.
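
On the question raised in the quoted message below, whether additive fieldsplit with LU on each split amounts to a direct solve: one application of that preconditioner is block Jacobi with exact block solves, so it reproduces the full solution only when nothing couples the splits. In the quoted -ksp_view output the two split matrices hold 10732 nonzeros each and the full matrix holds 21464, which would be consistent with no stored coupling between the splits in this run. The sketch below is plain Python on a made-up 4x4 system, not PETSc code; build, solve_dense, additive_fieldsplit_apply, and residual_norm are hypothetical names chosen for illustration.

```python
# Sketch: additive fieldsplit with an exact solve per split is block Jacobi.
# Pure Python on tiny dense matrices; nothing here is PETSc API.

def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def additive_fieldsplit_apply(A, b, splits):
    """One preconditioner application: solve each diagonal block on its own."""
    x = [0.0] * len(b)
    for idx in splits:
        A_ii = [[A[r][c] for c in idx] for r in idx]  # diagonal block
        b_i = [b[r] for r in idx]
        for r, v in zip(idx, solve_dense(A_ii, b_i)):
            x[r] = v
    return x

def residual_norm(A, x, b):
    """Max-norm of b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))

def build(coupling):
    """Two 2x2 diagonal blocks; 'coupling' fills the off-diagonal blocks."""
    return [
        [4.0, 1.0, coupling, 0.0],
        [1.0, 3.0, 0.0, coupling],
        [coupling, 0.0, 5.0, 2.0],
        [0.0, coupling, 2.0, 6.0],
    ]

splits = [[0, 1], [2, 3]]  # stand-ins for the "e" and "i1" index sets
b = [1.0, 2.0, 3.0, 4.0]

A0 = build(0.0)
A1 = build(0.5)
uncoupled = residual_norm(A0, additive_fieldsplit_apply(A0, b, splits), b)
coupled = residual_norm(A1, additive_fieldsplit_apply(A1, b, splits), b)
print("residual, no coupling:  ", uncoupled)
print("residual, with coupling:", coupled)
```

With coupling 0.0 the residual after one application is at roundoff level; with coupling 0.5 it is not, and an outer Krylov iteration rather than preonly would then be needed to converge.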
On Thu, Dec 31, 2020 at 4:25 PM Mark Adams <mfadams at lbl.gov> wrote:
> Ha! This seems to work. It looks good as far as I can tell. I don't think
> this is just a direct solver.
> Thanks
>
> ....
> 3 SNES Function norm 1.557752448861e-11
>   0 KSP Residual norm 1.557752448861e-11
>   1 KSP Residual norm 7.987154898471e-27
> KSP Object: 1 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: 1 MPI processes
> type: fieldsplit
> FieldSplit with ADDITIVE composition: total splits = 2
> Solver info for each split is in the following KSP objects:
> Split number 0 Defined by IS
> KSP Object: (fieldsplit_e_) 1 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (fieldsplit_e_) 1 MPI processes
> type: lu
> out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> matrix ordering: nd
> factor fill ratio given 5., needed 1.30805
> Factored matrix follows:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=448, cols=448
> package used to perform factorization: petsc
> total: nonzeros=14038, allocated nonzeros=14038
> using I-node routines: found 175 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: (fieldsplit_e_) 1 MPI processes
> type: seqaij
> rows=448, cols=448
> total: nonzeros=10732, allocated nonzeros=10732
> total number of mallocs used during MatSetValues calls=0
> using I-node routines: found 197 nodes, limit used is 5
> Split number 1 Defined by IS
> KSP Object: (fieldsplit_i1_) 1 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (fieldsplit_i1_) 1 MPI processes
> type: lu
> out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> matrix ordering: nd
> factor fill ratio given 5., needed 1.30805
> Factored matrix follows:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=448, cols=448
> package used to perform factorization: petsc
> total: nonzeros=14038, allocated nonzeros=14038
> using I-node routines: found 175 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: (fieldsplit_i1_) 1 MPI processes
> type: seqaij
> rows=448, cols=448
> total: nonzeros=10732, allocated nonzeros=10732
> total number of mallocs used during MatSetValues calls=0
> using I-node routines: found 197 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=896, cols=896
> total: nonzeros=21464, allocated nonzeros=42928
> total number of mallocs used during MatSetValues calls=0
> using I-node routines: found 398 nodes, limit used is 5
> 4 SNES Function norm 5.154807006139e-14
> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 4
> TSAdapt basic arkimex 0:1bee step 0 accepted t=0 + 1.000e-02 dt=1.100e-02 wlte=0.00161 wltea= -1 wlter= -1
> 1 TS dt 0.011 time 0.01
> 1) species-0: charge density= -1.6022862392985e+01 z-momentum= 1.9463853500826e-19 energy= 9.6063874253791e+04
> 1) species-1: charge density= 1.6029890912678e+01 z-momentum= -1.6132625562313e-18 energy= 9.6333617535820e+04
> 1) Total: charge density= 7.0285196929696e-03, momentum= -1.4186240212230e-18, energy= 1.9239749178961e+05 (m_i[0]/m_e = 1835.47, 50 cells)
> testSpitzer 1) time= 1.000e-02 n_e= 1.000e+00 E= 0.000e+00 J= 5.338e-08 J_re= -0.000e+00 -0 % Te_kev= 3.995e+00 Z_eff=1. E/J to eta ratio=0. (diff=-0.) constant E
> [0] parallel consistency check OK
> #PETSc Option Table entries:
> -dm_landau_amr_levels_max 8
> -dm_landau_amr_post_refine 0
> -dm_landau_device_type cpu
> -dm_landau_domain_radius 5
> -dm_landau_Ez 0
> -dm_landau_ion_charges 1
> -dm_landau_ion_masses 1
> -dm_landau_n 1,1
> -dm_landau_thermal_temps 4,4
> -dm_landau_type p4est
> -dm_preallocate_only
> -ex2_connor_e_field_units
> -ex2_impurity_source_type pulse
> -ex2_plot_dt 1
> -ex2_pulse_rate 2e+0
> -ex2_pulse_start_time 32
> -ex2_pulse_width_time 15
> -ex2_t_cold 1
> -ex2_test_type spitzer
> -fieldsplit_e_pc_type lu
> -fieldsplit_i1_pc_type lu
> -info :dm
> -ksp_monitor
> -ksp_type preonly
> -ksp_view
> -options_left
> -pc_fieldsplit_type additive
> -pc_type fieldsplit
> -petscspace_degree 3
> -snes_converged_reason
> -snes_max_it 15
> -snes_monitor
> -snes_rtol 1.e-14
> -snes_stol 1.e-14
> -ts_adapt_clip .25,1.1
> -ts_adapt_dt_max 1.
> -ts_adapt_dt_min 1e-4
> -ts_adapt_monitor
> -ts_adapt_scale_solve_failed 0.75
> -ts_adapt_time_step_increase_delay 5
> -ts_arkimex_type 1bee
> -ts_dt .1e-1
> -ts_exact_final_time stepover
> -ts_max_snes_failures -1
> -ts_max_steps 1
> -ts_max_time 100
> -ts_monitor
> -ts_rtol 1e-3
> -ts_type arkimex
> #End of PETSc Option Table entries
> There are no unused options.
> (base) 16:19 master= ~/Codes/petsc/src/ts/utils/dmplexlandau/tutorials$
>
>