[petsc-dev] ASM for each field solve on GPUs

Mark Adams mfadams at lbl.gov
Thu Dec 31 15:25:08 CST 2020


Ha! This seems to work. It looks good as far as I can tell. I don't think
this is just a direct solver.
Thanks
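
A quick arithmetic check on the figures in the log that follows may help when reading it. This is only a sketch: the numbers are copied from the log, and the interpretation of the nonzero counts assumes the two index sets are disjoint and together cover all 896 rows (448 + 448), which the log suggests but does not state.

```python
import sys

# Values copied from the log in this message.
r0 = 1.557752448861e-11        # "0 KSP Residual norm"
r1 = 7.987154898471e-27        # "1 KSP Residual norm"
nnz_e, nnz_i1 = 10732, 10732   # nonzeros of the fieldsplit_e_ and fieldsplit_i1_ matrices
nnz_full = 21464               # nonzeros of the full 896 x 896 matrix

# Relative residual reduction from a single preconditioner application.
reduction = r1 / r0
eps = sys.float_info.epsilon

# Assumption: the splits are disjoint and cover every row, so any stored
# entry not counted in a split matrix would sit in an off-diagonal block.
offblock_nnz = nnz_full - (nnz_e + nnz_i1)

print(f"residual reduction in one application: {reduction:.3e}")
print(f"reduction relative to machine epsilon: {reduction / eps:.2f}")
print(f"stored nonzeros outside the two split matrices: {offblock_nnz}")
```

If the off-block count comes out as zero, one additive application with LU on each split would be expected to reduce the residual to round-off, which is worth keeping in mind when comparing against a monolithic direct solve.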

 ....
    3 SNES Function norm 1.557752448861e-11

*      0 KSP Residual norm 1.557752448861e-11*
*      1 KSP Residual norm 7.987154898471e-27*
KSP Object: 1 MPI processes
  type: preonly
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using NONE norm type for convergence test
PC Object: 1 MPI processes
  type: fieldsplit

*    FieldSplit with ADDITIVE composition: total splits = 2*
    Solver info for each split is in the following KSP objects:
  Split number 0 Defined by IS
  KSP Object: (fieldsplit_e_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (fieldsplit_e_) 1 MPI processes
    type: lu
      out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: nd
      factor fill ratio given 5., needed 1.30805
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=448, cols=448
            package used to perform factorization: petsc
            total: nonzeros=14038, allocated nonzeros=14038
              using I-node routines: found 175 nodes, limit used is 5
    linear system matrix = precond matrix:
    Mat Object: (fieldsplit_e_) 1 MPI processes
      type: seqaij
      rows=448, cols=448
      total: nonzeros=10732, allocated nonzeros=10732
      total number of mallocs used during MatSetValues calls=0
        using I-node routines: found 197 nodes, limit used is 5
  Split number 1 Defined by IS
  KSP Object: (fieldsplit_i1_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (fieldsplit_i1_) 1 MPI processes
    type: lu
      out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: nd
      factor fill ratio given 5., needed 1.30805
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=448, cols=448
            package used to perform factorization: petsc
            total: nonzeros=14038, allocated nonzeros=14038
              using I-node routines: found 175 nodes, limit used is 5
    linear system matrix = precond matrix:
    Mat Object: (fieldsplit_i1_) 1 MPI processes
      type: seqaij
      rows=448, cols=448
      total: nonzeros=10732, allocated nonzeros=10732
      total number of mallocs used during MatSetValues calls=0
        using I-node routines: found 197 nodes, limit used is 5
  linear system matrix = precond matrix:
  Mat Object: 1 MPI processes
    type: seqaij
    rows=896, cols=896
    total: nonzeros=21464, allocated nonzeros=42928
    total number of mallocs used during MatSetValues calls=0
      using I-node routines: found 398 nodes, limit used is 5
    4 SNES Function norm 5.154807006139e-14
  Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 4
      TSAdapt basic arkimex 0:1bee step   0 accepted t=0          + 1.000e-02 dt=1.100e-02  wlte=0.00161  wltea=   -1 wlter=   -1
1 TS dt 0.011 time 0.01
  1) species-0: charge density= -1.6022862392985e+01 z-momentum=  1.9463853500826e-19 energy=  9.6063874253791e+04
  1) species-1: charge density=  1.6029890912678e+01 z-momentum= -1.6132625562313e-18 energy=  9.6333617535820e+04
 1) Total: charge density=  7.0285196929696e-03, momentum= -1.4186240212230e-18, energy=  1.9239749178961e+05 (m_i[0]/m_e = 1835.47, 50 cells)
testSpitzer    1) time= 1.000e-02 n_e=  1.000e+00 E=  0.000e+00 J=  5.338e-08 J_re= -0.000e+00 -0 % Te_kev=  3.995e+00 Z_eff=1. E/J to eta ratio=0. (diff=-0.) constant E
[0] parallel consistency check OK
#PETSc Option Table entries:
-dm_landau_amr_levels_max 8
-dm_landau_amr_post_refine 0
-dm_landau_device_type cpu
-dm_landau_domain_radius 5
-dm_landau_Ez 0
-dm_landau_ion_charges 1
-dm_landau_ion_masses 1
-dm_landau_n 1,1
-dm_landau_thermal_temps 4,4
-dm_landau_type p4est
-dm_preallocate_only
-ex2_connor_e_field_units
-ex2_impurity_source_type pulse
-ex2_plot_dt 1
-ex2_pulse_rate 2e+0
-ex2_pulse_start_time 32
-ex2_pulse_width_time 15
-ex2_t_cold 1
-ex2_test_type spitzer

*-fieldsplit_e_pc_type lu*
*-fieldsplit_i1_pc_type lu*
-info :dm
-ksp_monitor
-ksp_type preonly
-ksp_view
-options_left

*-pc_fieldsplit_type additive*
*-pc_type fieldsplit*
-petscspace_degree 3
-snes_converged_reason
-snes_max_it 15
-snes_monitor
-snes_rtol 1.e-14
-snes_stol 1.e-14
-ts_adapt_clip .25,1.1
-ts_adapt_dt_max 1.
-ts_adapt_dt_min 1e-4
-ts_adapt_monitor
-ts_adapt_scale_solve_failed 0.75
-ts_adapt_time_step_increase_delay 5
-ts_arkimex_type 1bee
-ts_dt .1e-1
-ts_exact_final_time stepover
-ts_max_snes_failures -1
-ts_max_steps 1
-ts_max_time 100
-ts_monitor
-ts_rtol 1e-3
-ts_type arkimex
#End of PETSc Option Table entries
There are no unused options.
(base) 16:19 master= ~/Codes/petsc/src/ts/utils/dmplexlandau/tutorials$
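
For readers unfamiliar with the configuration in the log (preonly outer KSP, fieldsplit with ADDITIVE composition, LU on each split), here is a minimal pure-Python sketch of what a single additive application computes. It is not PETSc code: the 4 x 4 matrices, the right-hand side, and all function names are made up for illustration. It contrasts a matrix with no coupling between the two splits against one with a small coupling added.

```python
# Sketch of one additive fieldsplit application with exact block solves.
# All data below are illustrative, not taken from the Landau operator.

def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def norm(v):
    return sum(vi * vi for vi in v) ** 0.5

def additive_fieldsplit(a, r, splits):
    """Return z = sum over splits of (diagonal block)^-1 applied to r on that split."""
    z = [0.0] * len(r)
    for idx in splits:
        a_kk = [[a[i][j] for j in idx] for i in idx]
        r_k = [r[i] for i in idx]
        for i, zi in zip(idx, solve_dense(a_kk, r_k)):
            z[i] += zi
    return z

def residual_after_one_apply(a, b, splits):
    # One application from a zero initial guess, as with a preonly outer solve.
    x = additive_fieldsplit(a, b, splits)
    return norm([bi - yi for bi, yi in zip(b, matvec(a, x))])

splits = [[0, 1], [2, 3]]          # two index sets, standing in for two fields
b = [1.0, 2.0, 3.0, 4.0]

uncoupled = [[4.0, 1.0, 0.0, 0.0],
             [1.0, 3.0, 0.0, 0.0],
             [0.0, 0.0, 5.0, 2.0],
             [0.0, 0.0, 2.0, 6.0]]

coupled = [row[:] for row in uncoupled]
coupled[0][2] = 0.5                # small coupling between the two splits
coupled[2][0] = 0.5

res_uncoupled = residual_after_one_apply(uncoupled, b, splits)
res_coupled = residual_after_one_apply(coupled, b, splits)
print("residual, no coupling between splits:", res_uncoupled)
print("residual, with coupling:             ", res_coupled)
```

In this toy setting the single application is exact only when the splits are uncoupled; with coupling it leaves a residual that an outer Krylov method would normally be asked to reduce.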