[petsc-dev] [petsc-users] Poor weak scaling when solving successive linear systems

Junchao Zhang jczhang at mcs.anl.gov
Tue Jun 26 15:25:36 CDT 2018


Mark,
  I re-ran the -pc_type hypre experiment without OpenMP. Now the job
finishes instead of running out of time. I have results with 216 processors
(see below). The 1728-processor job is still in the queue, so I don't know
yet how it scales. For the 216-processor run, the execution time is 245
seconds; with -pc_type gamg it is 107 seconds.  My options are

-ksp_norm_type unpreconditioned
-ksp_rtol 1E-6
-ksp_type cg
-log_view
-mesh_size 1E-4
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-nodes_per_proc 30
-pc_type hypre


It is a 7-point stencil code. Do you know of other hypre options I could
try to improve it?  Thanks.
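
So far I am aware of the usual BoomerAMG knobs below (the values are just
common starting points for a 3D problem, not ones I have tested; hypre's
own documentation suggests a strong threshold around 0.5 for 3D):

-pc_hypre_boomeramg_strong_threshold 0.5
-pc_hypre_boomeramg_coarsen_type HMIS
-pc_hypre_boomeramg_interp_type ext+i
-pc_hypre_boomeramg_agg_nl 1
-pc_hypre_boomeramg_P_max 4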

--- Event Stage 2: Remaining Solves

KSPSolve            1000 1.0 2.4574e+02 1.0 4.48e+09 1.0 7.6e+06 7.2e+03 2.0e+04 97100100100100 100100100100100  3928
VecTDot            12000 1.0 6.5646e+00 2.2 6.48e+08 1.0 0.0e+00 0.0e+00 1.2e+04  2 14  0  0 60   2 14  0  0 60 21321
VecNorm             8000 1.0 9.7144e-01 1.2 4.32e+08 1.0 0.0e+00 0.0e+00 8.0e+03  0 10  0  0 40   0 10  0  0 40 96055
VecCopy             1000 1.0 7.9706e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet              6000 1.0 1.7941e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY            12000 1.0 7.5738e-01 1.2 6.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 14  0  0  0   0 14  0  0  0 184806
VecAYPX             6000 1.0 4.6802e-01 1.3 2.97e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0 137071
VecScatterBegin     7000 1.0 4.7924e-01 2.3 0.00e+00 0.0 7.6e+06 7.2e+03 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd       7000 1.0 7.9303e-01 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult             7000 1.0 6.0762e+00 1.1 2.46e+09 1.0 7.6e+06 7.2e+03 0.0e+00  2 55100100  0   2 55100100  0 86894
PCApply             6000 1.0 2.3429e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 92  0  0  0  0  95  0  0  0  0     0
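
For reference, the "Event Stage 2: Remaining Solves" split above comes from
PETSc's log-stage mechanism; a minimal sketch of the pattern (hypothetical
code, not the benchmark from this thread; a 1D Laplacian stands in for the
real 3D 7-point stencil operator) is:

/* Sketch: separate the first KSPSolve (which includes preconditioner
   setup) from the 999 remaining solves in -log_view output. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscLogStage  stageFirst, stageRest;
  PetscInt       i, n = 100, Istart, Iend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Stand-in operator: 1D Laplacian (the real code assembles 3D 7-point). */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)   { ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    if (i < n-1) { ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* picks up -ksp_type cg -pc_type hypre ... */

  ierr = PetscLogStageRegister("First Solve", &stageFirst);CHKERRQ(ierr);
  ierr = PetscLogStageRegister("Remaining Solves", &stageRest);CHKERRQ(ierr);

  /* First solve timed on its own: it pays the PC setup cost. */
  ierr = PetscLogStagePush(stageFirst);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  /* The later solves reuse the setup and land in their own -log_view stage. */
  ierr = PetscLogStagePush(stageRest);CHKERRQ(ierr);
  for (i = 0; i < 999; i++) { ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr); }
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}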


--Junchao Zhang

On Thu, Jun 14, 2018 at 5:45 PM, Junchao Zhang <jczhang at mcs.anl.gov> wrote:

> I tested -pc_gamg_repartition with 216 processors again. First I tested
> with these options
>
> -log_view \
> -ksp_rtol 1E-6 \
> -ksp_type cg \
> -ksp_norm_type unpreconditioned \
> -mg_levels_ksp_type richardson \
> -mg_levels_ksp_norm_type none \
> -mg_levels_pc_type sor \
> -mg_levels_ksp_max_it 1 \
> -mg_levels_pc_sor_its 1 \
> -mg_levels_esteig_ksp_type cg \
> -mg_levels_esteig_ksp_max_it 10 \
> -pc_type gamg \
> -pc_gamg_type agg \
> -pc_gamg_threshold 0.05 \
> -pc_gamg_type classical \
> -gamg_est_ksp_type cg \
> -pc_gamg_square_graph 10 \
> -pc_gamg_threshold 0.0
>
>
> Then I tested with an extra -pc_gamg_repartition. With repartitioning,
> the time increased from 120 s to 140 s.  The code measures the first
> KSPSolve and the remaining solves in separate stages, so the repartitioning
> time is not counted in the stage of interest. In fact, -log_view reports
> the GAMG repartitioning time (in the first event stage) as about 1.5
> seconds, so repartitioning itself is not a big deal. I also tested
> -pc_gamg_square_graph 4; it did not change the time.
> I tested hypre with the options "-log_view -ksp_rtol 1E-6 -ksp_type cg
> -ksp_norm_type unpreconditioned -pc_type hypre" and nothing else. The job
> ran out of time. In earlier tests, a full job (1000 KSPSolves with 7 KSP
> iterations each) took 4 minutes; with hypre, a single KSPSolve with 6 KSP
> iterations takes 6 minutes.
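> As a concrete invocation, that hypre test amounts to something like the
> following (the launcher and executable name are placeholders; the mesh
> options are the ones used in the 216-process runs):
>
> mpiexec -n 216 ./solver -nodes_per_proc 30 -mesh_size 1E-4 \
>   -ksp_type cg -ksp_norm_type unpreconditioned -ksp_rtol 1E-6 \
>   -pc_type hypre -log_view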
> I will test and profile the code on a single node, and apply some
> VecScatter optimizations I recently implemented, to see what happens.
>
>
> --Junchao Zhang
>
> On Thu, Jun 14, 2018 at 11:03 AM, Mark Adams <mfadams at lbl.gov> wrote:
>
>> And with 7-point stencils and no large material discontinuities you
>> probably want -pc_gamg_square_graph 10 -pc_gamg_threshold 0.0, and you
>> could test the square-graph parameter (e.g., 1, 2, 3, 4).
>>
>> And I would definitely test hypre.
>>
>> On Thu, Jun 14, 2018 at 8:54 AM Mark Adams <mfadams at lbl.gov> wrote:
>>
>>>
>>>> Just -pc_type hypre instead of -pc_type gamg.
>>>>
>>>>
>>> And you need to have configured PETSc with hypre.
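>>> (Typically by configuring with --download-hypre, or with
>>> --with-hypre-dir=<path> for an existing installation.)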
>>>
>>>
>>
>