[petsc-users] Parallel efficiency of the gmres solver with ASM

Lei Shi stoneszone at gmail.com
Thu Jun 25 05:51:32 CDT 2015


Hello,

I'm trying to improve the parallel efficiency of the GMRES solve in my CFD
solver. In my solver, PETSc's GMRES is used to solve the linear system
generated by Newton's method. To test its efficiency, I started with a very
simple inviscid subsonic 3D flow as the first test case. The parallel
efficiency of the GMRES solve with ASM as the preconditioner is very bad.
The results below are from our latest cluster. Right now, I'm only looking
at the wall-clock time of KSPSolve.

   1. First I tested ASM with GMRES and ILU(0) as the subdomain solver; the
   wall-clock time on 2 cores is almost the same as for the serial run. Here
   are the options for this case:

-ksp_type gmres  -ksp_max_it 100 -ksp_rtol 1e-5 -ksp_atol 1e-50
-ksp_gmres_restart 30 -ksp_pc_side right
-pc_type asm -sub_ksp_type gmres -sub_ksp_rtol 0.001 -sub_ksp_atol 1e-30
-sub_ksp_max_it 1000 -sub_pc_type ilu -sub_pc_factor_levels 0
-sub_pc_factor_fill 1.9
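For reference, here is a minimal sketch of how I believe these options map
onto the PETSc API (the function name and variables are just for
illustration; in my solver the setup goes through the options database and
KSPSetFromOptions):

#include <petscksp.h>

/* Sketch only: solve A x = b with right-preconditioned GMRES(30) and ASM.
 * The -sub_* options (inner GMRES + ILU(0)) are picked up from the options
 * database when the subdomain KSPs are created during setup. */
PetscErrorCode SolveWithGMRESASM(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGMRESSetRestart(ksp, 30);CHKERRQ(ierr);
  ierr = KSPSetPCSide(ksp, PC_RIGHT);CHKERRQ(ierr);
  /* rtol, atol, dtol, maxits  ->  -ksp_rtol, -ksp_atol, default, -ksp_max_it */
  ierr = KSPSetTolerances(ksp, 1e-5, 1e-50, PETSC_DEFAULT, 100);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCASM);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* picks up the -sub_* options */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}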

The iteration count increases a lot for the parallel runs:

cores | iterations | err      | PETSc solve wall-clock time | speedup | efficiency
------+------------+----------+-----------------------------+---------+-----------
  1   |     2      | 1.15E-04 |          11.95              |   1     |   1
  2   |     5      | 2.05E-02 |          10.5               |   1.01  |   0.50
  4   |     6      | 2.19E-02 |           7.64              |   1.39  |   0.34

      2. Then I tested ASM with ILU(0) as the subdomain preconditioner only
(the sub-KSP is left at its default, so there is no inner subdomain GMRES
solve); the wall-clock time on 2 cores is better than in the first test, but
the speedup is still very bad. Here are the options I'm using:

-ksp_type gmres  -ksp_max_it 100 -ksp_rtol 1e-5 -ksp_atol 1e-50
-ksp_gmres_restart 30 -ksp_pc_side right
-pc_type asm -sub_pc_type ilu -sub_pc_factor_levels 0  -sub_pc_factor_fill
1.9
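To double-check what the subdomain solves are actually doing in the two
setups, I look at them with something like the sketch below (PCASMGetSubKSP
is only valid after setup; the function name is just for illustration).
Running with -ksp_view gives essentially the same information.

#include <petscksp.h>

/* Sketch: print each local subdomain solver after setup, to confirm whether
 * it does an inner GMRES solve (test 1) or a single ILU(0) application via
 * the default preonly sub-KSP (test 2). */
PetscErrorCode ViewASMSubSolvers(KSP ksp)
{
  PC             pc;
  KSP           *subksp;
  PetscInt       nlocal, first, i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);  /* sub-KSPs exist only after setup */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCASMGetSubKSP(pc, &nlocal, &first, &subksp);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) {
    ierr = KSPView(subksp[i], PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}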

cores | iterations | err      | PETSc solve CPU time | speedup | efficiency
------+------------+----------+----------------------+---------+-----------
  1   |    10      | 4.54E-04 |        10.68         |   1     |   1
  2   |    11      | 9.55E-04 |         8.2          |   1.30  |   0.65
  4   |    12      | 3.59E-04 |         5.26         |   2.03  |   0.50

   These results are from a third-order DG scheme on a very coarse 3D mesh
(480 elements). I believe I should get some speedup for this test even on
such a coarse mesh.

  My question is: why does ASM with a local subdomain solve (test 1) take
much longer than ASM used as a preconditioner only (test 2)? The accuracy is
also much worse. I have tried increasing the ASM overlap to 2, but that makes
things even worse (the sketch below shows how the overlap is set).
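For reference, the overlap change was just -pc_asm_overlap 2; in code it
would look roughly like this (sketch only; it must be called before the
preconditioner is set up, and the function name is just for illustration):

#include <petscksp.h>

/* Sketch: increase the ASM overlap (equivalent to -pc_asm_overlap <overlap>). */
PetscErrorCode SetASMOverlap(KSP ksp, PetscInt overlap)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCASMSetOverlap(pc, overlap);CHKERRQ(ierr);  /* call before KSPSetUp/PCSetUp */
  PetscFunctionReturn(0);
}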

  If I use a larger mesh (~4000 elements), the second case with ASM as the
preconditioner gives me a better speedup, but it is still not very good:


cores | iterations | err      | PETSc solve CPU time | speedup | efficiency
------+------------+----------+----------------------+---------+-----------
  1   |     7      | 1.91E-02 |        97.32         |   1     |   1
  2   |     7      | 2.07E-02 |        64.94         |   1.5   |   0.74
  4   |     7      | 2.61E-02 |        36.97         |   2.6   |   0.65

Attached are the -log_summary outputs dumped from PETSc; any suggestions are
welcome. I really appreciate it.


Sincerely Yours,

Lei Shi
---------
-------------- attachments --------------
proc2_asm_sub_ksp.dat: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150625/f01f9b00/attachment-0004.obj>
proc1_asm_sub_ksp.dat: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150625/f01f9b00/attachment-0005.obj>
proc2_asm_pconly.dat: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150625/f01f9b00/attachment-0006.obj>
proc1_asm_pconly.dat: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150625/f01f9b00/attachment-0007.obj>

