[petsc-users] How to speed up geometric multigrid

Wed Oct 2 20:22:19 CDT 2013

   You will need to fix DMCreateInterpolation_DA_3D_Q0() to support periodic boundary conditions. It might be really easy, I'm not sure why it has that condition currently.

   Barry

On Oct 2, 2013, at 6:33 PM, Michele Rosso <mrosso at uci.edu> wrote:

> Barry,
> 
> thank you, I tried but it seems the periodic boundaries are problematic:
> 
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Cannot handle periodic grid in x!
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.4.2, Jul, 02, 2013
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: ./hit on a interlagos-64idx-pgi-opt named nid25146 by Unknown Wed Oct  2 18:22:54 2013
> [0]PETSC ERROR: Libraries linked from /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib
> [0]PETSC ERROR: Configure run at Wed Aug 28 23:25:43 2013
> [0]PETSC ERROR: Configure options --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib  -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: DMCreateInterpolation_DA_3D_Q0() line 597 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/dm/impls/da/dainterp.c
> [0]PETSC ERROR: DMCreateInterpolation_DA() line 1015 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/dm/impls/da/dainterp.c
> [0]PETSC ERROR: DMCreateInterpolation() line 801 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/dm/interface/dm.c
> [0]PETSC ERROR: PCSetUp_MG() line 602 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/ksp/pc/impls/mg/mg.c
> [0]PETSC ERROR: PCSetUp() line 890 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/ksp/pc/interface/precon.c
> [0]PETSC ERROR: KSPSetUp() line 278 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: KSPSolve() line 399 in /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/src/ksp/ksp/interface/itfunc.c
> 
> On a side note I took a look at  ex34.c  and I noticed that the nullspace is removed not only from the matrix  but from the rhs as well.
> Why is this done? Should I do the same?
> Thank you.
> 
> Michele
> 
> 
> On 10/02/2013 03:57 PM, Barry Smith wrote:
>>    Ok, so piecewise constant pressures. The default DMDA multigrid is using piecewise linear interpolation. Isn't going to work well at all. You need to first add   ierr = DMDASetInterpolationType(da, DMDA_Q0);CHKERRQ(ierr); after you create the DM.  You should be seeing much better convergence. see ksp/ksp/examples/tutorials/ex34.c
>> 
>> 
>>    Barry
>> 
>> 
>> 
>> On Oct 2, 2013, at 5:25 PM, Matthew Knepley <knepley at gmail.com> wrote:
>> 
>>> On Wed, Oct 2, 2013 at 5:12 PM, Michele Rosso <mrosso at uci.edu> wrote:
>>> Barry,
>>> 
>>> sorry for not replying to your other e-mail earlier.
>>> The equation I am solving is:
>>> 
>>> <tblatex-5.png>
>>> 
>>> where <tblatex-4.png> is the pressure field, <tblatex-3.png> the density field and <tblatex-2.png>  the velocity field.
>>> Since I am using finite difference on a staggered grid, the pressure is defined on "cell" centers, while the velocity components on "cell" faces, even if
>>> no cell is actually defined.
>>> 
>>> If you just prescribe a constant rho, do you get 3-5 iterates? What about a simple step through the domain?
>>> Starting on the hardest problem does not sound like that way to understand what is going on.
>>> 
>>>   Matt
>>>  I am simulating a bi-phase flow, thus both density and pressure are discontinuos, but not the velocity (no mass trasfer is included at the moment).
>>> Therefore the right-hand-side (rhs) of the above equation does not have jumps, while $p$ and $rho$ do.
>>> In order to deal with such jumps, I am using a Ghost Fluid approach. Therefore the resulting linear system is slighly different from the typical Poisson system, though
>>> simmetry and diagonal dominance of the matrix are mantained.
>>> The boundary conditions are periodic in all the three space directions, therefore the system is singular. Thus I removed the nullspace of the matrix by using:
>>> 
>>>         call MatNullSpaceCreate( PETSC_COMM_WORLD,PETSC_TRUE,PETSC_NULL_INTEGER,&
>>>                                & PETSC_NULL_INTEGER,nullspace,ierr)
>>>         call KSPSetNullspace(ksp,nullspace,ierr)
>>>         call MatNullSpaceDestroy(nullspace,ierr)
>>> 
>>> Hope this helps. Please let me know if you need any other info.
>>> Also, I use the pressure at the previous time step as starting point for the solve. Could this be a reason why the convergence is so slow?
>>> Thanks a lot,
>>> 
>>> Michele
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 10/02/2013 11:39 AM, Barry Smith wrote:
>>>>   Something is wrong, you should be getting better convergence. Please answer my other email.
>>>> 
>>>> 
>>>> On Oct 2, 2013, at 1:10 PM, Michele Rosso
>>>> <mrosso at uci.edu>
>>>>  wrote:
>>>> 
>>>> 
>>>>> Thank you all for your contribution.
>>>>> So far the fastest solution is still the initial one proposed by Jed in an earlier round:
>>>>> 
>>>>> -ksp_atol 1e-9  -ksp_monitor_true_residual  -ksp_view  -log_summary -mg_coarse_pc_factor_mat_solver_package superlu_dist
>>>>> -mg_coarse_pc_type lu    -mg_levels_ksp_max_it 3 -mg_levels_ksp_type richardson  -options_left -pc_mg_galerkin
>>>>> -pc_mg_levels 5  -pc_mg_log  -pc_type mg
>>>>> 
>>>>> where I used  -mg_levels_ksp_max_it 3  as Barry suggested instead of  -mg_levels_ksp_max_it 1.
>>>>> I attached the diagnostics for this case. Any further idea?
>>>>> Thank you,
>>>>> 
>>>>> Michele
>>>>> 
>>>>> 
>>>>> On 10/01/2013 11:44 PM, Barry Smith wrote:
>>>>> 
>>>>>> On Oct 2, 2013, at 12:28 AM, Jed Brown <jedbrown at mcs.anl.gov>
>>>>>>  wrote:
>>>>>> 
>>>>>> 
>>>>>>> "Mark F. Adams" <mfadams at lbl.gov>
>>>>>>>  writes:
>>>>>>> 
>>>>>>>> run3.txt uses:
>>>>>>>> 
>>>>>>>> -ksp_type richardson
>>>>>>>> 
>>>>>>>> This is bad and I doubt anyone recommended it intentionally.
>>>>>>>> 
>>>>>>    Hell this is normal multigrid without a Krylov accelerator. Under normal circumstances with geometric multigrid this should be fine, often the best choice.
>>>>>> 
>>>>>> 
>>>>>>> I would have expected FGMRES, but Barry likes Krylov smoothers and
>>>>>>> Richardson is one of a few methods that can tolerate nonlinear
>>>>>>> preconditioners.
>>>>>>> 
>>>>>>> 
>>>>>>>> You also have, in this file,
>>>>>>>> 
>>>>>>>> -mg_levels_ksp_type gmres
>>>>>>>> 
>>>>>>>> did you or the recommenders mean
>>>>>>>> 
>>>>>>>> -mg_levels_ksp_type richardson  ???
>>>>>>>> 
>>>>>>>> you are using gmres here, which forces you to use fgmres in the outer solver.  This is a safe thing to use you if you apply your BCa symmetrically with a low order discretization then
>>>>>>>> 
>>>>>>>> -ksp_type cg
>>>>>>>> -mg_levels_ksp_type richardson
>>>>>>>> -mg_levels_pc_type sor
>>>>>>>> 
>>>>>>>> is what I'd recommend.
>>>>>>>> 
>>>>>>> I thought that was tried in an earlier round.
>>>>>>> 
>>>>>>> I don't understand why SOR preconditioning in the Krylov smoother is so
>>>>>>> drastically more expensive than BJacobi/ILU and why SOR is called so
>>>>>>> many more times even though the number of outer iterations
>>>>>>> 
>>>>>>> bjacobi: PCApply              322 1.0 4.1021e+01 1.0 6.44e+09 1.0 3.0e+07 1.6e+03 4.5e+04 74 86 98 88 92 28160064317351226 20106
>>>>>>> bjacobi: KSPSolve              46 1.0 4.6268e+01 1.0 7.52e+09 1.0 3.0e+07 1.8e+03 4.8e+04 83100100 99 99 31670065158291309 20800
>>>>>>> 
>>>>>>> sor:     PCApply             1132 1.0 1.5532e+02 1.0 2.30e+10 1.0 1.0e+08 1.6e+03 1.6e+05 69 88 99 88 93 21871774317301274 18987
>>>>>>> sor:     KSPSolve             201 1.0 1.7101e+02 1.0 2.63e+10 1.0 1.1e+08 1.8e+03 1.7e+05 75100100 99 98 24081775248221352 19652
>>>>>>> 
>>>>> <best.txt>
>>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>> -- Norbert Wiener
>> 
>