[petsc-users] Strange GAMG performance for mixed FE formulation

Mark Adams mfadams at lbl.gov
Thu Mar 3 14:48:53 CST 2016


You have a very sparse 3D problem, with 9 non-zeros per row.  It is
coarsening very slowly and creating huge coarse grids, which are expensive
to construct.  The superlinear speedup is most likely from cache effects.
First try with:

-pc_gamg_square_graph 10
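
If GAMG is sitting under your fieldsplit, the option probably needs the
solver prefix from your -ksp_view output (a guess at the spelling, based
on the KSP names quoted below):

-solver_fieldsplit_1_pc_gamg_square_graph 10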

ML must have some AI in there to do this automatically, because ML and
GAMG are pretty similar algorithmically.  There is a threshold parameter
that is important (-pc_gamg_threshold <0.0>) and I think ML has the same
default.  ML is doing OK, but I would guess that if you use something like
0.02 for ML's threshold you would see some improvement.
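
For example (the ML option name is from PETSc's ML interface; check
-help to confirm the exact spelling):

-pc_gamg_threshold 0.02
-pc_ml_Threshold 0.02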

Hypre is doing pretty badly also.  I suspect that it is getting confused
as well.  I know less about how to tune hypre.
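
If you do want to poke at hypre, the analogous BoomerAMG knob is the
strong threshold; 0.5 is the value usually suggested for 3D problems
(the default, 0.25, is tuned for 2D):

-pc_hypre_boomeramg_strong_threshold 0.5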

If you use -info and grep on GAMG you will see about 20 lines that tell
you the number of equations on each level and the average number of
non-zeros per row.  In 3D the reduction per level should be -- very
approximately -- 30x, and the number of non-zeros per row should not
explode, but getting up to several hundred is OK.
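
That is, something like this (your_app standing in for your executable
and options):

./your_app <your options> -info 2>&1 | grep GAMG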

If you care to test this, we should be able to get ML and GAMG to agree
pretty well.  ML is a nice solver, but our core numerics should be about
the same.  I tested this on a 3D elasticity problem a few years ago.  That
said, I think your ML solve is pretty good.

Mark




On Thu, Mar 3, 2016 at 4:36 AM, Lawrence Mitchell <
lawrence.mitchell at imperial.ac.uk> wrote:

> On 02/03/16 22:28, Justin Chang wrote:
> ...
>
>
> >         Down solver (pre-smoother) on level 3
> >           KSP Object:          (solver_fieldsplit_1_mg_levels_3_)
> >             linear system matrix = precond matrix:
> ...
> >             Mat Object:             1 MPI processes
> >               type: seqaij
> >               rows=52147, cols=52147
> >               total: nonzeros=38604909, allocated nonzeros=38604909
> >               total number of mallocs used during MatSetValues calls =2
> >                 not using I-node routines
> >         Down solver (pre-smoother) on level 4
> >           KSP Object:          (solver_fieldsplit_1_mg_levels_4_)
> >             linear system matrix followed by preconditioner matrix:
> >             Mat Object:            (solver_fieldsplit_1_)
> ...
> >             Mat Object:             1 MPI processes
> >               type: seqaij
> >               rows=384000, cols=384000
> >               total: nonzeros=3416452, allocated nonzeros=3416452
>
>
> This looks pretty suspicious to me.  The original matrix on the finest
> level has 3.8e5 rows and ~3.4e6 nonzeros (about 9 per row).  One level
> up, the coarsening produces 5.2e4 rows but 3.9e7 nonzeros, i.e. around
> 740 nonzeros per row.
>
> FWIW, although Justin's PETSc is from Oct 2015, I get the same
> behaviour with ad5697c (master as of 1st March).
>
> If I compare with the coarse operators that ML produces on the same
> problem:
>
> The original matrix has, again:
>
>         Mat Object:         1 MPI processes
>           type: seqaij
>           rows=384000, cols=384000
>           total: nonzeros=3416452, allocated nonzeros=3416452
>           total number of mallocs used during MatSetValues calls=0
>             not using I-node routines
>
> While the next (coarser) level has:
>
>             Mat Object:             1 MPI processes
>               type: seqaij
>               rows=65258, cols=65258
>               total: nonzeros=1318400, allocated nonzeros=1318400
>               total number of mallocs used during MatSetValues calls=0
>                 not using I-node routines
>
> So we have 6.5e4 rows and 1.3e6 nonzeros (about 20 per row), which
> seems more plausible.
>
> Cheers,
>
> Lawrence
>
>