[petsc-users] KSPSetUp does not scale

Jed Brown jedbrown at mcs.anl.gov
Mon Nov 19 06:33:27 CST 2012

Always, always, always send -log_summary when asking about performance.

On Mon, Nov 19, 2012 at 11:26 AM, Thomas Witkowski <thomas.witkowski at tu-dresden.de> wrote:

> I have a scaling problem in KSPSetUp; maybe some of you can help me
> fix it. It takes 4.5 seconds on 64 cores and 4.0 seconds on 128 cores. The
> matrix has around 11 million rows and is not perfectly balanced, but the
> maximum number of rows per core in the 128-core case is exactly half of the
> number in the 64-core case. Besides the scaling, why does the
> setup take so long? I thought that just some objects are created but no
> calculation is going on!
> The KSPView on the corresponding solver objects is as follows:
> KSP Object:(ns_) 64 MPI processes
>   type: fgmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=100, initial guess is zero
>   tolerances:  relative=1e-06, absolute=1e-08, divergence=10000
>   right preconditioning
>   has attached null space
>   using UNPRECONDITIONED norm type for convergence test
> PC Object:(ns_) 64 MPI processes
>   type: fieldsplit
>     FieldSplit with Schur preconditioner, factorization FULL
>     Preconditioner for the Schur complement formed from the block diagonal part of A11
>     Split info:
>     Split number 0 Defined by IS
>     Split number 1 Defined by IS
>     KSP solver for A00 block
>       KSP Object:      (ns_fieldsplit_velocity_)       64 MPI processes
>         type: preonly
>         maximum iterations=10000, initial guess is zero
>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>         left preconditioning
>         using DEFAULT norm type for convergence test
>       PC Object:      (ns_fieldsplit_velocity_)       64 MPI processes
>         type: none
>         linear system matrix = precond matrix:
>         Matrix Object:         64 MPI processes
>           type: mpiaij
>           rows=11068107, cols=11068107
>           total: nonzeros=315206535, allocated nonzeros=315206535
>           total number of mallocs used during MatSetValues calls =0
>             not using I-node (on process 0) routines
>     KSP solver for S = A11 - A10 inv(A00) A01
>       KSP Object:      (ns_fieldsplit_pressure_)       64 MPI processes
>         type: gmres
>           GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>           GMRES: happy breakdown tolerance 1e-30
>         maximum iterations=10000, initial guess is zero
>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>         left preconditioning
>         using DEFAULT norm type for convergence test
>       PC Object:      (ns_fieldsplit_pressure_)       64 MPI processes
>         type: none
>         linear system matrix followed by preconditioner matrix:
>         Matrix Object:         64 MPI processes
>           type: schurcomplement
>           rows=469678, cols=469678
>             Schur complement A11 - A10 inv(A00) A01
>             A11
>               Matrix Object:               64 MPI processes
>                 type: mpiaij
>                 rows=469678, cols=469678
>                 total: nonzeros=0, allocated nonzeros=0
>                 total number of mallocs used during MatSetValues calls =0
>                   using I-node (on process 0) routines: found 1304 nodes, limit used is 5
>             A10
>               Matrix Object:               64 MPI processes
>                 type: mpiaij
>                 rows=469678, cols=11068107
>                 total: nonzeros=89122957, allocated nonzeros=89122957
>                 total number of mallocs used during MatSetValues calls =0
>                   not using I-node (on process 0) routines
>             KSP of A00
>               KSP Object: (ns_fieldsplit_velocity_)               64 MPI processes
>                 type: preonly
>                 maximum iterations=10000, initial guess is zero
>                 tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>                 left preconditioning
>                 using DEFAULT norm type for convergence test
>               PC Object: (ns_fieldsplit_velocity_)               64 MPI processes
>                 type: none
>                 linear system matrix = precond matrix:
>                 Matrix Object:                 64 MPI processes
>                   type: mpiaij
>                   rows=11068107, cols=11068107
>                   total: nonzeros=315206535, allocated nonzeros=315206535
>                   total number of mallocs used during MatSetValues calls =0
>                     not using I-node (on process 0) routines
>             A01
>               Matrix Object:               64 MPI processes
>                 type: mpiaij
>                 rows=11068107, cols=469678
>                 total: nonzeros=88821041, allocated nonzeros=88821041
>                 total number of mallocs used during MatSetValues calls =0
>                   not using I-node (on process 0) routines
>         Matrix Object:         64 MPI processes
>           type: mpiaij
>           rows=469678, cols=469678
>           total: nonzeros=0, allocated nonzeros=0
>           total number of mallocs used during MatSetValues calls =0
>             using I-node (on process 0) routines: found 1304 nodes, limit used is 5
>   linear system matrix = precond matrix:
>   Matrix Object:   64 MPI processes
>     type: mpiaij
>     rows=11537785, cols=11537785
>     total: nonzeros=493150533, allocated nonzeros=510309207
>     total number of mallocs used during MatSetValues calls =0
>       not using I-node (on process 0) routines
> Thomas
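
To put a number on the scaling described in the quoted message, here is a minimal sketch that turns the two reported KSPSetUp timings into a strong-scaling speedup and parallel efficiency. It assumes the figures are 4.5 seconds on 64 cores and 4.0 seconds on 128 cores (reading the second figure as seconds), and the function name strong_scaling is only illustrative. It says nothing about where the time goes; that still requires the -log_summary output.

```python
def strong_scaling(t_base, p_base, t_new, p_new):
    """Return (speedup, efficiency) for a fixed-size problem.

    speedup    = t_base / t_new
    efficiency = speedup / (p_new / p_base), where 1.0 is ideal scaling
    """
    speedup = t_base / t_new
    ideal_speedup = p_new / p_base
    efficiency = speedup / ideal_speedup
    return speedup, efficiency


# Timings as reported in the quoted message (assumed to be in seconds).
speedup, efficiency = strong_scaling(t_base=4.5, p_base=64, t_new=4.0, p_new=128)

print(f"speedup going from 64 to 128 cores: {speedup:.3f} (ideal: 2.000)")
print(f"parallel efficiency: {efficiency:.2%}")
```

An efficiency well below 100% for a doubling of cores suggests that most of the setup time is not shrinking with the local problem size, which is the kind of question the per-event timings in a performance log can answer.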