[petsc-dev] number of mallocs inside KSP during factorization
Matthew Knepley
knepley at gmail.com
Mon Dec 12 07:00:36 CST 2011
On Mon, Dec 12, 2011 at 5:35 AM, Alexander Grayver
<agrayver at gfz-potsdam.de> wrote:
>
> Hello,
>
> I use PETSc with MUMPS and, looking carefully at the -ksp_view -ksp_monitor
> output, I see:
>
> KSP Object:(fwd_) 64 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object:(fwd_) 64 MPI processes
> type: cholesky
> Cholesky: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> matrix ordering: natural
> factor fill ratio given 0, needed 0
> Factored matrix follows:
> Matrix Object: 64 MPI processes
> type: mpiaij
> rows=1048944, cols=1048944
> package used to perform factorization: mumps
> total: nonzeros=1266866685, allocated nonzeros=1266866685
> total number of mallocs used during MatSetValues calls =0
> MUMPS run parameters:
> SYM (matrix type): 1
> PAR (host participation): 1
> ICNTL(1) (output for error): 6
> ICNTL(2) (output of diagnostic msg): 0
> ICNTL(3) (output for global info): 0
> ICNTL(4) (level of printing): 0
> ICNTL(5) (input mat struct): 0
> ICNTL(6) (matrix prescaling): 0
> ICNTL(7) (sequential matrix ordering): 5
> ICNTL(8) (scaling strategy): 77
> ICNTL(10) (max num of refinements): 0
> ICNTL(11) (error analysis): 0
> ICNTL(12) (efficiency control): 1
> ICNTL(13) (efficiency control): 0
> ICNTL(14) (percentage of estimated workspace increase): 30
> ICNTL(18) (input mat struct): 3
> ICNTL(19) (Schur complement info): 0
> ICNTL(20) (rhs sparse pattern): 0
> ICNTL(21) (solution struct): 1
> ICNTL(22) (in-core/out-of-core facility): 0
> ICNTL(23) (max size of memory that can be allocated locally): 0
> ICNTL(24) (detection of null pivot rows): 0
> ICNTL(25) (computation of a null space basis): 0
> ICNTL(26) (Schur options for rhs or solution): 0
> ICNTL(27) (experimental parameter): -8
> ICNTL(28) (use parallel or sequential ordering): 2
> ICNTL(29) (parallel ordering): 0
> ICNTL(30) (user-specified set of entries in inv(A)): 0
> ICNTL(31) (factors are discarded in the solve phase): 0
> ICNTL(33) (compute determinant): 0
> ...
> linear system matrix = precond matrix:
> Matrix Object: 64 MPI processes
> type: mpiaij
> rows=1048944, cols=1048944
> total: nonzeros=7251312, allocated nonzeros=11554449
> total number of mallocs used during MatSetValues calls =1071
> not using I-node (on process 0) routines
>
> The particularly interesting part is the last three lines.
> Where do these mallocs come from? Is it possible to reduce this number?
>
Yes, it looks like you are not preallocating correctly. Each of those mallocs
means a MatSetValues call hit an entry for which no space had been
preallocated, so the storage had to be reallocated on the fly; getting the
preallocation right makes them go away (and speeds up assembly considerably).
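For reference, the usual pattern looks roughly like this (a minimal sketch;
DIAG_NNZ/OFFDIAG_NNZ and the function name are placeholders you would replace
with values and code derived from your own discretization, not anything taken
from your run):

#include <petscmat.h>

/* Assumed upper bounds on nonzeros per row; exact per-row arrays
   (d_nnz/o_nnz) give even better results. */
#define DIAG_NNZ    7
#define OFFDIAG_NNZ 6

PetscErrorCode CreatePreallocatedMatrix(MPI_Comm comm,PetscInt mlocal,PetscInt Mglobal,Mat *A)
{
  PetscErrorCode ierr;

  ierr = MatCreate(comm,A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A,mlocal,mlocal,Mglobal,Mglobal);CHKERRQ(ierr);
  ierr = MatSetType(*A,MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
  /* Reserve space before any MatSetValues call; the diagonal and
     off-diagonal blocks of an MPIAIJ matrix are preallocated separately. */
  ierr = MatMPIAIJSetPreallocation(*A,DIAG_NNZ,PETSC_NULL,OFFDIAG_NNZ,PETSC_NULL);CHKERRQ(ierr);
  /* Harmless no-op for MPIAIJ, but covers the single-process case. */
  ierr = MatSeqAIJSetPreallocation(*A,DIAG_NNZ+OFFDIAG_NNZ,PETSC_NULL);CHKERRQ(ierr);
  /* Optional: turn any insertion outside the preallocated pattern into an
     error, so the offending MatSetValues call is easy to locate. */
  ierr = MatSetOption(*A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);
  return 0;
}

With the counts right, the "total number of mallocs used during MatSetValues
calls" line in -ksp_view should read 0 again.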
Matt
> Regards,
> Alexander
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener