[petsc-dev] number of mallocs inside KSP during factorization

Barry Smith bsmith at mcs.anl.gov
Mon Dec 12 08:09:07 CST 2011


http://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly
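
A minimal sketch (not from this thread) of the preallocation that FAQ entry recommends; the local size and per-row estimates below are illustrative placeholders, and exact per-row counts can be passed in the array arguments instead:

  #include <petscmat.h>

  int main(int argc,char **argv)
  {
    Mat            A;
    PetscErrorCode ierr;
    PetscInt       nlocal = 1000;        /* local rows; illustrative */
    PetscInt       d_nz = 7, o_nz = 2;   /* per-row nonzero estimates for the
                                            diagonal and off-diagonal blocks */

    ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
    /* Preallocate before any MatSetValues(): underestimating forces the
       mallocs reported by -ksp_view, overestimating only wastes memory. */
    ierr = MatMPIAIJSetPreallocation(A,d_nz,PETSC_NULL,o_nz,PETSC_NULL);CHKERRQ(ierr);
    /* ... MatSetValues() loop, then MatAssemblyBegin()/MatAssemblyEnd() ... */
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }

With preallocation in place, the "total number of mallocs used during MatSetValues calls" line in -ksp_view should read 0 for the system matrix as well.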


On Dec 12, 2011, at 5:35 AM, Alexander Grayver wrote:

> Hello,
> 
> I use PETSc with MUMPS, and looking carefully at the -ksp_view -ksp_monitor output I see:
> 
> KSP Object:(fwd_) 64 MPI processes
>   type: preonly
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object:(fwd_) 64 MPI processes
>   type: cholesky
>     Cholesky: out-of-place factorization
>     tolerance for zero pivot 2.22045e-14
>     matrix ordering: natural
>     factor fill ratio given 0, needed 0
>       Factored matrix follows:
>         Matrix Object:         64 MPI processes
>           type: mpiaij
>           rows=1048944, cols=1048944
>           package used to perform factorization: mumps
>           total: nonzeros=1266866685, allocated nonzeros=1266866685
>           total number of mallocs used during MatSetValues calls =0
>             MUMPS run parameters:
>               SYM (matrix type):                   1 
>               PAR (host participation):            1 
>               ICNTL(1) (output for error):         6 
>               ICNTL(2) (output of diagnostic msg): 0 
>               ICNTL(3) (output for global info):   0 
>               ICNTL(4) (level of printing):        0 
>               ICNTL(5) (input mat struct):         0 
>               ICNTL(6) (matrix prescaling):        0 
>               ICNTL(7) (sequential matrix ordering):5 
>               ICNTL(8) (scaling strategy):         77 
>               ICNTL(10) (max num of refinements):  0 
>               ICNTL(11) (error analysis):          0 
>               ICNTL(12) (efficiency control):                         1 
>               ICNTL(13) (efficiency control):                         0 
>               ICNTL(14) (percentage of estimated workspace increase): 30 
>               ICNTL(18) (input mat struct):                           3 
>               ICNTL(19) (Schur complement info):                      0 
>               ICNTL(20) (rhs sparse pattern):                         0 
>               ICNTL(21) (solution struct):                            1 
>               ICNTL(22) (in-core/out-of-core facility):               0 
>               ICNTL(23) (max size of memory that can be allocated locally): 0 
>               ICNTL(24) (detection of null pivot rows):               0 
>               ICNTL(25) (computation of a null space basis):          0 
>               ICNTL(26) (Schur options for rhs or solution):          0 
>               ICNTL(27) (experimental parameter):                     -8 
>               ICNTL(28) (use parallel or sequential ordering):        2 
>               ICNTL(29) (parallel ordering):                          0 
>               ICNTL(30) (user-specified set of entries in inv(A)):    0 
>               ICNTL(31) (factors are discarded in the solve phase):   0 
>               ICNTL(33) (compute determinant):                        0 
>               ...
>   linear system matrix = precond matrix:
>   Matrix Object:   64 MPI processes
>     type: mpiaij
>     rows=1048944, cols=1048944
>     total: nonzeros=7251312, allocated nonzeros=11554449
>     total number of mallocs used during MatSetValues calls =1071
>       not using I-node (on process 0) routines
> 
> The particularly interesting part is the last three lines.
> Where do these mallocs come from? Is it possible to reduce this number?
> 
> Regards,
> Alexander
> 



