[petsc-dev] number of mallocs inside KSP during factorization

Alexander Grayver agrayver at gfz-potsdam.de
Mon Dec 12 09:40:24 CST 2011


Thanks,

I didn't realize this number concerns the system matrix, which is assembled outside PETSc in my own code.
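
For the archives: the fix suggested by that FAQ entry amounts to giving the AIJ matrix an adequate preallocation before the first MatSetValues call, so assembly never has to allocate additional memory. Below is a minimal, self-contained sketch (an assumed 1-D Laplacian with 3 nonzeros per row; error checking omitted for brevity) that assembles without mallocs and then queries the counter that -ksp_view prints:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  PetscInt n = 100, rstart, rend, i;
  MatInfo  info;

  PetscInitialize(&argc, &argv, NULL, NULL);

  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetType(A, MATAIJ);
  /* Preallocate BEFORE the first MatSetValues call; whichever of the
     two calls does not match the actual matrix type is a no-op. */
  MatMPIAIJSetPreallocation(A, 3, NULL, 3, NULL);
  MatSeqAIJSetPreallocation(A, 3, NULL);

  /* Insert a tridiagonal stencil; 3 entries per row fit the preallocation,
     so no mallocs occur during assembly. */
  MatGetOwnershipRange(A, &rstart, &rend);
  for (i = rstart; i < rend; i++) {
    if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
    MatSetValue(A, i, i, 2.0, INSERT_VALUES);
    if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  /* MatInfo.mallocs is the same counter -ksp_view reports as
     "total number of mallocs used during MatSetValues calls". */
  MatGetInfo(A, MAT_GLOBAL_SUM, &info);
  PetscPrintf(PETSC_COMM_WORLD, "mallocs during assembly: %g\n", (double)info.mallocs);

  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

With a correct preallocation that counter should read 0; the 1071 in the quoted -ksp_view output below is what shows up when the preallocation underestimates some rows.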

Regards,
Alexander

On 12.12.2011 15:09, Barry Smith wrote:
> http://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly
>
>
> On Dec 12, 2011, at 5:35 AM, Alexander Grayver wrote:
>
>> Hello,
>>
>> I use PETSc with MUMPS, and looking carefully at the -ksp_view -ksp_monitor output I see:
>>
>> KSP Object:(fwd_) 64 MPI processes
>>    type: preonly
>>    maximum iterations=10000, initial guess is zero
>>    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>    left preconditioning
>>    using NONE norm type for convergence test
>> PC Object:(fwd_) 64 MPI processes
>>    type: cholesky
>>      Cholesky: out-of-place factorization
>>      tolerance for zero pivot 2.22045e-14
>>      matrix ordering: natural
>>      factor fill ratio given 0, needed 0
>>        Factored matrix follows:
>>          Matrix Object:         64 MPI processes
>>            type: mpiaij
>>            rows=1048944, cols=1048944
>>            package used to perform factorization: mumps
>>            total: nonzeros=1266866685, allocated nonzeros=1266866685
>>            total number of mallocs used during MatSetValues calls =0
>>              MUMPS run parameters:
>>                SYM (matrix type):                   1
>>                PAR (host participation):            1
>>                ICNTL(1) (output for error):         6
>>                ICNTL(2) (output of diagnostic msg): 0
>>                ICNTL(3) (output for global info):   0
>>                ICNTL(4) (level of printing):        0
>>                ICNTL(5) (input mat struct):         0
>>                ICNTL(6) (matrix prescaling):        0
>>                ICNTL(7) (sequential matrix ordering):5
>>                ICNTL(8) (scaling strategy):         77
>>                ICNTL(10) (max num of refinements):  0
>>                ICNTL(11) (error analysis):          0
>>                ICNTL(12) (efficiency control):                         1
>>                ICNTL(13) (efficiency control):                         0
>>                ICNTL(14) (percentage of estimated workspace increase): 30
>>                ICNTL(18) (input mat struct):                           3
>>                ICNTL(19) (Schur complement info):                      0
>>                ICNTL(20) (rhs sparse pattern):                         0
>>                ICNTL(21) (solution struct):                            1
>>                ICNTL(22) (in-core/out-of-core facility):               0
>>                ICNTL(23) (max size of memory can be allocated locally):0
>>                ICNTL(24) (detection of null pivot rows):               0
>>                ICNTL(25) (computation of a null space basis):          0
>>                ICNTL(26) (Schur options for rhs or solution):          0
>>                ICNTL(27) (experimental parameter):                     -8
>>                ICNTL(28) (use parallel or sequential ordering):        2
>>                ICNTL(29) (parallel ordering):                          0
>>                ICNTL(30) (user-specified set of entries in inv(A)):    0
>>                ICNTL(31) (factors is discarded in the solve phase):    0
>>                ICNTL(33) (compute determinant):                        0
>>                ...
>>    linear system matrix = precond matrix:
>>    Matrix Object:   64 MPI processes
>>      type: mpiaij
>>      rows=1048944, cols=1048944
>>      total: nonzeros=7251312, allocated nonzeros=11554449
>>      total number of mallocs used during MatSetValues calls =1071
>>        not using I-node (on process 0) routines
>>
>> The particularly interesting part is the last three lines.
>> Where do these mallocs come from? Is it possible to reduce this number?
>>
>> Regards,
>> Alexander
>>



