[petsc-dev] number of mallocs inside KSP during factorization
Matthew Knepley
knepley at gmail.com
Mon Dec 12 07:00:36 CST 2011
On Mon, Dec 12, 2011 at 5:35 AM, Alexander Grayver
<agrayver at gfz-potsdam.de> wrote:
>
> Hello,
>
> I use PETSc with MUMPS and, looking carefully at the -ksp_view -ksp_monitor
> output, I see:
>
> KSP Object:(fwd_) 64 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object:(fwd_) 64 MPI processes
> type: cholesky
> Cholesky: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> matrix ordering: natural
> factor fill ratio given 0, needed 0
> Factored matrix follows:
> Matrix Object: 64 MPI processes
> type: mpiaij
> rows=1048944, cols=1048944
> package used to perform factorization: mumps
> total: nonzeros=1266866685, allocated nonzeros=1266866685
> total number of mallocs used during MatSetValues calls =0
> MUMPS run parameters:
> SYM (matrix type): 1
> PAR (host participation): 1
> ICNTL(1) (output for error): 6
> ICNTL(2) (output of diagnostic msg): 0
> ICNTL(3) (output for global info): 0
> ICNTL(4) (level of printing): 0
> ICNTL(5) (input mat struct): 0
> ICNTL(6) (matrix prescaling): 0
> ICNTL(7) (sequential matrix ordering): 5
> ICNTL(8) (scaling strategy): 77
> ICNTL(10) (max num of refinements): 0
> ICNTL(11) (error analysis): 0
> ICNTL(12) (efficiency control): 1
> ICNTL(13) (efficiency control): 0
> ICNTL(14) (percentage of estimated workspace increase): 30
> ICNTL(18) (input mat struct): 3
> ICNTL(19) (Schur complement info): 0
> ICNTL(20) (rhs sparse pattern): 0
> ICNTL(21) (solution struct): 1
> ICNTL(22) (in-core/out-of-core facility): 0
> ICNTL(23) (max size of memory that can be allocated locally): 0
> ICNTL(24) (detection of null pivot rows): 0
> ICNTL(25) (computation of a null space basis): 0
> ICNTL(26) (Schur options for rhs or solution): 0
> ICNTL(27) (experimental parameter): -8
> ICNTL(28) (use parallel or sequential ordering): 2
> ICNTL(29) (parallel ordering): 0
> ICNTL(30) (user-specified set of entries in inv(A)): 0
> ICNTL(31) (factors are discarded in the solve phase): 0
> ICNTL(33) (compute determinant): 0
> ...
> linear system matrix = precond matrix:
> Matrix Object: 64 MPI processes
> type: mpiaij
> rows=1048944, cols=1048944
> total: nonzeros=7251312, allocated nonzeros=11554449
> total number of mallocs used during MatSetValues calls =1071
> not using I-node (on process 0) routines
>
> The particularly interesting part is the last three lines.
> Where do these mallocs come from? Is it possible to reduce this number?
>
Yes, it looks like you are not preallocating correctly. Each of those mallocs
means a MatSetValues call hit an entry for which no space had been
preallocated, so the storage had to be reallocated on the fly; getting the
preallocation right makes them go away (and speeds up assembly considerably).
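For reference, the usual pattern looks roughly like this (a minimal sketch;
DIAG_NNZ/OFFDIAG_NNZ and the function name are placeholders you would replace
with values and code derived from your own discretization, not anything taken
from your run):

#include <petscmat.h>

/* Assumed upper bounds on nonzeros per row; exact per-row arrays
   (d_nnz/o_nnz) give even better results. */
#define DIAG_NNZ    7
#define OFFDIAG_NNZ 6

PetscErrorCode CreatePreallocatedMatrix(MPI_Comm comm,PetscInt mlocal,PetscInt Mglobal,Mat *A)
{
  PetscErrorCode ierr;

  ierr = MatCreate(comm,A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A,mlocal,mlocal,Mglobal,Mglobal);CHKERRQ(ierr);
  ierr = MatSetType(*A,MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
  /* Reserve space before any MatSetValues call; the diagonal and
     off-diagonal blocks of an MPIAIJ matrix are preallocated separately. */
  ierr = MatMPIAIJSetPreallocation(*A,DIAG_NNZ,PETSC_NULL,OFFDIAG_NNZ,PETSC_NULL);CHKERRQ(ierr);
  /* Harmless no-op for MPIAIJ, but covers the single-process case. */
  ierr = MatSeqAIJSetPreallocation(*A,DIAG_NNZ+OFFDIAG_NNZ,PETSC_NULL);CHKERRQ(ierr);
  /* Optional: turn any insertion outside the preallocated pattern into an
     error, so the offending MatSetValues call is easy to locate. */
  ierr = MatSetOption(*A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);
  return 0;
}

With the counts right, the "total number of mallocs used during MatSetValues
calls" line in -ksp_view should read 0 again.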
Matt
> Regards,
> Alexander
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener