[petsc-users] MatMultTranspose memory usage

Karl Lin karl.linkui at gmail.com
Tue Jul 30 20:29:21 CDT 2019


I checked the resident set size via /proc/self/stat
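
A minimal sketch of that check, assuming a Linux system; it reads
/proc/self/statm rather than /proc/self/stat only because its second field is
already the resident page count, which avoids parsing the comm field:

    #include <stdio.h>
    #include <unistd.h>

    /* returns the resident set size in bytes, or -1 on failure */
    static long ResidentSetBytes(void)
    {
      long size = 0, resident = 0;
      FILE *f = fopen("/proc/self/statm", "r");
      if (!f) return -1;
      if (fscanf(f, "%ld %ld", &size, &resident) != 2) resident = -1;
      fclose(f);
      return (resident < 0) ? -1 : resident * sysconf(_SC_PAGESIZE); /* pages -> bytes */
    }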

On Tue, Jul 30, 2019 at 8:13 PM Mills, Richard Tran via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> Hi Karl,
>
> I'll let one of my colleagues who has a better understanding of exactly
> what happens with memory during PETSc matrix assembly chime in, but let me
> ask how you know that the memory footprint is actually larger than you
> think it should be. Are you looking at the resident set size reported by a
> tool like 'top'? Keep in mind that even if extra buffers were allocated and
> then free()d, the resident set size of your process may stay the same, and
> only decrease when the OS's memory manager decides it really needs those
> pages for something else.
>
> --Richard
>
> On 7/30/19 5:35 PM, Karl Lin via petsc-users wrote:
>
> Thanks for the feedback, very helpful.
>
> I have another question: when I run 4 processes, even though the matrix is
> only 49.7GB, I found the memory footprint of the matrix is about 52.8GB.
> Where does this extra memory come from? Does MatCreateAIJ still reserve
> some extra memory? I thought all unused space would be released after
> MatAssembly, but in at least one of the processes the memory footprint of
> the local matrix actually increased by a couple of GBs after MatAssembly. I
> would greatly appreciate any info.
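>
> A minimal sketch of one way I could check this with MatGetInfo(), assuming
> A is the assembled local matrix in my code and the usual ierr error
> checking:
>
>     MatInfo info;
>     ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
>     /* nz fields report the preallocated vs. actually used nonzeros */
>     ierr = PetscPrintf(PETSC_COMM_SELF,
>                        "nz allocated %g, nz used %g, nz unneeded %g, mallocs %g\n",
>                        info.nz_allocated, info.nz_used, info.nz_unneeded,
>                        info.mallocs);CHKERRQ(ierr);
>
> If nz_unneeded is small after assembly but the resident set is still
> bigger, the difference would presumably be retained-but-freed pages of the
> kind described above.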
>
>
> On Tue, Jul 30, 2019 at 6:34 PM Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
>
>>
>>    Thanks, this is enough information to diagnose the problem.
>>
>>    The problem is that 32-bit integers are not large enough to contain
>> the "counts", in this case the number of nonzeros in the matrix. A signed
>> 32-bit integer can only be as large as PETSC_MAX_INT = 2,147,483,647.
>>
>>     You need to configure PETSc with the additional option
>> --with-64-bit-indices; PETSc will then use 64-bit integers for PetscInt, so
>> you don't run out of room in an int for such large counts.
>>
>>
>>     We don't do a perfect job of detecting when an int overflows, which is
>> why you ended up with a crazy allocation request like
>> 18446744058325389312.
>>
>>     I will add some more error checking to provide more useful error
>> messages in this case.
>>
>>     Barry
>>
>>    The reason this worked for 4 processes is that the largest count in
>> that case was roughly 6,653,750,976/4, which does fit into an int. PETSc
>> only needs to know the number of nonzeros on each process; it doesn't need
>> to know the total across all the processes. In other words, you may want to
>> use a different PETSC_ARCH (different configuration) for small numbers of
>> processes and for large numbers, depending on how large your problem is. Or
>> you can always use 64-bit integers, at a small performance and memory cost.
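>>
>>    A minimal sketch of a runtime check for which index width a given
>> PETSC_ARCH was built with (PETSC_USE_64BIT_INDICES is the macro defined by
>> the --with-64-bit-indices configure option; ierr error checking as usual):
>>
>>     ierr = PetscPrintf(PETSC_COMM_WORLD, "sizeof(PetscInt) = %d\n",
>>                        (int)sizeof(PetscInt));CHKERRQ(ierr);
>>     #if defined(PETSC_USE_64BIT_INDICES)
>>     /* PetscInt is 64 bits: per-process nonzero counts above 2,147,483,647 are fine */
>>     #else
>>     /* PetscInt is 32 bits: per-process counts must stay below PETSC_MAX_INT */
>>     #endif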
>>
>>
>>
>>
>> > On Jul 30, 2019, at 5:27 PM, Karl Lin via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>> >
>> > The number of rows is 26,326,575, the maximum column index is
>> > 36,416,250, and the number of nonzero coefficients is 6,653,750,976,
>> > which amounts to 49.7GB for the coefficients in PetscScalar and the
>> > column indices in PetscInt. I can run the program with 4 processes on
>> > this input but not with a single process. Here are snapshots of the
>> > error:
>> >
>> > [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > [0]PETSC ERROR: Out of memory. This could be due to allocating
>> > [0]PETSC ERROR: too large an object or bleeding by not properly
>> > [0]PETSC ERROR: destroying unneeded objects.
>> > [0]PETSC ERROR: Memory allocated 0 Memory used by process 727035904
>> > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
>> > [0]PETSC ERROR: Memory requested 18446744058325389312
>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> > [0]PETSC ERROR: Petsc Release Version 3.10.4, Feb, 26, 2019
>> >
>> > [0]PETSC ERROR: #1 MatSeqAIJSetPreallocation_SeqAIJ() line 3711 in
>> /petsc-3.10.4/src/mat/impls/aij/seq/aij.c
>> > [0]PETSC ERROR: #2 PetscMallocA() line 390 in
>> /petsc-3.10.4/src/sys/memory/mal.c
>> > [0]PETSC ERROR: #3 MatSeqAIJSetPreallocation_SeqAIJ() line 3711 in
>> /petsc-3.10.4/src/mat/impls/aij/seq/aij.c
>> > [0]PETSC ERROR: #4 MatSeqAIJSetPreallocation() line 3649 in
>> /petsc-3.10.4/src/mat/impls/aij/seq/aij.c
>> > [0]PETSC ERROR: #5 MatCreateAIJ() line 4413 in
>> /petsc-3.10.4/src/mat/impls/aij/mpi/mpiaij.c
>> > [0]PETSC ERROR: #6 *** (my code)
>> > [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>> > [0]PETSC ERROR: Try option -start_in_debugger or
>> -on_error_attach_debugger
>> > [0]PETSC ERROR: or see
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
>> OS X to find memory corruption errors
>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
>> and run
>> > [0]PETSC ERROR: to get more information on the crash.
>> > [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > [0]PETSC ERROR: Signal received
>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> > [0]PETSC ERROR: Petsc Release Version 3.10.4, Feb, 26, 2019
>> >
>> >
>> >
>> > On Tue, Jul 30, 2019 at 10:34 AM Matthew Knepley <knepley at gmail.com>
>> wrote:
>> > On Wed, Jul 31, 2019 at 3:25 AM Karl Lin via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>> > Hi, Richard,
>> >
>> > We have a new question. Is there a limit for MatCreateMPIAIJ and
>> > MatSetValues? What I mean is that we tried to create a sparse matrix and
>> > populate it with 50GB of data in one process, and I got a crash with an
>> > error saying the object is too big. Thank you for any insight.
>> >
>> > 1) Always send the complete error.
>> >
>> > 2) It sounds like you got an out of memory error for that process.
>> >
>> >    Matt
>> >
>> > Regards,
>> >
>> > Karl
>> >
>> > On Thu, Jul 18, 2019 at 2:36 PM Mills, Richard Tran via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>> > Hi Kun and Karl,
>> >
>> > If you are using the AIJMKL matrix types and have a recent version of
>> MKL, the AIJMKL code uses MKL's inspector-executor sparse BLAS routines,
>> which are described at
>> >
>> >
>> https://software.intel.com/en-us/mkl-developer-reference-c-inspector-executor-sparse-blas-routines
>> >
>> > The inspector-executor analysis routines take the AIJ (compressed
>> > sparse row) format data from PETSc and then create a copy in an
>> > optimized, internal layout used by MKL. We have to keep PETSc's own AIJ
>> > representation around, as it is needed for several operations that MKL
>> > does not provide. This does, unfortunately, mean that roughly double (or
>> > more, depending on what MKL decides to do) the amount of memory is
>> > required. The reason you see the memory usage increase right when a
>> > MatMult() or MatMultTranspose() operation occurs is that we default to a
>> > "lazy" approach to calling the analysis routine (mkl_sparse_optimize())
>> > until an operation that uses an MKL-provided kernel is requested. (You
>> > can use an "eager" approach that calls mkl_sparse_optimize() during
>> > MatAssemblyEnd() by specifying "-mat_aijmkl_eager_inspection" in the
>> > PETSc options.)
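>> >
>> > A minimal sketch of the eager variant, assuming the option is placed in
>> > the options database before the matrix is assembled (this is equivalent
>> > to passing -mat_aijmkl_eager_inspection on the command line):
>> >
>> >     /* request eager mkl_sparse_optimize() during MatAssemblyEnd() */
>> >     ierr = PetscOptionsSetValue(NULL, "-mat_aijmkl_eager_inspection",
>> >                                 NULL);CHKERRQ(ierr);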
>> >
>> > If memory is at enough of a premium for you that you can't afford the
>> extra copy used by the MKL inspector-executor routines, then I suggest
>> using the usual PETSc AIJ format instead of AIJMKL. AIJ is fairly well
>> optimized for many cases (and even has some hand-optimized kernels using
>> Intel AVX/AVX2/AVX-512 intrinsics) and often outperforms AIJMKL. You should
>> try both AIJ and AIJMKL, anyway, to see which is faster for your
>> combination of problem and computing platform.
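>> >
>> > A sketch of one way to switch between the two at run time without
>> > recompiling, assuming the matrix is created generically and that m, n, M,
>> > N stand for the local and global sizes already used in the code (MATAIJMKL
>> > is the type name behind MatCreateMPIAIJMKL):
>> >
>> >     ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
>> >     ierr = MatSetSizes(A, m, n, M, N);CHKERRQ(ierr);
>> >     ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);   /* default; -mat_type aijmkl overrides */
>> >     ierr = MatSetFromOptions(A);CHKERRQ(ierr);
>> >     /* preallocation, MatSetValues(), and MatAssembly as in the existing code */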
>> >
>> > Best regards,
>> > Richard
>> >
>> > On 7/17/19 8:46 PM, Karl Lin via petsc-users wrote:
>> >> We also found that if we use MatCreateSeqAIJ, there is no further
>> >> memory increase with matrix-vector multiplication. With
>> >> MatCreateMPIAIJMKL, however, the memory increase occurs consistently.
>> >>
>> >> On Wed, Jul 17, 2019 at 5:26 PM Karl Lin <karl.linkui at gmail.com>
>> wrote:
>> >> MatCreateMPIAIJMKL
>> >>
>> >> Parallel and sequential exhibit the same behavior. In fact, we found
>> >> that doing MatMult increases the memory by the size of the matrix as
>> >> well.
>> >>
>> >> On Wed, Jul 17, 2019 at 4:55 PM Zhang, Hong <hzhang at mcs.anl.gov>
>> wrote:
>> >> Karl:
>> >> What matrix format do you use? Run it in parallel or sequential?
>> >> Hong
>> >>
>> >> We used /proc/self/stat to track the resident set size during the
>> >> program run, and we saw the resident set size jump by the size of the
>> >> matrix right after we did MatMultTranspose.
>> >>
>> >> On Wed, Jul 17, 2019 at 12:04 PM hong--- via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>> >> Kun:
>> >> How do you know that 'MatMultTranspose creates an extra memory copy of
>> >> the matrix'?
>> >> Hong
>> >>
>> >> Hi,
>> >>
>> >>
>> >> I was using MatMultTranspose and MatMult to solve a linear system.
>> >>
>> >>
>> >> However, we found out that MatMultTranspose creates an extra memory
>> >> copy of the matrix for its operation. This extra memory copy is not
>> >> mentioned anywhere in the PETSc manual.
>> >>
>> >>
>> >> This basically doubles my memory requirement for solving my system.
>> >>
>> >>
>> >> I remember MKL's routines can do an in-place matrix-transpose vector
>> >> product without transposing the matrix itself.
>> >>
>> >>
>> >> Is this always the case? Or is there a way to make PETSc do an
>> >> in-place matrix-transpose vector product?
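>> >>
>> >> For reference, a minimal sketch of the call pattern in question,
>> >> assuming A is the assembled matrix and the usual ierr error checking
>> >> (with the plain AIJ format MatMultTranspose() works on the stored data
>> >> directly; the extra copy described earlier in this thread comes from the
>> >> MKL inspector-executor layout):
>> >>
>> >>     Vec x, y;
>> >>     ierr = MatCreateVecs(A, &y, &x);CHKERRQ(ierr);  /* y ~ columns of A, x ~ rows of A */
>> >>     ierr = VecSet(x, 1.0);CHKERRQ(ierr);
>> >>     ierr = MatMultTranspose(A, x, y);CHKERRQ(ierr); /* y = A^T x */
>> >>     ierr = VecDestroy(&x);CHKERRQ(ierr);
>> >>     ierr = VecDestroy(&y);CHKERRQ(ierr);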
>> >>
>> >>
>> >> Any help is greatly appreciated.
>> >>
>> >>
>> >> Regards,
>> >>
>> >> Kun
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> > -- Norbert Wiener
>> >
>> > https://www.cse.buffalo.edu/~knepley/
>>
>>
>

