[petsc-users] [Ext] Re: matcreate and assembly issue

Matthew Knepley knepley at gmail.com
Fri Jul 3 12:17:40 CDT 2020


On Fri, Jul 3, 2020 at 12:52 PM Karl Lin <karl.linkui at gmail.com> wrote:

> Hi, Matthew
>
> Thanks for the reply. However, if the matrix is huge, like 13.5TB in our
> case, it will take a significant amount of time to loop over the insertion
> twice. Are there any other options that save time and resources? Thank you
> very much.
>

Do you think you could do it once and time it? I would be surprised if it
took even 1% of your total runtime, and I would also like to see the timing,
since we might be able to optimize something for you.
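
A rough sketch of timing only the counting pass (rstart, rend, and the
per-row counting are placeholders for your own loop, not code from this
thread):

  PetscLogDouble t0, t1;
  PetscErrorCode ierr;

  ierr = PetscTime(&t0);CHKERRQ(ierr);
  for (PetscInt i = rstart; i < rend; i++) {
    /* count the entries of row i, splitting them into the diagonal
       (d_nnz[i-rstart]) and off-diagonal (o_nnz[i-rstart]) blocks,
       without inserting anything */
  }
  ierr = PetscTime(&t1);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "counting pass: %g s\n",
                     (double)(t1 - t0));CHKERRQ(ierr);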

  Thanks,

     Matt


> Regards,
>
> Karl
>
> On Fri, Jul 3, 2020 at 10:57 AM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Fri, Jul 3, 2020 at 11:38 AM Karl Lin <karl.linkui at gmail.com> wrote:
>>
>>> Hi, Barry
>>>
>>> Thanks for the explanation. Following your tip, I have a guess. We use
>>> MatCreateAIJ to create the matrix; I believe this call also preallocates.
>>> Before this call we figure out the number of nonzeros per row for all
>>> rows and put those numbers in an array, say numNonZero. We pass
>>> numNonZero as both d_nnz and o_nnz to the MatCreateAIJ call, so
>>> essentially we preallocate twice as much as needed. For the process that
>>> doubled its memory footprint and crashed, there are a lot of values in
>>> both the diagonal and off-diagonal parts, so the temporary space gets
>>> filled up for both the diagonal and off-diagonal blocks of the matrix.
>>> There is also unused temporary space that is kept until MatAssembly, so
>>> we gradually fill up the preallocated space, which doubles the memory
>>> footprint. Once MatAssembly is done, the unused temporary space gets
>>> squeezed out and we return to the correct memory footprint of the
>>> matrix. But before MatAssembly, a large amount of unused temporary space
>>> needs to be kept because of the diagonal and off-diagonal pattern of the
>>> input. Would you say this is a plausible explanation? Thank you.
>>>
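
A rough sketch of the call being described, with separate per-row counts
for the diagonal and off-diagonal blocks instead of passing numNonZero for
both (dnnz, onnz, and the sizes are placeholder names, not from the
original code):

  Mat A;
  PetscErrorCode ierr;

  ierr = MatCreateAIJ(PETSC_COMM_WORLD, m, n, M, N,
                      0, dnnz,  /* nonzeros per local row in the diagonal block */
                      0, onnz,  /* nonzeros per local row in the off-diagonal block */
                      &A);CHKERRQ(ierr);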
>>
>> Yes. We find that it takes a very small amount of time to just loop over
>> the insertion twice, the first time counting the nonzeros. We built
>> something to do this for you:
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatPreallocatorPreallocate.html
>>
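
A minimal sketch of that workflow, assuming the sizes m, n, M, N and the
insertion loop come from your own code (illustrative, not a drop-in):

  Mat prealloc, A;
  PetscErrorCode ierr;

  /* run the insertion loop once against a MATPREALLOCATOR matrix,
     which only records the nonzero pattern */
  ierr = MatCreate(PETSC_COMM_WORLD, &prealloc);CHKERRQ(ierr);
  ierr = MatSetSizes(prealloc, m, n, M, N);CHKERRQ(ierr);
  ierr = MatSetType(prealloc, MATPREALLOCATOR);CHKERRQ(ierr);
  ierr = MatSetUp(prealloc);CHKERRQ(ierr);
  /* ... same MatSetValues() calls as the real insertion ... */
  ierr = MatAssemblyBegin(prealloc, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(prealloc, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* create the real matrix and let the preallocator set its preallocation */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, m, n, M, N);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(prealloc, PETSC_TRUE, A);CHKERRQ(ierr);
  ierr = MatDestroy(&prealloc);CHKERRQ(ierr);
  /* ... second pass: insert the actual values into A, then assemble ... */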
>>   Thanks,
>>
>>      Matt
>>
>>
>>> Regards,
>>>
>>> Karl
>>>
>>> On Fri, Jul 3, 2020 at 9:50 AM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>>
>>>>   Karl,
>>>>
>>>>     If a particular process receives values with MatSetValues() that
>>>> belong to a different process, it needs to allocate temporary space for
>>>> those values. If there are many values destined for a different process,
>>>> this space can be arbitrarily large. The values are not passed to the
>>>> final owning process until the MatAssemblyBegin/End calls.
>>>>
>>>>     If you have not preallocated enough room, the matrix actually makes
>>>> a complete copy of itself with extra space for additional values, copies
>>>> the values over, and then deletes the old matrix, so the memory use can
>>>> double when the preallocation is not correct.
>>>>
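
If preallocation is suspect, one option (not mentioned in this thread, but
standard PETSc usage) is to make insertions that exceed the preallocation
fail loudly instead of silently reallocating:

  ierr = MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);CHKERRQ(ierr);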
>>>>
>>>>    Barry
>>>>
>>>>
>>>> On Jul 3, 2020, at 9:44 AM, Karl Lin <karl.linkui at gmail.com> wrote:
>>>>
>>>> Yes, I did. The memory check for rss computes the memory footprint of
>>>> the column indices using the size of unsigned long long instead of int.
>>>>
>>>> For Junchao: I wonder if keeping track of which loaded columns are owned
>>>> by the current process and which are not also needs some memory storage.
>>>> Just a wild thought.
>>>>
>>>> On Thu, Jul 2, 2020 at 11:40 PM Ernesto Prudencio <EPrudencio at slb.com>
>>>> wrote:
>>>>
>>>>> Karl,
>>>>>
>>>>> Are you taking into account that every “integer” index might be 64
>>>>> bits instead of 32 bits, depending on the PETSc configuration /
>>>>> compilation choices for PetscInt?
>>>>>
>>>>> Ernesto.
>>>>>
>>>>> From: petsc-users [mailto:petsc-users-bounces at mcs.anl.gov]
>>>>> On Behalf Of Junchao Zhang
>>>>> Sent: Thursday, July 2, 2020 11:21 PM
>>>>> To: Karl Lin <karl.linkui at gmail.com>
>>>>> Cc: PETSc users list <petsc-users at mcs.anl.gov>
>>>>> Subject: [Ext] Re: [petsc-users] matcreate and assembly issue
>>>>>
>>>>> Is it because indices for the nonzeros also need memory?
>>>>> --Junchao Zhang
>>>>>
>>>>> On Thu, Jul 2, 2020 at 10:04 PM Karl Lin <karl.linkui at gmail.com> wrote:
>>>>>
>>>>> Hi, Matthew
>>>>>
>>>>> Thanks for the reply. However, I don't really get why an additional
>>>>> malloc would double the memory footprint. If I know there is only a 1GB
>>>>> matrix being loaded, there shouldn't be 2GB of memory occupied even if
>>>>> Petsc needs to allocate more space.
>>>>>
>>>>> regards,
>>>>> Karl
>>>>>
>>>>> On Thu, Jul 2, 2020 at 8:10 PM Matthew Knepley <knepley at gmail.com> wrote:
>>>>>
>>>>> On Thu, Jul 2, 2020 at 7:30 PM Karl Lin <karl.linkui at gmail.com> wrote:
>>>>>
>>>>> Hi, Matt
>>>>>
>>>>> Thanks for the tip last time. We just encountered another issue with
>>>>> large data sets. This time the behavior is the opposite of last time.
>>>>> The data is 13.5TB, and the total number of matrix columns is 2.4
>>>>> billion. Our program crashed during matrix loading due to memory
>>>>> overflow on one node. As said before, we have a little memory check
>>>>> during loading of the matrix to keep track of rss. The printout of rss
>>>>> in the log shows a normal increase on many nodes, i.e., if we load in a
>>>>> portion of the matrix that is 1GB, after MatSetValues for that portion
>>>>> rss will increase by roughly 1GB. On the node that had the memory
>>>>> overflow, rss increased by 2GB after only 1GB of the matrix was loaded
>>>>> through MatSetValues. We are very puzzled by this. What could make the
>>>>> memory footprint twice as much as needed? Thanks in advance for any
>>>>> insight.
>>>>>
>>>>> The only way I can imagine this happening is that you have not
>>>>> preallocated correctly, so that some values are causing additional
>>>>> mallocs.
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>      Matt
>>>>>
>>>>> Regards,
>>>>> Karl
>>>>>
>>>>> On Thu, Jun 11, 2020 at 12:00 PM Matthew Knepley <knepley at gmail.com> wrote:
>>>>>
>>>>> On Thu, Jun 11, 2020 at 12:52 PM Karl Lin <karl.linkui at gmail.com> wrote:
>>>>>
>>>>> Hi, Matthew
>>>>>
>>>>> Thanks for the suggestion, I just did another run and here are some
>>>>> detailed stack traces; maybe they will provide some more insight:
>>>>>
>>>>> *** Process received signal ***
>>>>> Signal: Aborted (6)
>>>>> Signal code: (-6)
>>>>> /lib64/libpthread.so.0(+0xf5f0)[0x2b56c46dc5f0]
>>>>> [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b56c5486337]
>>>>> [ 2] /lib64/libc.so.6(abort+0x148)[0x2b56c5487a28]
>>>>> [ 3] /libpetsc.so.3.10(PetscTraceBackErrorHandler+0xc4)[0x2b56c1e6a2d4]
>>>>> [ 4] /libpetsc.so.3.10(PetscError+0x1b5)[0x2b56c1e69f65]
>>>>> [ 5] /libpetsc.so.3.10(PetscCommBuildTwoSidedFReq+0x19f0)[0x2b56c1e03cf0]
>>>>> [ 6] /libpetsc.so.3.10(+0x77db17)[0x2b56c2425b17]
>>>>> [ 7] /libpetsc.so.3.10(+0x77a164)[0x2b56c2422164]
>>>>> [ 8] /libpetsc.so.3.10(MatAssemblyBegin_MPIAIJ+0x36)[0x2b56c23912b6]
>>>>> [ 9] /libpetsc.so.3.10(MatAssemblyBegin+0xca)[0x2b56c1feccda]
>>>>>
>>>>> By reconfiguring, you mean recompiling petsc with that option, correct?
>>>>>
>>>>> Reconfiguring.
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>      Matt
>>>>>
>>>>> Thank you.
>>>>> Karl
>>>>>
>>>>> On Thu, Jun 11, 2020 at 10:56 AM Matthew Knepley <knepley at gmail.com> wrote:
>>>>>
>>>>> On Thu, Jun 11, 2020 at 11:51 AM Karl Lin <karl.linkui at gmail.com> wrote:
>>>>>
>>>>> Hi, there
>>>>>
>>>>> We have written a program using Petsc to solve large sparse matrix
>>>>> systems. It has been working fine for a while. Recently we encountered
>>>>> a problem when the size of the sparse matrix is larger than 10TB. We
>>>>> used several hundred nodes and 2200 processes. The program always
>>>>> crashes during MatAssemblyBegin. Upon a closer look, there seems to be
>>>>> something unusual. We have a little memory check during loading of the
>>>>> matrix to keep track of rss. The printout of rss in the log shows a
>>>>> normal increase up to rank 2160, i.e., if we load in a portion of the
>>>>> matrix that is 1GB, after MatSetValues for that portion rss will
>>>>> increase by roughly that amount. From rank 2161 onwards, the rss in
>>>>> every rank doesn't increase after the matrix is loaded. Then comes
>>>>> MatAssemblyBegin, and the program crashed on rank 2160.
>>>>>
>>>>> Is there an upper limit on the number of processes Petsc can handle?
>>>>> Or is there an upper limit on the size of the matrix Petsc can handle?
>>>>> Thank you very much for any info.
>>>>>
>>>>> It sounds like you overflowed int somewhere. We try to check for this,
>>>>> but catching every place is hard. Try reconfiguring with
>>>>>
>>>>>     --with-64-bit-indices
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>      Matt
>>>>>
>>>>> Regards,
>>>>> Karl
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which
>>>>> their experiments lead.
>>>>> -- Norbert Wiener
>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>
>>>>
>>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/