[petsc-users] Allocating the diagonal for MatMPIAIJSetPreallocation
Samuel Estes
samuelestes91 at gmail.com
Fri Apr 1 12:46:14 CDT 2022
Ok thanks! I think I understand now.
On Fri, Apr 1, 2022 at 12:44 PM Matthew Knepley <knepley at gmail.com> wrote:
> On Fri, Apr 1, 2022 at 1:08 PM Samuel Estes <samuelestes91 at gmail.com>
> wrote:
>
>> Thank you for all the help. I think I'm mostly clear now.
>>
>> I think my only remaining question relates to point 1:
>> So let's call the global matrix across all processors A and let's say
>> it's a square matrix of size n. Then each processor has a subset of the
>> rows of A so that there's in principle a contiguous rectangular chunk of A
>> on each proc. Now we want to use this preallocator matrix to determine the
>> non-zero structure of each processor's chunk of A. Let's call this matrix P.
>> You say that P should also be square. But I would think that, at least
>> locally, P should be rectangular on each proc and basically have the same
>> size as the local chunks of A. Do you mean that P is square when considered
>> globally across all processors? I guess my question is a bit subtle, but
>> basically, is P essentially a parallel Mat type which exists nominally
>> across all procs, or is it a serial type object which is local to each
>> proc? I think I was viewing it in the second way since it doesn't
>> obviously require communication but I guess if you consider it in the first
>> way it makes sense that it would be nxn.
>>
>
> The idea here is that the internal structure of P does not matter. It has
> the same interface as the matrix A, so from your point of view they are
> identical.
>
> Thanks,
>
> Matt
>
>
>> On Fri, Apr 1, 2022 at 12:00 PM Matthew Knepley <knepley at gmail.com>
>> wrote:
>>
>>> On Fri, Apr 1, 2022 at 12:57 PM Samuel Estes <samuelestes91 at gmail.com>
>>> wrote:
>>>
>>>> Ah, so let me see if I understand this now. So basically you
>>>> preallocate for this "dummy matrix" of type MatPreallocator. But rather
>>>> than simply calling MatXXXAIJSetPreallocation() to allocate space, you
>>>> actually tell this "dummy matrix" exactly where the non-zeros will be. You
>>>> then call this MatPreallocatorPreallocate() routine to pass the non-zero
>>>> structure of the "dummy matrix" to the actual matrix that you're using for
>>>> the linear system. You then call MatSetValues() (or a variant thereof) to
>>>> set the values of this preallocated matrix.
>>>>
>>>> A couple more clarifying questions:
>>>> 1. So this MatPreallocator matrix should have a rectangular structure
>>>> of (rEnd-rStart) rows and n columns (where n is the total size of the
>>>> system)? I'm assuming here that the global matrix across all processors is
>>>> square.
>>>>
>>>
>>> This "dummy matrix" will look exactly like your system matrix, meaning
>>> it is square, and you tell it the nonzeros by just calling MatSetValues(),
>>> exactly as you would for your system matrix. That is why you can reuse
>>> exactly the same code.
>>>
>>>
>>>> 2. You preallocate for the dummy matrix by calling MatSetValues() rather
>>>> than the usual way of preallocating?
>>>>
>>>
>>> Yes.
>>>
>>>
>>>> 3. That's what you mean by looping twice? You loop once using
>>>> MatSetValues() to preallocate, then loop with MatSetValues() a second
>>>> time to actually set the values of the parallel Mat you use to solve
>>>> the system?
>>>>
>>>
>>> Yes.
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>
>>>> On Fri, Apr 1, 2022 at 11:50 AM Matthew Knepley <knepley at gmail.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 1, 2022 at 12:45 PM Samuel Estes <samuelestes91 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks! This seems like it might be what I need. I'm still a little
>>>>>> unclear on how it works, though. My problem is basically that, for any
>>>>>> given row, I know the total number of non-zeros but not how many occur in
>>>>>> the diagonal vs. off-diagonal block. Without knowledge of the underlying
>>>>>> grid, I'm not sure how there could be a black-box utility to figure this
>>>>>> out. Am I misunderstanding how this is used?
>>>>>>
>>>>>
>>>>> So each process owns a contiguous block of rows, [rStart, rEnd). A
>>>>> nonzero (r, c) is in the diagonal block if rStart <= c < rEnd. So if you
>>>>> know (r, c) for each nonzero, you know whether it is in the diagonal block.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>>> On Fri, Apr 1, 2022 at 11:34 AM Matthew Knepley <knepley at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Apr 1, 2022 at 12:27 PM Samuel Estes <
>>>>>>> samuelestes91 at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a problem in which I know (roughly) the number of non-zero
>>>>>>>> entries for each row of a matrix but I don't have a convenient way of
>>>>>>>> determining whether they belong to the diagonal or off-diagonal part of the
>>>>>>>> parallel matrix. Is there some way I can just allocate the total number of
>>>>>>>> non-zeros in a row regardless of which part they belong to? I'm assuming
>>>>>>>> that this is not possible, but I just wanted to check. It seems like it
>>>>>>>> should be possible in principle, since the matrix is only split by rows,
>>>>>>>> so all of the entries of a row live on the same processor (at least as I
>>>>>>>> understand it). Thanks!
>>>>>>>>
>>>>>>>
>>>>>>> In serial, the matrix is stored by rows. In parallel, it is split
>>>>>>> into a diagonal and off-diagonal block, so that we can overlap
>>>>>>> communication and computation in the matvec.
>>>>>>>
>>>>>>> However, we have a convenience structure for figuring this out,
>>>>>>> called MatPreallocator,
>>>>>>> https://petsc.org/main/docs/manualpages/Mat/MATPREALLOCATOR.html
>>>>>>> In my code, I wrote a loop around the code that fills the matrix,
>>>>>>> and executed it twice. On the first pass, I fed in the MatPreallocator
>>>>>>> matrix. When this finished, you can call
>>>>>>> https://petsc.org/main/docs/manualpages/Mat/MatPreallocatorPreallocate.html#MatPreallocatorPreallocate
>>>>>>> on your system matrix; then on the second pass, use the system matrix
>>>>>>> itself. This was only a few extra lines of code for me. If you want to
>>>>>>> optimize further, you can add a flag that only computes the values on the
>>>>>>> second pass.
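A minimal sketch of this two-pass pattern, assuming a square MPIAIJ system matrix A of global size n; FillMatrix is a hypothetical user routine that calls MatSetValues() on whatever Mat it is handed, which is exactly why the same code serves both passes.

```c
#include <petscmat.h>

/* Hypothetical user assembly routine: calls MatSetValues() on mat. */
extern PetscErrorCode FillMatrix(Mat mat);

PetscErrorCode AssembleWithPreallocator(Mat A, PetscInt n)
{
  Mat preallocator;

  PetscFunctionBeginUser;
  /* Pass 1: record the nonzero pattern in a dummy MATPREALLOCATOR
   * matrix with the same global shape as A. */
  PetscCall(MatCreate(PetscObjectComm((PetscObject)A), &preallocator));
  PetscCall(MatSetType(preallocator, MATPREALLOCATOR));
  PetscCall(MatSetSizes(preallocator, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetUp(preallocator));
  PetscCall(FillMatrix(preallocator));          /* same code as pass 2 */
  PetscCall(MatAssemblyBegin(preallocator, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(preallocator, MAT_FINAL_ASSEMBLY));

  /* Transfer the recorded pattern to A (PETSC_TRUE also fills A
   * with zeros in that pattern). */
  PetscCall(MatPreallocatorPreallocate(preallocator, PETSC_TRUE, A));
  PetscCall(MatDestroy(&preallocator));

  /* Pass 2: set the actual values in the now-preallocated A. */
  PetscCall(FillMatrix(A));
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```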
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>>> Sam
>>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted before they begin their
>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>> experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>