[petsc-users] How to efficiently fill in, in parallel, a PETSc matrix from a COO sparse matrix?

Diego Magela Lemos diegomagela at usp.br
Wed Jun 21 11:57:25 CDT 2023


Unfortunately, I cannot modify ex2 to perform the matrix fill-in using only
one rank, although I have understood how it works.
Is it so hard to do that?

Thank you.


On Wed, Jun 21, 2023 at 12:20, Mark Adams <mfadams at lbl.gov> wrote:

> ex2 looks the same as the code at the beginning of the thread, which looks
> fine to me, yet fails.
> (the only thing I can think of is that &v.at(i) is not doing what one
> wants)
>
> Diego: I would start with this ex2.c, add your view statement, verify;
> incrementally change ex2 to your syntax and see where it breaks.
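>
> (For the "view" step, something like the line below after MatAssemblyEnd
> is usually enough; the choice of viewer here is only a suggestion.)
>
>   /* print the assembled matrix from all ranks so wrong entries are visible */
>   PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));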
>
> Mark
>
> On Wed, Jun 21, 2023 at 9:50 AM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Wed, Jun 21, 2023 at 9:22 AM Diego Magela Lemos <diegomagela at usp.br>
>> wrote:
>>
>>> Please, could you provide a minimal working example (or link) of how to
>>> do this?
>>>
>>
>> You can see here
>>
>>   https://petsc.org/main/src/ksp/ksp/tutorials/ex2.c.html
>>
>> that each process only sets values for the rows it owns.
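>>
>> A minimal sketch of that pattern (variable names are illustrative, not
>> taken verbatim from ex2.c):
>>
>>   PetscInt    rstart, rend, i;
>>   PetscScalar one = 1.0;
>>
>>   /* query the contiguous block of rows this rank owns */
>>   PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
>>   for (i = rstart; i < rend; i++) {
>>     /* each rank inserts entries only for its own rows (diagonal here) */
>>     PetscCall(MatSetValues(A, 1, &i, 1, &i, &one, INSERT_VALUES));
>>   }
>>   PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
>>   PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));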
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>> Thank you.
>>>
>>> On Tue, Jun 20, 2023 at 15:08, Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>>
>>>> On Tue, Jun 20, 2023 at 2:02 PM Diego Magela Lemos <diegomagela at usp.br>
>>>> wrote:
>>>>
>>>>> So... what do I need to do, please?
>>>>> Why am I getting wrong results when solving the linear system if the
>>>>> matrix is filled in with MatSetPreallocationCOO and MatSetValuesCOO?
>>>>>
>>>>
>>>> It appears that you have _all_ processes submit _all_ triples (i, j,
>>>> v). Each triple can only be submitted by a single process. You can fix this
>>>> in many ways. For example, an easy but suboptimal way is just to have
>>>> process 0 submit them all, and all other processes submit nothing.
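>>>>
>>>> Roughly like this (untested, and assuming the triples already live in
>>>> arrays coo_i, coo_j, coo_v of length n on every rank; those names are
>>>> just for illustration):
>>>>
>>>>   PetscMPIInt rank;
>>>>   PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
>>>>   /* only rank 0 hands in triples; the others pass an empty list and
>>>>      PETSc migrates the values to the owning processes internally */
>>>>   PetscCount nloc = (rank == 0) ? (PetscCount)n : 0;
>>>>   PetscCall(MatSetPreallocationCOO(A, nloc, rank ? NULL : coo_i, rank ? NULL : coo_j));
>>>>   PetscCall(MatSetValuesCOO(A, rank ? NULL : coo_v, INSERT_VALUES));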
>>>>
>>>>   Thanks,
>>>>
>>>>   Matt
>>>>
>>>>
>>>>> On Tue, Jun 20, 2023 at 14:56, Jed Brown <jed at jedbrown.org>
>>>>> wrote:
>>>>>
>>>>>> Matthew Knepley <knepley at gmail.com> writes:
>>>>>>
>>>>>> >> The matrix entries are multiplied by 2, that is, the number of
>>>>>> >> processes used to execute the code.
>>>>>> >>
>>>>>> >
>>>>>> > No. This was mostly intended for GPUs, where there is 1 process. If
>>>>>> > you want to use multiple MPI processes, then each process can only
>>>>>> > introduce some disjoint subset of the values. This is also how
>>>>>> > MatSetValues() works, but it might not be as obvious.
>>>>>>
>>>>>> They need not be disjoint, just sum to the expected values. This
>>>>>> interface is very convenient for FE and FV methods. MatSetValues with
>>>>>> ADD_VALUES has similar semantics without the intermediate storage, but it
>>>>>> forces you to submit one element matrix at a time. It is the classic
>>>>>> parallelism-granularity versus memory-use tradeoff, with MatSetValuesCOO
>>>>>> being a clear win on GPUs and more nuanced for CPUs.
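>>>>>>
>>>>>> A tiny illustration of the summing semantics (the numbers are made up):
>>>>>> repeated (i, j) pairs in the COO lists are accumulated, which is what
>>>>>> makes the interface convenient for element-by-element assembly.
>>>>>>
>>>>>>   /* two contributions to the same entry; A(0,0) ends up as 0.5 + 0.5 */
>>>>>>   PetscInt    coo_i[] = {0, 0};
>>>>>>   PetscInt    coo_j[] = {0, 0};
>>>>>>   PetscScalar coo_v[] = {0.5, 0.5};
>>>>>>   PetscCall(MatSetPreallocationCOO(A, 2, coo_i, coo_j));
>>>>>>   PetscCall(MatSetValuesCOO(A, coo_v, ADD_VALUES));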
>>>>>>
>>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>