<div dir="ltr"><div dir="ltr">On Tue, Jun 20, 2023 at 2:02 PM Diego Magela Lemos <<a href="mailto:diegomagela@usp.br">diegomagela@usp.br</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">So... what do I need to do, please?<br>Why am I getting wrong results when solving the linear system if the matrix is filled in with <font face="monospace">MatSetPreallocationCOO</font> and <font face="monospace">MatSetValuesCOO</font>?</div></blockquote><div><br></div><div>It appears that you have _all_ processes submit _all_ triples (i, j, v). Values submitted for the same (i, j) are summed across processes, so each triple should be submitted by exactly one process. You can fix this in many ways. For example, an easy but suboptimal way is to have process 0 submit all of them and every other process submit nothing.</div><div><br></div><div> Thanks,</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 20, 2023 at 14:56, Jed Brown <<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> writes:<br>
<br>
>> The matrix entries are multiplied by 2, that is, the number of processes<br>
>> used to execute the code.<br>
>><br>
><br>
> No. This was mostly intended for GPUs, where there is 1 process. If you<br>
> want to use multiple MPI processes, then each process can only introduce<br>
> some disjoint subset of the values. This is also how MatSetValues() works,<br>
> but it might not be as obvious.<br>
<br>
They need not be disjoint; the contributions just need to sum to the expected values. This interface is very convenient for FE and FV methods. MatSetValues with ADD_VALUES has similar semantics without the intermediate storage, but it forces you to submit one element matrix at a time. This is the classic tradeoff between parallelism granularity and memory use: MatSetValuesCOO is a clear win on GPUs, while the picture is more nuanced on CPUs.<br>
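<br>
A minimal sketch of the summing semantics described above, written in plain Python rather than PETSc so it runs without MPI. The helper assemble_coo, the 2x2 values, and the rank count are hypothetical and only model the behavior: every triple from every rank is added into the entry it names. It is not how PETSc implements MatSetValuesCOO.<br>

```python
from collections import defaultdict


def assemble_coo(per_rank_triples):
    """Sum every (i, j, v) triple submitted by every rank into one sparse map."""
    matrix = defaultdict(float)
    for triples in per_rank_triples:
        for i, j, v in triples:
            matrix[(i, j)] += v  # repeated (i, j) entries are summed
    return dict(matrix)


# Hypothetical 2x2 matrix, used only for illustration.
triples = [(0, 0, 4.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 3.0)]
nranks = 2

# Every rank submits every triple: each entry comes out nranks times too large.
duplicated = assemble_coo([triples for _ in range(nranks)])

# Rank 0 submits everything, the other ranks submit nothing.
rank0_only = assemble_coo([triples] + [[] for _ in range(nranks - 1)])

# Each rank submits a disjoint slice of the triples.
partitioned = assemble_coo([triples[r::nranks] for r in range(nranks)])

# Overlapping contributions are fine as long as they sum to the intended
# value, as with element matrices in FE assembly.
overlapping = assemble_coo([[(0, 0, 1.5)], [(0, 0, 2.5)]])
```

With two ranks, the duplicated case reproduces the reported symptom (every entry multiplied by 2), while the rank-0-only and partitioned cases both give the intended matrix.<br>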
</blockquote></div>
</blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>