<div dir="ltr">Thank you for all the help. I think I'm mostly clear now. <div><br></div><div>I think my only remaining question relates to point 1:</div><div>So let's call the global matrix across all processors A and let's say it's a square matrix of size n. Then each processor has a subset of the rows of A so that there's in principle a contiguous rectangular chunk of A on each proc. Now we want to use this preallocator matrix to determine the non-zero structure of each processors chunk of A. Let's call this matrix P. You say that P should also be square. But I would think that, at least locally, P should be rectangular on each proc and basically have the same size as the local chunks of A. Do you mean that P is square when considered globally across all processors? I guess my question is a bit subtle, but basically, is P essentially a parallel Mat type which exists nominally across all procs, or is it a serial type object which is local to each proc? I think I was viewing it in the second way since it doesn't obviously require communication but I guess if you consider it in the first way it makes sense that it would be nxn.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 1, 2022 at 12:00 PM Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Apr 1, 2022 at 12:57 PM Samuel Estes <<a href="mailto:samuelestes91@gmail.com" target="_blank">samuelestes91@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">Ah, so let me see if I understand this now. So basically you preallocate for this "dummy matrix" of type MatPreallocator. But rather than simply calling MatXXXAIJSetPreallocation() to allocate space, you actually tell this "dummy matrix" exactly where the non-zeros will be. You then call this MatPreallocatorPreallocate() routine to pass the non-zero structure of the "dummy matrix" to the actual matrix that you're using for the linear system. You then call MatSetValues() (or a variant thereof) to set the values of this preallocated matrix.<div><br></div><div>A couple more clarifying questions:</div><div>1. So this MatPreallocator matrix should have a rectangular structure of (rEnd-rStart) rows and n columns (where n is the total size of the system? I'm assuming here that the global matrix across all processors is square.</div></div></blockquote><div><br></div><div>This "dummy matrix" will look exactly like your system matrix, meaning it is square, and you tell it the nonzeros by just calling MatSetValues(), exactly as you would for your system matrix. That is why you can reuse exactly the same code.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>2. You preallocate for the dummy matrix by calling matsetvalues rather than the usual way of preallocating?</div></div></blockquote><div><br></div><div>Yes.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>3. That's what you mean by looping twice? You loop over once using matsetvalues to preallocate, then you have to loop using matsetvalues again a second time to actually set the values of the parallel Mat you actually use to solve the system?</div></div></blockquote><div><br></div><div>Yes.</div><div><br></div><div> Thanks,</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 1, 2022 at 11:50 AM Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Apr 1, 2022 at 12:45 PM Samuel Estes <<a href="mailto:samuelestes91@gmail.com" target="_blank">samuelestes91@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thanks! This seems like it might be what I need. I'm still a little unclear on how it works though? My problem is basically that for any given row, I know the total non-zeros but not how many occur in the diagonal vs off-diagonal. Without knowledge of the underlying grid, I'm not sure how there could be a black box utility to figure this out? Am I misunderstanding how this is used? </div></blockquote><div><br></div><div>So each process gets a stack of rows, [rStart, rEnd). A nonzero (r, c) is in the diagonal block if rStart <= c < rEnd. So if you know (r, c) for each nonzero, you know whether it is in the diagonal block.</div><div><br></div><div> Thanks,</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 1, 2022 at 11:34 AM Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Apr 1, 2022 at 12:27 PM Samuel Estes <<a href="mailto:samuelestes91@gmail.com" target="_blank">samuelestes91@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>I have a problem in which I know (roughly) the number of non-zero entries for each row of a matrix but I don't have a convenient way of determining whether they belong to the diagonal or off-diagonal part of the parallel matrix. Is there some way I can just allocate the total number of non-zeros in a row regardless of which part they belong to? I'm assuming that this is not possible but I just wanted to check. It seems like it should be possible in principle since the matrix is only split by the rows but the columns of a row are all on the same processor (at least as I understand it). Thanks!</div></div></blockquote><div><br></div><div>In serial, the matrix is stored by rows. In parallel, it is split into a diagonal and off-diagonal block, so that we can overlap communication and computation in the matvec.</div><div><br></div><div>However, we have a convenience structure for figuring this out, called MatPreallocator, <a href="https://petsc.org/main/docs/manualpages/Mat/MATPREALLOCATOR.html" target="_blank">https://petsc.org/main/docs/manualpages/Mat/MATPREALLOCATOR.html</a></div><div>In my code, I wrote a loop around the code that filled up my matrix, which executed twice. On the first pass, I fed in the MatPreallocator matrix. When this finished</div><div>you can call <a href="https://petsc.org/main/docs/manualpages/Mat/MatPreallocatorPreallocate.html#MatPreallocatorPreallocate" target="_blank">https://petsc.org/main/docs/manualpages/Mat/MatPreallocatorPreallocate.html#MatPreallocatorPreallocate</a> on your system amtrix, then on the second</div><div>pass use the system matrix. This was only a few extra lines of code for me. If you want to optimize further, you can have a flag that only computes the values on the</div><div>second pass.</div><div><br></div><div> Thanks,</div><div><br></div><div> Matt</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Sam</div></div></blockquote></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>
</blockquote></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>
</blockquote></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>
</blockquote></div>