<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Feb 17, 2022 at 11:46 AM Bojan Niceno <<a href="mailto:bojan.niceno.scientist@gmail.com">bojan.niceno.scientist@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Dear all,<br><br><br>I am experiencing difficulties when using PETSc in parallel in an unstructured CFD code. It uses the CRS format to store its matrices. I use the following sequence of PETSc calls in the hope of getting PETSc to solve my linear systems in parallel. Before I continue, I would just like to say that the code has been MPI parallel for a long time: it performs its own domain decomposition through METIS and works out the communication patterns used by its home-grown (non-PETSc) linear solvers. Anyhow, I issue the following calls:<br><br><font face="monospace" color="#0000ff">err = PetscInitialize(0, NULL, (char*)0, help);<br><br>err = MatCreate(MPI_COMM_WORLD, A);<br></font>In the above, I use MPI_COMM_WORLD instead of PETSC_COMM_SELF because the call to MPI_Init is invoked outside of PETSc, from the main program.<br><br><font face="monospace" color="#0000ff">err = MatSetSizes(A, m, m, M, M);<br></font>Since my matrices are exclusively square, M is set to the total number of computational cells, while m is equal to the number of computational cells within each subdomain/processor. (Not all processors necessarily have the same m; it depends on the domain decomposition.) I do not distinguish between m (M) and n (N) since the matrices are all square. Am I wrong to assume that?<br><br><font color="#0000ff" face="monospace">err = MatSetType(A, MATAIJ);<br></font>I set the matrix type to MATAIJ to cover runs on one and on more processors. By the way, on one processor everything works fine.<br><br><font face="monospace" color="#0000ff">err = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);<br>err = MatSeqAIJSetPreallocation(A, 0, d_nnz);<br></font>The two lines above specify matrix preallocation. Since d_nz and o_nz vary from cell to cell (row to row), I set them to zero and instead provide arrays with the number of diagonal and off-diagonal non-zeros in each row. To my understanding, that is legitimate, since d_nz and o_nz are ignored when d_nnz and o_nnz are provided. Am I wrong?
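<br><br>Just to make the counting concrete, here is a minimal sketch of how I understand d_nnz and o_nnz are meant to be filled (illustrative code only, not a paste from the actual program; row_ptr and col_idx stand for the CRS arrays holding global column numbers, and row_start/row_end for this rank's contiguous range of global rows):<br><br><font face="monospace" color="#0000ff">/* PETSc assigns each rank a contiguous block of global rows, so an   */<br>/* inclusive prefix sum of the local sizes m gives this rank's range. */<br>PetscInt row_end = 0;<br>MPI_Scan(&m, &row_end, 1, MPIU_INT, MPI_SUM, MPI_COMM_WORLD);<br>PetscInt row_start = row_end - m;<br><br>for (PetscInt i = 0; i < m; i++) {       /* local row i is global row row_start + i */<br>  d_nnz[i] = 0;<br>  o_nnz[i] = 0;<br>  for (PetscInt k = row_ptr[i]; k < row_ptr[i + 1]; k++) {<br>    PetscInt col = col_idx[k];           /* global column number */<br>    if (col >= row_start && col < row_end)<br>      d_nnz[i]++;                        /* column owned by this rank: diagonal block */<br>    else<br>      o_nnz[i]++;                        /* column owned by another rank: off-diagonal block */<br>  }<br>}<br></font>In other words, a column is counted in d_nnz only if it falls inside this rank's own range of global numbers; everything else goes into o_nnz.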
Finally, inside a loop through rows and columns I call:<br><br><font face="monospace" color="#0000ff">err = MatSetValue(A, row, col, value, INSERT_VALUES);<br></font>Here I make sure that row and col refer to global cell (unknown) numbers.<br><br>Yet, when I run the code on more than one processor, I get the error:<br><br><font color="#ff0000">[3]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------<br>[3]PETSC ERROR: Argument out of range<br>[3]PETSC ERROR: New nonzero at (21,356) caused a malloc<br>Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check<br><br>[3]PETSC ERROR: #1 MatSetValues_MPIAIJ() at /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517<br>[3]PETSC ERROR: #2 MatSetValues() at /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398<br>[3]PETSC ERROR: #3 MatSetValues_MPIAIJ() at /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517<br>[3]PETSC ERROR: #4 MatSetValues() at /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398</font><br><br>and so forth, for roughly 10% of all matrix entries. I checked whether these errors occur only for the off-diagonal parts of the matrix, but that is not the case.<br><br>The error code is 63: PETSC_ERR_ARG_OUTOFRANGE.<br><br>Does anyone have an idea what I am doing wrong? Are any of my assumptions above wrong (like thinking n (N) is always equal to m (M) for square matrices, or that I can pass zero for d_nz and o_nz if I provide the arrays d_nnz[] and o_nnz[])? </div></blockquote><div><br></div><div>Those assumptions are correct.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Any idea how to debug it, where to look for an error?<div><br></div></div></blockquote><div><br></div><div>I would guess that you are counting your o_nnz incorrectly. It looks like a small number of equations per process, because the 4th process has row 21, apparently. Does that sound right?<br></div><div><br></div>And column 356 is going to be in the off-diagonal block (i.e., "o"). I would start with a serial matrix and run with -info. This will be noisy, but you will see things like "number of unneeded..." so that you can verify that you have set d_nnz perfectly (there should be 0 unneeded).</div><div class="gmail_quote">Then try two processors. If it fails, you could add a print statement every time that row (e.g., 21) is added to, and check what your code for computing o_nnz is doing. (A small programmatic version of the preallocation check is sketched at the end of this message.)</div><div class="gmail_quote"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>I am carefully checking all the data I send to the PETSc functions and it looks correct to me, but maybe I lack some fundamental understanding of what should be provided to PETSc, as I wrote above?<br></div></div></blockquote><div><br></div><div>It is a bit confusing at first. The man page gives a concrete example: <a href="https://petsc.org/main/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ">https://petsc.org/main/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ</a></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><br><br> Best regards,<br><br><br> Bojan Niceno<br><br></div></div>
</blockquote></div></div>
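<div dir="ltr"><br>For what it is worth, here is a small sketch of a programmatic version of that preallocation check (illustrative only; it reuses your err variable and assumes the matrix A has been assembled): after MatAssemblyBegin/MatAssemblyEnd you can query MatGetInfo and confirm that no extra storage and no mallocs were needed.<br><br><font face="monospace" color="#0000ff">/* Ask PETSc how well the preallocation matched what was actually inserted. */<br>MatInfo info;<br>err = MatGetInfo(A, MAT_LOCAL, &info);<br>PetscPrintf(PETSC_COMM_SELF,<br>            "nz_allocated = %g, nz_used = %g, nz_unneeded = %g, mallocs = %g\n",<br>            info.nz_allocated, info.nz_used, info.nz_unneeded, info.mallocs);<br></font>If d_nnz and o_nnz are exact, nz_unneeded and mallocs should both come out as zero on every rank, which is essentially what the -info output reports at assembly time.<br></div>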