<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div dir="ltr">
<div dir="ltr">Thibaut:<br>
</div>
<div>Check your code on MatSetValues(), likely you set a value "to a new nonzero at global row/column (200, 160) into matrix" L.</div>
<div>I tested petsc/src/mat/examples/tests/ex28.c by adding </div>
<div>@@ -95,6 +95,26 @@ int main(int argc,char **args)<br>
/* Compute numeric factors using same F, then solve */<br>
for (k = 0; k < num_numfac; k++) {<br>
/* Get numeric factor of A[k] */<br>
+ if (k==0) {<br>
+ ierr = MatZeroEntries(A[0]);CHKERRQ(ierr);<br>
+ for (i=rstart; i<rend; i++) {<br>
+ col[0] = i-1; col[1] = i; col[2] = i+1;<br>
+ if (i == 0) {<br>
+ ierr = MatSetValues(A[k],1,&i,2,col+1,value+1,INSERT_VALUES);CHKERRQ(ierr);<br>
+ } else if (i == N-1) {<br>
+ ierr = MatSetValues(A[k],1,&i,2,col,value,INSERT_VALUES);CHKERRQ(ierr);<br>
+ } else {<br>
+ ierr = MatSetValues(A[k],1,&i,3,col,value,INSERT_VALUES);CHKERRQ(ierr);<br>
+ }<br>
+ }<br>
+ if (!rank) {<br>
+ i = N - 1; col[0] = N - 1;<br>
+ ierr = MatSetValues(A[k],1,&i,1,col,value,INSERT_VALUES);CHKERRQ(ierr);<br>
+ }<br>
+ ierr = MatAssemblyBegin(A[k],MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);<br>
+ ierr = MatAssemblyEnd(A[k],MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);<br>
+ }<br>
+<br>
</div>
<div><br>
</div>
<div>It works in both sequential and parallel. If I set col[0] = 0, then I get the same error as yours.</div>
<div>Hong</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Dear PETSc developers,</p>
<p>I'm extending a validated matrix preallocation/assembly part of my code to solve multiple linear systems with MUMPS at each iteration of a main loop, following the example src/mat/examples/tests/ex28.c that Hong Zhang added a few weeks ago. The difference
is that I'm using just 1 matrix to solve different systems.<br>
</p>
<p>I'm trying to investigate a nasty bug arising when I try to assemble "for a second time" that MPIAIJ matrix. The issue arises only in parallel, serial works fine.</p>
<p>Creating 1 MPIAIJ matrix, preallocating it "perfectly" with the case where I have the fewest zero entries in the non-zero structure, before getting its symbolic factorization.</p>
<p>Further in the main loop, I'm solely changing its entries <b>retaining the non-zero structure</b>.<br>
</p>
<p>Here is the simplified Fortran code I'm using:<br>
</p>
<p><tt>! Fill (M,N) case to ensure all non-zero entries are preallocated</tt><tt><br>
</tt><tt>CALL set_equations(M,N)</tt><tt><br>
</tt><tt><br>
</tt><tt>CALL alloc_matrix(L)</tt><tt><br>
</tt><tt> ! --> Call MatSeqAIJSetPreallocation/MatMPIAIJSetPreallocation</tt><tt><br>
</tt><tt> ! --> Sets MAT_IGNORE_ZERO_ENTRIES, MAT_NEW_NONZERO_ALLOCATION_ERR, MAT_NO_OFF_PROC_ENTRIES to true</tt><tt><br>
</tt><tt><br>
</tt><tt>CALL assemble_matrix(L)</tt><tt><br>
</tt><tt> ! --> Calls MatSetValues with ADD_VALUES</tt><tt><br>
</tt><tt> ! --> Call MatAssemblyBegin/MatAssemblyEnd</tt><tt><br>
</tt><tt><br>
</tt><tt>! Tell PETSc that new non-zero insertions in matrix are forbidden</tt><tt><br>
</tt><tt>CALL MatSetOption(L,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE,ierr)</tt><tt><br>
</tt><tt><br>
</tt><tt>CALL set_mumps_parameters()</tt><tt><br>
</tt><tt><br>
</tt><tt>! Get symbolic LU factorization using MUMPS</tt><tt><br>
</tt><tt>CALL MatGetFactor(L,MATSOLVERMUMPS,MAT_FACTOR_LU,F,ierr)</tt><tt><br>
</tt><tt>CALL MatGetOrdering(L,MATORDERINGNATURAL,rperm,cperm,ierr)</tt><tt><br>
</tt><tt>CALL MatLUFactorSymbolic(F,L,rperm,cperm,info,ierr)</tt><tt><br>
</tt><tt><br>
</tt><tt>CALL initialize_right_hand_sides()</tt><tt><br>
</tt><tt><br>
</tt><tt>! Zero matrix entries</tt><tt><br>
</tt><tt>CALL MatZeroEntries(L,ierr)</tt><tt><br>
</tt><tt><br>
</tt><tt>! Main loop</tt><tt><br>
</tt><tt>DO itr=1, maxitr</tt><tt><br>
</tt><tt><br>
</tt><tt> DO m = 1, M</tt><tt><br>
</tt><tt> DO n = 1, N</tt><tt><br>
</tt><tt><br>
</tt><tt> CALL set_equations(m,n)</tt><tt><br>
</tt><tt> CALL assemble_matrix(L) ! ERROR HERE when m=1, n=1, CRASH IN MatSetValues call</tt><tt><br>
</tt><tt><br>
</tt><tt> ! Solving the linear system associated with (m,n)</tt><tt><br>
</tt><tt> CALL MatLUFactorNumeric(F,L,info,ierr)</tt><tt><br>
</tt><tt> CALL MatSolve(F,v_rhs(m,n),v_sol(m,n),ierr) </tt><tt><br>
</tt><tt><br>
</tt><tt> ! Process v_rhs's from v_sol's for next iteration</tt><tt><br>
</tt><tt><br>
</tt><tt> CALL MatZeroEntries(L,ierr)</tt><tt><br>
</tt><tt><br>
</tt><tt> END DO</tt><tt><br>
</tt><tt> END DO</tt><tt><br>
</tt><tt><br>
</tt><tt>END DO</tt></p>
<p><tt><br>
</tt></p>
<p>Testing on a small case, the error I get is</p>
<tt>[1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------</tt><tt><br>
</tt><tt>[1]PETSC ERROR: Argument out of range</tt><tt><br>
</tt><tt>[1]PETSC ERROR: Inserting a new nonzero at global row/column (200, 160) into matrix</tt><tt><br>
</tt><tt>[1]PETSC ERROR: See <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html" target="_blank">
https://www.mcs.anl.gov/petsc/documentation/faq.html</a> for trouble shooting.</tt><tt><br>
</tt><tt>[1]PETSC ERROR: Petsc Release Version 3.12.0, unknown </tt><tt><br>
</tt><tt>[1]PETSC ERROR: Configure options --PETSC_ARCH=cplx_gcc_debug --with-scalar-type=complex --with-precision=double --with-debugging=1 --with-valgrind=1 --with-debugger=gdb --with-fortran-kernels=1 --download-mpich --download-fblaslapack --download-scalapack
--download-metis --download-parmetis --download-ptscotch --download-mumps --download-slepc --COPTFLAGS="-O0 -g" --CXXOPTFLAGS="-O0 -g" --FOPTFLAGS="-O0 -g -fbacktrace"</tt><tt><br>
</tt><tt>[1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 634 in /home/thibaut/Packages/petsc/src/mat/impls/aij/mpi/mpiaij.c</tt><tt><br>
</tt><tt>[1]PETSC ERROR: #2 MatSetValues() line 1375 in /home/thibaut/Packages/petsc/src/mat/interface/matrix.c</tt><tt><br>
</tt><tt>[1]PETSC ERROR: #3 User provided function() line 0 in User file</tt><tt><br>
</tt><tt>application called MPI_Abort(MPI_COMM_SELF, 63) - process 0</tt><tt><br>
</tt>
<p> <br>
</p>
<p>which I don't understand. That element was not in the non-zero structure and wasn't preallocated. I printed the value to be inserted at this location (200,160) and it is exactly (0.0000000000000000,0.0000000000000000) so this entry should not be inserted
due to MAT_IGNORE_ZERO_ENTRIES, however it seems it is. I'm using ADD_VALUES in MatSetValues but it is the only call where (200,160) is inserted.<br>
</p>
<p><br>
</p>
<p> - I zero the matrix entries with MatZeroEntries which retains the non-zero structure (checked when I print the matrix) but tried to comment the corresponding calls.<br>
</p>
<p> - I tried to set MAT_NEW_NONZERO_LOCATION_ERR AND MAT_NEW_NONZERO_ALLOCATION_ERR to PETSC_FALSE without effect.</p>
<p><br>
</p>
<p>Perhaps there's something fundamentally wrong in my approach, in any case would you have any suggestions to identify the exact problem?
<br>
</p>
<p>Using PETSc 3.12.0. Thanks for your support,</p>
<p><br>
</p>
<p>Thibaut<br>
</p>
</div>
</blockquote>
</div>
</div>
</body>
</html>