[petsc-users] 'Inserting a new nonzero' issue on a reassembled matrix in parallel

Thibaut Appel t.appel17 at imperial.ac.uk
Mon Oct 21 11:41:39 CDT 2019


Dear PETSc developers,

I'm extending the validated matrix preallocation/assembly part of my code
to solve multiple linear systems with MUMPS at each iteration of a main
loop, following the example src/mat/examples/tests/ex28.c that Hong
Zhang added a few weeks ago. The difference is that I'm using just one
matrix to solve the different systems.

I'm trying to investigate a nasty bug that arises when I assemble that
MPIAIJ matrix "for a second time". The issue occurs only in parallel;
the serial case works fine.

I create one MPIAIJ matrix and preallocate it "perfectly", using the
(M,N) case that has the fewest zero entries in the non-zero structure,
before getting its symbolic factorization.

Further on, in the main loop, I solely change its entries, *retaining
the non-zero structure*.

Here is the simplified Fortran code I'm using:

! Fill (M,N) case to ensure all non-zero entries are preallocated
CALL set_equations(M,N)

CALL alloc_matrix(L)
   ! --> Calls MatSeqAIJSetPreallocation/MatMPIAIJSetPreallocation
   ! --> Sets MAT_IGNORE_ZERO_ENTRIES, MAT_NEW_NONZERO_ALLOCATION_ERR,
   !     MAT_NO_OFF_PROC_ENTRIES to true

CALL assemble_matrix(L)
   ! --> Calls MatSetValues with ADD_VALUES
   ! --> Calls MatAssemblyBegin/MatAssemblyEnd

! Tell PETSc that new non-zero insertions in the matrix are forbidden
CALL MatSetOption(L,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE,ierr)

CALL set_mumps_parameters()

! Get symbolic LU factorization using MUMPS
CALL MatGetFactor(L,MATSOLVERMUMPS,MAT_FACTOR_LU,F,ierr)
CALL MatGetOrdering(L,MATORDERINGNATURAL,rperm,cperm,ierr)
CALL MatLUFactorSymbolic(F,L,rperm,cperm,info,ierr)

CALL initialize_right_hand_sides()

! Zero matrix entries
CALL MatZeroEntries(L,ierr)

! Main loop
DO itr = 1, maxitr

   DO m = 1, M
      DO n = 1, N

         CALL set_equations(m,n)
         CALL assemble_matrix(L) ! ERROR HERE when m=1, n=1: crash in the MatSetValues call

         ! Solve the linear system associated with (m,n)
         CALL MatLUFactorNumeric(F,L,info,ierr)
         CALL MatSolve(F,v_rhs(m,n),v_sol(m,n),ierr)

         ! Process v_rhs's from v_sol's for the next iteration

         CALL MatZeroEntries(L,ierr)

      END DO
   END DO

END DO
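
For reference, here is roughly what alloc_matrix and assemble_matrix do
(a simplified sketch: ndof, nloc, maxnnz, d_nnz/o_nnz and the per-row
column/value computation are placeholders for what my code actually
computes):

#include <petsc/finclude/petscmat.h>
USE petscmat
Mat            :: L
PetscInt       :: i, istart, iend, ncols, ione
PetscInt       :: row(1), cols(maxnnz), d_nnz(nloc), o_nnz(nloc)
PetscScalar    :: vals(maxnnz)
PetscErrorCode :: ierr

! alloc_matrix: exact preallocation taken from the densest (M,N) case
CALL MatCreate(PETSC_COMM_WORLD,L,ierr)
CALL MatSetType(L,MATAIJ,ierr)
CALL MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,ndof,ndof,ierr)
CALL MatSeqAIJSetPreallocation(L,PETSC_DEFAULT_INTEGER,d_nnz,ierr)
CALL MatMPIAIJSetPreallocation(L,PETSC_DEFAULT_INTEGER,d_nnz, &
                               PETSC_DEFAULT_INTEGER,o_nnz,ierr)
CALL MatSetOption(L,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE,ierr)
CALL MatSetOption(L,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE,ierr)
CALL MatSetOption(L,MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE,ierr)

! assemble_matrix: one MatSetValues call per locally owned row
ione = 1
CALL MatGetOwnershipRange(L,istart,iend,ierr)
DO i = istart, iend-1
   row(1) = i
   ! ... fill ncols, cols(1:ncols), vals(1:ncols) for global row i ...
   CALL MatSetValues(L,ione,row,ncols,cols,vals,ADD_VALUES,ierr)
END DO
CALL MatAssemblyBegin(L,MAT_FINAL_ASSEMBLY,ierr)
CALL MatAssemblyEnd(L,MAT_FINAL_ASSEMBLY,ierr)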


Testing on a small case, I get the following error:

[1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[1]PETSC ERROR: Argument out of range
[1]PETSC ERROR: Inserting a new nonzero at global row/column (200, 160) into matrix
[1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[1]PETSC ERROR: Petsc Release Version 3.12.0, unknown
[1]PETSC ERROR: Configure options --PETSC_ARCH=cplx_gcc_debug --with-scalar-type=complex --with-precision=double --with-debugging=1 --with-valgrind=1 --with-debugger=gdb --with-fortran-kernels=1 --download-mpich --download-fblaslapack --download-scalapack --download-metis --download-parmetis --download-ptscotch --download-mumps --download-slepc --COPTFLAGS="-O0 -g" --CXXOPTFLAGS="-O0 -g" --FOPTFLAGS="-O0 -g -fbacktrace"
[1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 634 in /home/thibaut/Packages/petsc/src/mat/impls/aij/mpi/mpiaij.c
[1]PETSC ERROR: #2 MatSetValues() line 1375 in /home/thibaut/Packages/petsc/src/mat/interface/matrix.c
[1]PETSC ERROR: #3 User provided function() line 0 in User file
application called MPI_Abort(MPI_COMM_SELF, 63) - process 0


which I don't understand: that element was not in the non-zero structure
and was not preallocated. I printed the value to be inserted at location
(200,160) and it is exactly (0.0000000000000000,0.0000000000000000), so
this entry should not be inserted at all because of
MAT_IGNORE_ZERO_ENTRIES; yet it seems it is. I'm using ADD_VALUES in
MatSetValues, but that is the only call where (200,160) is inserted.
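
For what it's worth, a brute-force workaround on my side would be to
filter the exact zeros out of each row myself before calling
MatSetValues, instead of relying on MAT_IGNORE_ZERO_ENTRIES, along these
lines (a sketch, reusing the placeholder names from the assembly sketch
above):

PetscInt :: j, nkeep

! Compact each row in place, keeping only the entries that are exactly
! non-zero, so that MatSetValues never sees a non-preallocated location
nkeep = 0
DO j = 1, ncols
   IF (vals(j) /= 0.0) THEN
      nkeep = nkeep + 1
      cols(nkeep) = cols(j)
      vals(nkeep) = vals(j)
   END IF
END DO
IF (nkeep > 0) THEN
   CALL MatSetValues(L,ione,row,nkeep,cols,vals,ADD_VALUES,ierr)
END IF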


     - I zero the matrix entries with MatZeroEntries, which retains the
non-zero structure (checked by printing the matrix), but I also tried
commenting out the corresponding calls.

     - I tried setting MAT_NEW_NONZERO_LOCATION_ERR and
MAT_NEW_NONZERO_ALLOCATION_ERR to PETSC_FALSE, without effect (see
below).
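
Concretely, that second attempt was:

CALL MatSetOption(L,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr)
CALL MatSetOption(L,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE,ierr)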


Perhaps there's something fundamentally wrong in my approach; in any
case, would you have any suggestions to identify the exact problem?

Using PETSc 3.12.0. Thanks for your support,


Thibaut
