<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi Barry,</p>
<p>I was referring to:</p>
<font size="2"><span style="font-size:11pt;">row 0: (0, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 1: (1, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 2: (2, 1.)</span></font><br>
<font size="2"><span style="font-size:11pt;">row 3: (3, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 4: (4, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 5: (5, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 6: (6, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 7: (7, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 8: (8, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;">
row 9: (9, 1.) </span></font><br>
<font size="2"><span style="font-size:11pt;"></span></font><font
size="2"><span style="font-size:11pt;"><b>row: 0 col: 9 val:
0.000000000000000000E+00 0.000000000000000000E+00</b></span></font>
<p><font size="2"><span style="font-size:11pt;"><b><br>
</b></span></font></p>
<font size="2"><span style="font-size:11pt;">The last line. But I
was probably mistaken - if it was inserted it would have been </span></font><br>
<p><font size="2"><span style="font-size:11pt;"><b><font size="2"><span
style="font-size:11pt;">row 0: (0, 1.), (9, 0.)</span></font></b></span></font></p>
<p><font size="2"><span style="font-size:11pt;"><font size="2"><span
style="font-size:11pt;">on the first line instead?</span></font></span></font></p>
<p><font size="2"><span style="font-size:11pt;"><font size="2"><span
style="font-size:11pt;"></span></font><font size="2"><span
style="font-size:11pt;">Thibaut</span></font><b><font
size="2"><span style="font-size:11pt;"><br>
</span></font></b></span></font></p>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 25/10/2019 14:41, Smith, Barry F.
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:1AA202EC-31BD-49A1-8AA1-2E0382D3D06D@mcs.anl.gov">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<div class="BodyFragment"><font size="2"><span
style="font-size:11pt;">
<div class="PlainText"><br>
<br>
> On Oct 24, 2019, at 5:09 AM, Thibaut Appel
<a class="moz-txt-link-rfc2396E" href="mailto:t.appel17@imperial.ac.uk"><t.appel17@imperial.ac.uk></a> wrote:<br>
> <br>
> Hi Matthew,<br>
> <br>
> Thanks for having a look, your example runs just like
mine in Fortran.<br>
> <br>
> In serial, the value (0.0,0.0) was inserted whereas
it shouldn't have.<br>
<br>
I'm sorry, I don't see this for the serial case: <br>
<br>
$ petscmpiexec -n 1 ./ex241f<br>
Mat Object: 1 MPI processes<br>
type: seqaij<br>
row 0: (0, 2.) <br>
row 1: (1, 2.) <br>
row 2: (2, 2.) <br>
row 3: (3, 2.) <br>
row 4: (4, 2.) <br>
row 5: (5, 2.) <br>
row 6: (6, 2.) <br>
row 7: (7, 2.) <br>
row 8: (8, 2.) <br>
row 9: (9, 2.) <br>
<br>
> <br>
Where is the "(0.0,0.0) was inserted whereas it shouldn't
have."? <br>
<br>
<br>
Barry<br>
<br>
</div>
</span></font></div>
<div class="BodyFragment"><font size="2"><span
style="font-size:11pt;">
<div class="PlainText"><br>
> In parallel, you'll see that an error "Inserting a
new nonzero at global row/column" is triggered.<br>
> <br>
> In both cases, MatSetValues tries to insert a zero
value whereas IGNORE_ZERO_ENTRIES was set. That's what
Barry is looking into, if I'm not mistaken.<br>
> <br>
> <br>
> <br>
> Thibaut<br>
> <br>
> <br>
> <br>
> On 24/10/2019 02:31, Matthew Knepley wrote:<br>
>> On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel
<a class="moz-txt-link-rfc2396E" href="mailto:t.appel17@imperial.ac.uk"><t.appel17@imperial.ac.uk></a> wrote:<br>
>> Hi both,<br>
>> <br>
>> Please find attached a tiny example (in Fortran,
sorry Matthew) that - I think - reproduces the problem we
mentioned.<br>
>> <br>
>> Let me know.<br>
>> <br>
>> Okay, I converted to C so I could understand, and
it runs fine for me:<br>
>> <br>
>> master *:~/Downloads/tmp$
PETSC_ARCH=arch-master-complex-debug make main<br>
>> /PETSc3/petsc/bin/mpicc -o main.o -c -Wall
-Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas
-fstack-protector -Qunused-arguments -fvisibility=hidden
-g3 -I/PETSc3/petsc/petsc-dev/include
-I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include
-I/PETSc3/petsc/include -I/opt/X11/include `pwd`/main.c<br>
>> /PETSc3/petsc/bin/mpicc
-Wl,-multiply_defined,suppress -Wl,-multiply_defined
-Wl,suppress -Wl,-commons,use_dylibs
-Wl,-search_paths_first -Wl,-no_compact_unwind -Wall
-Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas
-fstack-protector -Qunused-arguments -fvisibility=hidden
-g3 -o main main.o
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib
-Wl,-rpath,/opt/X11/lib -L/opt/X11/lib
-Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0
-L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0
-Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc
-lfftw3_mpi -lfftw3 -llapack -lblas -lhdf5hl_fortran
-lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lparmetis -lmetis
-ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl -lmpichf90
-lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran
-lgcc_s.10.5 -lstdc++ -ldl<br>
>> master *:~/Downloads/tmp$ ./main<br>
>> After first assembly:<br>
>> Mat Object: 1 MPI processes<br>
>> type: seqaij<br>
>> row 0: (0, 1. + 1. i)<br>
>> row 1: (1, 1. + 1. i)<br>
>> row 2: (2, 1. + 1. i)<br>
>> row 3: (3, 1. + 1. i)<br>
>> row 4: (4, 1. + 1. i)<br>
>> row 5: (5, 1. + 1. i)<br>
>> row 6: (6, 1. + 1. i)<br>
>> row 7: (7, 1. + 1. i)<br>
>> row 8: (8, 1. + 1. i)<br>
>> row 9: (9, 1. + 1. i)<br>
>> After second assembly:<br>
>> Mat Object: 1 MPI processes<br>
>> type: seqaij<br>
>> row 0: (0, 1. + 1. i)<br>
>> row 1: (1, 1. + 1. i)<br>
>> row 2: (2, 1. + 1. i)<br>
>> row 3: (3, 1. + 1. i)<br>
>> row 4: (4, 1. + 1. i)<br>
>> row 5: (5, 1. + 1. i)<br>
>> row 6: (6, 1. + 1. i)<br>
>> row 7: (7, 1. + 1. i)<br>
>> row 8: (8, 1. + 1. i)<br>
>> row 9: (9, 1. + 1. i)<br>
>> row 0 col: 9 val: 0. + 0. i<br>
>> <br>
>> I attach my code. So then I ran your Fortran:<br>
>> <br>
>> /PETSc3/petsc/bin/mpif90 -c -Wall
-ffree-line-length-0 -Wno-unused-dummy-argument
-Wno-unused-variable -g
-I/PETSc3/petsc/petsc-dev/include
-I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include
-I/PETSc3/petsc/include -I/opt/X11/include -o main2.o
main2.F90<br>
>> /PETSc3/petsc/bin/mpif90
-Wl,-multiply_defined,suppress -Wl,-multiply_defined
-Wl,suppress -Wl,-commons,use_dylibs
-Wl,-search_paths_first -Wl,-no_compact_unwind -Wall
-ffree-line-length-0 -Wno-unused-dummy-argument
-Wno-unused-variable -g -o main2 main2.o
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib
-Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib
-Wl,-rpath,/opt/X11/lib -L/opt/X11/lib
-Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0
-L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0
-Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc
-lfftw3_mpi -lfftw3 -llapack -lblas -lhdf5hl_fortran
-lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lparmetis -lmetis
-ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl -lmpichf90
-lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran
-lgcc_s.10.5 -lstdc++ -ldl<br>
>> master *:~/Downloads/tmp$ ./main2<br>
>> After first assembly:<br>
>> Mat Object: 1 MPI processes<br>
>> type: seqaij<br>
>> row 0: (0, 1.) <br>
>> row 1: (1, 1.) <br>
>> row 2: (2, 1.) <br>
>> row 3: (3, 1.) <br>
>> row 4: (4, 1.) <br>
>> row 5: (5, 1.) <br>
>> row 6: (6, 1.) <br>
>> row 7: (7, 1.) <br>
>> row 8: (8, 1.) <br>
>> row 9: (9, 1.) <br>
>> After second assembly:<br>
>> Mat Object: 1 MPI processes<br>
>> type: seqaij<br>
>> row 0: (0, 1.) <br>
>> row 1: (1, 1.) <br>
>> row 2: (2, 1.) <br>
>> row 3: (3, 1.) <br>
>> row 4: (4, 1.) <br>
>> row 5: (5, 1.) <br>
>> row 6: (6, 1.) <br>
>> row 7: (7, 1.) <br>
>> row 8: (8, 1.) <br>
>> row 9: (9, 1.) <br>
>> row: 0 col: 9 val: 0.000000000000000000E+00
0.000000000000000000E+00<br>
>> <br>
>> I am not seeing an error. Am I not running it
correctly?<br>
>> <br>
>> Thanks,<br>
>> <br>
>> MAtt <br>
>> Thibaut<br>
>> <br>
>> <br>
>> <br>
>> On 22/10/2019 17:48, Matthew Knepley wrote:<br>
>>> On Tue, Oct 22, 2019 at 12:43 PM Thibaut
Appel via petsc-users <a class="moz-txt-link-rfc2396E" href="mailto:petsc-users@mcs.anl.gov"><petsc-users@mcs.anl.gov></a>
wrote:<br>
>>> Hi Hong,<br>
>>> <br>
>>> Thank you for having a look, I copied/pasted
your code snippet into ex28.c and the error indeed appears
if you change that col[0]. That's because you did not
allow a new non-zero location in the matrix with the
option MAT_NEW_NONZERO_LOCATION_ERR.<br>
>>> <br>
>>> I spent the day debugging the code and
already checked my calls to MatSetValues:<br>
>>> <br>
>>> For all MatSetValues calls corresponding to
the row/col location in the error messages in the
subsequent assembly, the numerical value associated with
that row/col was exactly (0.0,0.0) (complex arithmetic) so
it shouldn't be inserted w.r.t. the option
MAT_IGNORE_ZERO_ENTRIES. It seems MatSetValues still did
it anyway.<br>
>>> <br>
>>> Okay, lets solve this problem first. You say
that<br>
>>> <br>
>>> - You called MatSetOption(A,
MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE)<br>
>>> - You called MatSetValues(A, ,,,,
ADD_VALUES, ..., val) and val had a complex 0 in it<br>
>>> - PETSc tried to insert the complex 0<br>
>>> <br>
>>> This should be really easy to test in a tiny
example. Do you mind making it? If its broken, I will fix
it.<br>
>>> <br>
>>> Thanks,<br>
>>> <br>
>>> Matt<br>
>>> I was able to solve the problem by adding
MatSetOption(L,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE)
after my first assembly. However I don't know why it fixed
it as the manual seems to say this is just for efficiency
purposes<br>
>>> <br>
>>> It "should be specified after the first
matrix has been fully assembled. This option ensures that
certain data structures and communication information will
be reused (instead of regenerated during successive steps,
thereby increasing efficiency"<br>
>>> <br>
>>> So I'm still puzzled by why I got that error
in the first place. Unless "regenerated" implies resetting
some attributes of the preallocated non-zero structure /
first assembly?<br>
>>> <br>
>>> <br>
>>> <br>
>>> Thibaut<br>
>>> <br>
>>> <br>
>>> <br>
>>> On 22/10/2019 17:06, Zhang, Hong wrote:<br>
>>>> Thibaut:<br>
>>>> Check your code on MatSetValues(), likely
you set a value "to a new nonzero at global row/column
(200, 160) into matrix" L.<br>
>>>> I tested
petsc/src/mat/examples/tests/ex28.c by adding <br>
>>>> @@ -95,6 +95,26 @@ int main(int argc,char
**args)<br>
>>>> /* Compute numeric factors using same
F, then solve */<br>
>>>> for (k = 0; k < num_numfac; k++) {<br>
>>>> /* Get numeric factor of A[k] */<br>
>>>> + if (k==0) {<br>
>>>> + ierr =
MatZeroEntries(A[0]);CHKERRQ(ierr);<br>
>>>> + for (i=rstart; i<rend; i++) {<br>
>>>> + col[0] = i-1; col[1] = i; col[2]
= i+1;<br>
>>>> + if (i == 0) {<br>
>>>> + ierr =
MatSetValues(A[k],1,&i,2,col+1,value+1,INSERT_VALUES);CHKERRQ(ierr);<br>
>>>> + } else if (i == N-1) {<br>
>>>> + ierr =
MatSetValues(A[k],1,&i,2,col,value,INSERT_VALUES);CHKERRQ(ierr);<br>
>>>> + } else {<br>
>>>> + ierr =
MatSetValues(A[k],1,&i,3,col,value,INSERT_VALUES);CHKERRQ(ierr);<br>
>>>> + }<br>
>>>> + }<br>
>>>> + if (!rank) {<br>
>>>> + i = N - 1; col[0] = N - 1;<br>
>>>> + ierr =
MatSetValues(A[k],1,&i,1,col,value,INSERT_VALUES);CHKERRQ(ierr);<br>
>>>> + }<br>
>>>> + ierr =
MatAssemblyBegin(A[k],MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);<br>
>>>> + ierr =
MatAssemblyEnd(A[k],MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);<br>
>>>> + }<br>
>>>> +<br>
>>>> <br>
>>>> It works in both sequential and parallel.
If I set col[0] = 0, then I get the same error as yours.<br>
>>>> Hong<br>
>>>> Dear PETSc developers,<br>
>>>> <br>
>>>> I'm extending a validated matrix
preallocation/assembly part of my code to solve multiple
linear systems with MUMPS at each iteration of a main
loop, following the example src/mat/examples/tests/ex28.c
that Hong Zhang added a few weeks ago. The difference is
that I'm using just 1 matrix to solve different systems.<br>
>>>> <br>
>>>> I'm trying to investigate a nasty bug
arising when I try to assemble "for a second time" that
MPIAIJ matrix. The issue arises only in parallel, serial
works fine.<br>
>>>> <br>
>>>> Creating 1 MPIAIJ matrix, preallocating
it "perfectly" with the case where I have the fewest zero
entries in the non-zero structure, before getting its
symbolic factorization.<br>
>>>> <br>
>>>> Further in the main loop, I'm solely
changing its entries retaining the non-zero structure.<br>
>>>> <br>
>>>> Here is the simplified Fortran code I'm
using:<br>
>>>> <br>
>>>> ! Fill (M,N) case to ensure all non-zero
entries are preallocated<br>
>>>> CALL set_equations(M,N)<br>
>>>> <br>
>>>> CALL alloc_matrix(L)<br>
>>>> ! --> Call
MatSeqAIJSetPreallocation/MatMPIAIJSetPreallocation<br>
>>>> ! --> Sets MAT_IGNORE_ZERO_ENTRIES,
MAT_NEW_NONZERO_ALLOCATION_ERR, MAT_NO_OFF_PROC_ENTRIES to
true<br>
>>>> <br>
>>>> CALL assemble_matrix(L)<br>
>>>> ! --> Calls MatSetValues with
ADD_VALUES<br>
>>>> ! --> Call
MatAssemblyBegin/MatAssemblyEnd<br>
>>>> <br>
>>>> ! Tell PETSc that new non-zero insertions
in matrix are forbidden<br>
>>>> CALL
MatSetOption(L,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE,ierr)<br>
>>>> <br>
>>>> CALL set_mumps_parameters()<br>
>>>> <br>
>>>> ! Get symbolic LU factorization using
MUMPS<br>
>>>> CALL
MatGetFactor(L,MATSOLVERMUMPS,MAT_FACTOR_LU,F,ierr)<br>
>>>> CALL
MatGetOrdering(L,MATORDERINGNATURAL,rperm,cperm,ierr)<br>
>>>> CALL
MatLUFactorSymbolic(F,L,rperm,cperm,info,ierr)<br>
>>>> <br>
>>>> CALL initialize_right_hand_sides()<br>
>>>> <br>
>>>> ! Zero matrix entries<br>
>>>> CALL MatZeroEntries(L,ierr)<br>
>>>> <br>
>>>> ! Main loop<br>
>>>> DO itr=1, maxitr<br>
>>>> <br>
>>>> DO m = 1, M<br>
>>>> DO n = 1, N<br>
>>>> <br>
>>>> CALL set_equations(m,n)<br>
>>>> CALL assemble_matrix(L) ! ERROR HERE
when m=1, n=1, CRASH IN MatSetValues call<br>
>>>> <br>
>>>> ! Solving the linear system
associated with (m,n)<br>
>>>> CALL
MatLUFactorNumeric(F,L,info,ierr)<br>
>>>> CALL
MatSolve(F,v_rhs(m,n),v_sol(m,n),ierr) <br>
>>>> <br>
>>>> ! Process v_rhs's from v_sol's for
next iteration<br>
>>>> <br>
>>>> CALL MatZeroEntries(L,ierr)<br>
>>>> <br>
>>>> END DO<br>
>>>> END DO<br>
>>>> <br>
>>>> END DO<br>
>>>> <br>
>>>> <br>
>>>> <br>
>>>> Testing on a small case, the error I get
is<br>
>>>> <br>
>>>> [1]PETSC ERROR: ---------------------
Error Message
--------------------------------------------------------------<br>
>>>> [1]PETSC ERROR: Argument out of range<br>
>>>> [1]PETSC ERROR: Inserting a new nonzero
at global row/column (200, 160) into matrix<br>
>>>> [1]PETSC ERROR: See <a
href="https://www.mcs.anl.gov/petsc/documentation/faq.html"
moz-do-not-send="true">
https://www.mcs.anl.gov/petsc/documentation/faq.html</a>
for trouble shooting.<br>
>>>> [1]PETSC ERROR: Petsc Release Version
3.12.0, unknown <br>
>>>> [1]PETSC ERROR: Configure options
--PETSC_ARCH=cplx_gcc_debug --with-scalar-type=complex
--with-precision=double --with-debugging=1
--with-valgrind=1 --with-debugger=gdb
--with-fortran-kernels=1 --download-mpich
--download-fblaslapack --download-scalapack
--download-metis --download-parmetis --download-ptscotch
--download-mumps --download-slepc --COPTFLAGS="-O0 -g"
--CXXOPTFLAGS="-O0 -g" --FOPTFLAGS="-O0 -g -fbacktrace"<br>
>>>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ()
line 634 in
/home/thibaut/Packages/petsc/src/mat/impls/aij/mpi/mpiaij.c<br>
>>>> [1]PETSC ERROR: #2 MatSetValues() line
1375 in
/home/thibaut/Packages/petsc/src/mat/interface/matrix.c<br>
>>>> [1]PETSC ERROR: #3 User provided
function() line 0 in User file<br>
>>>> application called
MPI_Abort(MPI_COMM_SELF, 63) - process 0<br>
>>>> <br>
>>>> <br>
>>>> which I don't understand. That element
was not in the non-zero structure and wasn't preallocated.
I printed the value to be inserted at this location
(200,160) and it is exactly
(0.0000000000000000,0.0000000000000000) so this entry
should not be inserted due to MAT_IGNORE_ZERO_ENTRIES,
however it seems it is. I'm using ADD_VALUES in
MatSetValues but it is the only call where (200,160) is
inserted.<br>
>>>> <br>
>>>> <br>
>>>> <br>
>>>> - I zero the matrix entries with
MatZeroEntries which retains the non-zero structure
(checked when I print the matrix) but tried to comment the
corresponding calls.<br>
>>>> <br>
>>>> - I tried to set
MAT_NEW_NONZERO_LOCATION_ERR AND
MAT_NEW_NONZERO_ALLOCATION_ERR to PETSC_FALSE without
effect.<br>
>>>> <br>
>>>> <br>
>>>> <br>
>>>> Perhaps there's something fundamentally
wrong in my approach, in any case would you have any
suggestions to identify the exact problem?
<br>
>>>> <br>
>>>> Using PETSc 3.12.0. Thanks for your
support,<br>
>>>> <br>
>>>> <br>
>>>> <br>
>>>> Thibaut<br>
>>>> <br>
>>> <br>
>>> <br>
>>> -- <br>
>>> What most experimenters take for granted
before they begin their experiments is infinitely more
interesting than any results to which their experiments
lead.<br>
>>> -- Norbert Wiener<br>
>>> <br>
>>> <a
href="https://www.cse.buffalo.edu/~knepley/"
moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
>> <br>
>> <br>
>> -- <br>
>> What most experimenters take for granted before
they begin their experiments is infinitely more
interesting than any results to which their experiments
lead.<br>
>> -- Norbert Wiener<br>
>> <br>
>> <a href="https://www.cse.buffalo.edu/~knepley/"
moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
<br>
</div>
</span></font></div>
</blockquote>
</body>
</html>