[petsc-users] Various Questions Regarding PETSC

Mohammed Mostafa mo7ammedmostafa at gmail.com
Fri Jul 19 01:34:30 CDT 2019


>
> 1)  Times
>      a) You run the code that generates the matrix entries and simply
> comment out the calls to MatSetValues() and it takes
>           .16747 seconds.

Yes

>
>       b) When you include the MatSetValue calls it takes 3.8820.
>
Yes

>       c)  When you instead insert into an eigen matrix (which is also
> parallel?)
>
No, the Eigen matrix is just a local matrix on each process and it doesn't
support MPI (at least the way I use it). I only used it to establish a
baseline for the insertion cost into a CSR matrix (since I don't set
off-processor rows anyway).
In both cases the insertion is purely local until MatAssembly, where the
communication and inter-processor copy should happen.
After I insert into Eigen, I copy the Eigen matrix into the PETSc matrix with
MatUpdateMPIAIJWithArrays(), which costs me an additional 0.20326 secs; that
PETSc matrix is what I intend to use for the linear system solution with
KSP/PC.
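
In case it is useful, the copy step looks roughly like this (just a sketch; I
am assuming PetscInt is a 32-bit int matching Eigen's default StorageIndex,
PetscScalar is double, the Eigen matrix is row-major/CSR with global column
indices, and A here is the Mat itself rather than a pointer to it):

    // Sketch of the Eigen -> PETSc copy, under the assumptions listed above.
    EigenMat.makeCompressed();
    const PetscInt    *rowptr = EigenMat.outerIndexPtr();  // CSR row pointers, local_size+1 entries
    const PetscInt    *colidx = EigenMat.innerIndexPtr();  // global column indices
    const PetscScalar *vals   = EigenMat.valuePtr();       // nonzero values

    // A was created once with MatCreateMPIAIJWithArrays(); here only the values
    // (same nonzero pattern) are pushed into it.
    MatUpdateMPIAIJWithArrays(A, local_size, local_size, PETSC_DETERMINE,
                              PETSC_DETERMINE, rowptr, colidx, vals);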

>           it takes 2.8727 secs.
>       d)  When you in your code put all the values into a CSR matrix and
> then call MatUpdateMPIAIJWithArrays() it takes
>            2.8727 secs + .20326 secs.
>
Yes

>       e) When you modify MatSetValues_MPIAIJ into MatSetValues_MPIAIJ2()
> that allows multiple rows that do not need matching
>            columns it takes 2.6170 secs.

Yes
> 1e) When you use MatSetValues_MPIAIJ2() are you directly calling
> MatSetValues_MPIAIJ2() or are you still calling MatSetValues() and
> having it call MatSetValues_MPIAIJ2()? That is, did you replace the
> setvalues() pointer in the function table?

I am calling MatSetValues2_MPIAIJ() directly, not MatSetValues(), so I did
not replace the setvalues() pointer in the function table. I just wanted to
confirm the behavior first and see whether I get any performance gain.
I wanted to be as minimally intrusive to the code as possible right now, to
avoid causing problems elsewhere, since I have just started learning how the
code works.

> 1*) For the cases b, c, and e are you calling MatSetValues*() multiple times
> for the same row or exactly once for a given row?

I am calling MatSetValues() exactly once per row, and each time I set 4
nonzero values (in the future I may set as many as 7 non-zero values per
row). With MatSetValues2_MPIAIJ(), however, I call it once every nrow_buffer
rows. The call looks like this:

PetscLogEventBegin(FillPetscMatEvent, 0, 0, 0, 0);
auto mat = &PhiEqnSolver.mat;
int buffer_size = 100, n_inner, Rows_left, nj_per_row_i;
int n_outer = int(ceil(double(nCells) / double(buffer_size)));
cell = 0;
Rows_left = nCells;
for (int i_outer = 0; i_outer < n_outer; i_outer++) {
    n_inner = min(buffer_size, Rows_left);   // number of rows packed into this buffer
    k = 0;                                   // running index into j_index / coefvalues
    for (i = 0; i < n_inner; i++) {
        cell_gindex = cellGIndex[cell];
        i_index[i] = cell_gindex;
        // set the system RHS = f * vol
        Eqn_rhs_ptr[cell] += cellVol[cell] * rhsFieldptr[cell];

        istart = cellIFace[cell];
        iend   = cellIFace[cell + 1];

        diagCOEF = 0.0;
        nj_per_row_i = 0;
        for (fptr = istart; fptr < iend; fptr++) {
            face  = cellFace[fptr];
            nghbr = faceNeighbor[face];
            COEF  = Coef[face]; // flux coef computed on the faces of each cell in the domain
            if (nghbr < 0) {
                bc_type = faceBCType[face];
                if (bc_type == PROCESSOR) {
                    // processor boundary face: the face tag holds the neighbor's global index
                    neighbor_cell = facePhysTag[face];
                    j_index[k] = neighbor_cell;
                    coefvalues[k] = COEF;
                    diagCOEF -= COEF;
                    k++;
                    nj_per_row_i++;
                } else {
                    // physical boundary face: contributes only to the diagonal
                    diagCOEF -= COEF;
                }
            } else {
                // interior face: local neighbor cell
                owner = faceOwner[face];
                neighbor_cell = owner + nghbr - cell;
                j_index[k] = cellGIndex[neighbor_cell];
                coefvalues[k] = COEF;
                diagCOEF -= COEF;
                k++;
                nj_per_row_i++;
            }
        }
        // diagonal entry goes last in this row
        j_index[k] = cell_gindex;
        coefvalues[k] = diagCOEF;
        k++;
        nj_per_row_i++;
        nj_perRow[i] = nj_per_row_i;
        cell++;
        Rows_left--;
    }
    MatSetValues2_MPIAIJ(*A, n_inner, i_index, nj_perRow, j_index,
                         coefvalues, INSERT_VALUES);
}
PetscLogEventEnd(FillPetscMatEvent, 0, 0, 0, 0);
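
For clarity, the prototype of my modified routine is essentially the
following; it mirrors MatSetValues_MPIAIJ(), except that each of the nrow
rows gets its own column count and the columns/values of all rows are packed
back to back in jcol[] and v[] (the argument names here are just
illustrative):

    PetscErrorCode MatSetValues2_MPIAIJ(Mat mat, PetscInt nrow, const PetscInt irow[],
                                        const PetscInt ncols_per_row[], const PetscInt jcol[],
                                        const PetscScalar v[], InsertMode addv);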


> 2) You have done the initial matrix preallocation with
> MatCreateMPIAIJWithArrays(); and so have perfect preallocation, that is
> when you run with
>     -info and grep for malloc it says 0 mallocs are performed?

Yes, no mallocs and no unused non-zeros; check the attached file
<info_log_petsc_setValues.txt>.

> 3) Are you using ADD_VALUES or INSERT_VALUES?


Here are the times for nrow_buffer = 50.

With INSERT_VALUES:
FillCSTMat_with_MatSetValues     100 1.0 2.5512e+00 1.0 0.00e+00 0.0
0.0e+00 0.0e+00 0.0e+00 16  0  0  0  0  93  0  0  0  0     0

With ADD_VALUES:
FillPetscMat     100 1.0 2.6654e+00 1.0 7.44e+07 1.0 0.0e+00 0.0e+00
0.0e+00 17 53  0  0  0  90100  0  0  0   167

Additionally, with ADD_VALUES I have to zero the matrix first, which costs:
MatZeroEntries       100 1.0 9.5259e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  1  0  0  0  0   3  0  0  0  0     0

INSERT_VALUES is a bit faster than ADD_VALUES; the difference I saw can be
attributed to the cost of the extra computation (the addition).
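
For completeness, the ADD_VALUES variant I timed is basically the following
sketch (same buffers and loops as the code above; only the zeroing and the
insert mode differ, since values are accumulated instead of overwritten):

    MatZeroEntries(*A);   // reset the values, keep the nonzero pattern
    // ... same outer/inner loops filling i_index, nj_perRow, j_index, coefvalues ...
    MatSetValues2_MPIAIJ(*A, n_inner, i_index, nj_perRow, j_index,
                         coefvalues, ADD_VALUES);
    // ...
    MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);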

> 4) In each MatSetValues() call are you setting values into all nonzero
> locations of that row, or only a subset of nonzero locations in that row?
>     Are the column indices for that row you pass in monotonically
> increasing or not?

Yes, I am setting new values for all non-zero locations in that row, not a
subset of them.
No, the column indices are not monotonically increasing; as you can see from
the code above, they come straight from the mesh connectivity.
I suppose I could sort them once, build an ordering map for each row, store
it, and use it to rearrange the values before inserting them (see the sketch
below). But would it really matter that much? From a memory access point of
view, I suppose it could.
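
Something like this minimal sketch is what I have in mind (a hypothetical
helper, not in my code yet; it assumes the sparsity pattern, and hence the
column indices of each row, stays fixed between fills):

    // Build, once per row, the permutation that sorts the column indices;
    // reuse it every fill to reorder the values before MatSetValues*().
    #include <algorithm>
    #include <numeric>
    #include <vector>

    static std::vector<int> build_row_perm(const int *j_index, int nnz) {
        std::vector<int> perm(nnz);
        std::iota(perm.begin(), perm.end(), 0);
        std::sort(perm.begin(), perm.end(),
                  [&](int a, int b) { return j_index[a] < j_index[b]; });
        return perm;   // stored per row, since the pattern does not change
    }

    // Every fill, for one row with nnz entries:
    //   for (int p = 0; p < nnz; p++) {
    //       j_sorted[p]    = j_index[perm[p]];
    //       vals_sorted[p] = coefvalues[perm[p]];
    //   }
    // then pass j_sorted / vals_sorted to MatSetValues*().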

> 5) Did you ./configure with --with-debugging=0?

Yes, these are my configure options:
PETSC_ARCH=release3 --with-debugging=0 COPTFLAGS="-O3 -march=native
-mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3
-march=native -mtune=native" --with-cc=mpicc --with-cxx=mpicxx
--with-fc=mpif90 --download-metis --download-hypre

As for the second message:

> 1) Could you please send MatSetValues2_MPIAIJ()?
>
Yes, of course.
As I said, I made very small changes, and now I am trying to dig deeper into
the code to see if there is more to do.
Should I attach the modified <mpiaij.c and petscmat.h> to this message
thread, or send them to you directly?

> 2) What time do you get if you use MatSetValues2_MPIAIJ() with an
> nrow_buffer = 1 ?

Honestly, I hadn't tried that before.
First I replaced the call in the code above with MatSetValues() and set
nrow_buffer = 1, so it should be equivalent to the earlier case; here is the
time:
FillPetscMat     100 1.0 3.6594e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 22  0  0  0  0  96  0  0  0  0     0
Then I used MatSetValues2_MPIAIJ() with nrow_buffer = 1:
FillPetscMat     100 1.0 3.3920e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 21  0  0  0  0  96  0  0  0  0     0

For some reason, MatSetValues2_MPIAIJ() is a bit faster.
Sorry for the long reply.

Thanks,
Kamra


On Fri, Jul 19, 2019 at 9:48 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>   1) Could you please send MatSetValues2_MPIAIJ()?
>
>    2) What time do you get if you use MatSetValues2_MPIAIJ() with an
> nrow_buffer = 1 ?
>
>    Thanks
>
>    Barry
>
>
> > On Jul 18, 2019, at 2:01 AM, Mohammed Mostafa via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> >
> > Hello everyone,
> > Since I already established a baseline to compare the cost of inserting
> values in PetscMatrix.
> > And based on the hint of the number of values inserted in the matrix
> each time
> >  2) Can you tell me how many values are inserted?
> > I took a look at the source code for "MatSetValues_MPIAIJ" and found
> that it seems to be designed for
> > FEM assembly of the global matrix from element matrices(from what I
> remember from undergrad) since it requires for the setting of multiple rows
> that the col indices should be the same
> >
> > So to increase the number of inserted values I need to modify the
> implementation of  MatSetValues_MPIAIJ to allow for more values to be
> inserted
> > So I made a copy of the function "MatSetValues_MPIAIJ" in
> "src/mat/impls/aij/mpi/mpiaij.c" and named it "MatSetValues2_MPIAIJ"
> > I made some minor changes to allow for inserting multiple rows
> regardless of whether they have the same col indices
> >
> > So what I do now is store the data for multiple rows and then insert
> them all together and I figured I would see how the performance would be.
> > I tried different number of rows to be inserted i.e. nrow_buffer = [2,
> 5, 10, 20, 50, 100]
> > So now instead of calling   "MatSetValues" for every row in the matrix ,
> I call "MatSetValues2_MPIAIJ" every  (nrow_buffer)th which should allow for
> some performance improvement
> > the results are as follows
> > First Remember that before
> > 1-computation and insertion into petsc matrix
> > FillPetscMat_with_MatSetValues              100 1.0 3.8820e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  96  0  0  0  0     0
> > 2-computation and insertion into eigen matrix
> > FilEigenMat                                               100 1.0
> 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  88  0
> 0  0  0     0
> >
> > Now
> > nrow_buffer = 2
> > FillPetscMat_with_MatSetValues2                  100 1.0 3.3321e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 20  0  0  0  0  95  0  0  0  0     0
> > nrow_buffer = 5
> > FillPetscMat_with_MatSetValues2                  100 1.0 2.8842e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  94  0  0  0  0     0
> > nrow_buffer = 10
> > FillPetscMat_with_MatSetValues2                  100 1.0 2.7669e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  93  0  0  0  0     0
> > nrow_buffer = 20
> > FillPetscMat_with_MatSetValues2                  100 1.0 2.6834e+00 1.0
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16  0  0  0  0  93  0  0  0  0     0
> > nrow_buffer = 50
> > FillPetscMat_with_MatSetValues2                  100 1.0 2.6862e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  93  0  0  0  0     0
> > nrow_buffer = 100
> > FillPetscMat_with_MatSetValues2                  100 1.0 2.6170e+00 1.0
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16  0  0  0  0  93  0  0  0  0     0
> >
> > As to be expected, with increasing the number of rows to be inserted the
> overhead reduces until it basically stagnates somewhere between 20-50
> > The modifications I made based on MatSetValues_MPIAIJ are very small but
> the effect is significant ( drop in insertion cost by 33%) and it is now
> even faster than Eigen(baseline) at insertion with my naive usage.
> > For now I am quite satisfied with the outcome. There is probably some
> room for improvement but for now this is enough.
> >
> > Thanks,
> > Kamra
> >
> > On Thu, Jul 18, 2019 at 12:34 AM Mohammed Mostafa <
> mo7ammedmostafa at gmail.com> wrote:
> > Regarding the first point
> > 1) Are you timing only the insertion of values, or computation and
> insertion?
> > I am timing both, the computation and insertion of values but as I said
> I timed three scenarios
> > 1-computation only and no insertion
> > Computation_no_insertion                            100 1.0 1.6747e-01
> 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0  22  0  0  0  0
>  0
> > 2-computation and insertion into petsc matrix
> > FillPetscMat_with_MatSetValues              100 1.0 3.8820e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  96  0  0  0  0     0
> > 3-computation and insertion into eigen matrix
> > FilEigenMat                                               100 1.0
> 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  88  0
> 0  0  0     0
> > I timed 100 times to get a reasonably accurate timings
> >
> > as for the second point
> >  2) Can you tell me how many values are inserted?
> > For a total of nearly 186062 rows per process (with  6 processes in
> total, the matrix global size is 1116376)
> > In most rows ( about 99.35%)  4 non-zeros per rows and in the remaining
> 0.35% 2 or 3 non-zeros per row
> > the number of off-diagonal onnz in total is 648 nnz
> > So I insert nearly 4 values 186062 times ~= 744248 times per mpi process
> >
> >
> > Thanks,
> > Kamra
> >
> > On Wed, Jul 17, 2019 at 11:59 PM Matthew Knepley <knepley at gmail.com>
> wrote:
> > On Wed, Jul 17, 2019 at 8:51 AM Mohammed Mostafa <
> mo7ammedmostafa at gmail.com> wrote:
> > Sorry for the confusion
> > First I fully acknowledge that setting Matrix non-zeros or copying in
> general is not cheap and memory access pattern can play an important role.
> > So to establish a baseline to compare with, I tried setting the same
> matrix but in an Eigen Sparse Matrix  and the timings are as follows
> > FillPetscMat_with_MatSetValues              100 1.0 3.8820e+00 1.1
> 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  96  0  0  0  0     0
> > FilEigenMat                                               100 1.0
> 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  88  0
> 0  0  0     0
> >
> > Great. This helps. Two things would help me narrow down what is
> happening.
> >
> >   1) Are you timing only the insertion of values, or computation and
> insertion?
> >
> >   2) Can you tell me how many values are inserted?
> >
> >   Thanks,
> >
> >     Matt
> >
> > I used the same code but simply filled a different Matrix something like
> >
> > for ( i =0; i < nRows;i++)
> > {
> > //
> > .......
> > // Some code to get  j_index, coefvalues
> > // Method1
> > MatSetValues(A, 1, &cell_global_index, nnz_per_row, j_index, coefvalues,
> INSERT_VALUES);
> >
> > //Method2
> > for ( int k = 0;k < nnz_per_row; k++)
> >      EigenMat.coeffRef(i, j_index[k] ) = coefvalues[k];
> >
> >  }
> > Please note that only one of the two methods is being used at a time.
> Also, I separately time the code section used to <  j_index,   coefvalues>
> but simpling disabling both Method1 and Method2.
> > I found the cost to be trivial in comparison to when either one of the
> methods is used.
> > I used Eigen out of convenience since I used for some vector and tensor
> arithmetics somewhere else in the code and it may not be the best choice.
> > Since in PetscMatrix we technically fill two matrices: diagonal and
> off-diagonal so I expected some difference but is that normal or am I
> missing something. ?
> > Maybe some setting or MatOption I should be using so far this what I
> have been using
> >
> > MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, local_size, local_size,
> PETSC_DETERMINE,
> > PETSC_DETERMINE, ptr, j , v, A);
> >  MatSetOption(A,MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE);
> > MatSetOption(A,MAT_IGNORE_OFF_PROC_ENTRIES,PETSC_TRUE);
> > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);
> > MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE);
> > MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE);
> > MatSetOption(A,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE);
> > MatSetUp(A);
> > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> > MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> >
> > Thanks,
> > Kamra
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
> > https://www.cse.buffalo.edu/~knepley/
>
>
-------------- next part --------------
[0] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 0 Total message length 0
[5] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 0 Total message length 0
[1] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 0 Total message length 0
[2] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 0 Total message length 0
[3] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 0 Total message length 0
[4] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 0 Total message length 0
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777
[3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[4] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[5] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[4] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 930313; storage space: 0 unneeded,649 used
[4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[4] MatCheckCompressedRow(): Found the ratio (num_zerorows 185414)/(num_localrows 186063) > 0.6. Use CompressedRow routines.
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 186064 X 930312; storage space: 0 unneeded,685 used
[2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[2] MatCheckCompressedRow(): Found the ratio (num_zerorows 185379)/(num_localrows 186064) > 0.6. Use CompressedRow routines.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 930314; storage space: 0 unneeded,648 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 185414)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
[2] MatSeqAIJCheckInode(): Found 38030 nodes of 186064. Limit used: 5. Using Inode routines
[3] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 930313; storage space: 0 unneeded,658 used
[3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[3] MatCheckCompressedRow(): Found the ratio (num_zerorows 185405)/(num_localrows 186063) > 0.6. Use CompressedRow routines.
[4] MatSeqAIJCheckInode(): Found 37991 nodes of 186063. Limit used: 5. Using Inode routines
[3] MatSeqAIJCheckInode(): Found 37997 nodes of 186063. Limit used: 5. Using Inode routines
[0] MatSeqAIJCheckInode(): Found 37993 nodes of 186062. Limit used: 5. Using Inode routines
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 930314; storage space: 0 unneeded,1011 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[1] MatCheckCompressedRow(): Found the ratio (num_zerorows 185051)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
[1] MatSeqAIJCheckInode(): Found 38413 nodes of 186062. Limit used: 5. Using Inode routines
[5] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 930314; storage space: 0 unneeded,1137 used
[5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[5] MatCheckCompressedRow(): Found the ratio (num_zerorows 184925)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
[5] MatSeqAIJCheckInode(): Found 38565 nodes of 186062. Limit used: 5. Using Inode routines
[3] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 5 Total message length 930328
[4] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 5 Total message length 930328
[2] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 5 Total message length 930327
[1] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 5 Total message length 930329
[0] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 5 Total message length 930329
[5] MatCreateSubMatrices_MPIAIJ_Local(): Number of outgoing messages 5 Total message length 930329
[2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[5] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[4] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777
[5] MatAssemblyEnd_SeqAIJ(): Matrix size: 930314 X 186062; storage space: 0 unneeded,1137 used
[5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[5] MatCheckCompressedRow(): Found the ratio (num_zerorows 929177)/(num_localrows 930314) > 0.6. Use CompressedRow routines.
[5] MatSeqAIJCheckInode(): Found 187410 nodes of 930314. Limit used: 5. Using Inode routines
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 930312 X 186064; storage space: 0 unneeded,685 used
[2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[2] MatCheckCompressedRow(): Found the ratio (num_zerorows 929627)/(num_localrows 930312) > 0.6. Use CompressedRow routines.
[2] MatSeqAIJCheckInode(): Found 186870 nodes of 930312. Limit used: 5. Using Inode routines
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 930314 X 186062; storage space: 0 unneeded,1011 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[1] MatCheckCompressedRow(): Found the ratio (num_zerorows 929303)/(num_localrows 930314) > 0.6. Use CompressedRow routines.
[1] MatSeqAIJCheckInode(): Found 187283 nodes of 930314. Limit used: 5. Using Inode routines
[3] MatAssemblyEnd_SeqAIJ(): Matrix size: 930313 X 186063; storage space: 0 unneeded,658 used
[3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[3] MatCheckCompressedRow(): Found the ratio (num_zerorows 929655)/(num_localrows 930313) > 0.6. Use CompressedRow routines.
[4] MatAssemblyEnd_SeqAIJ(): Matrix size: 930313 X 186063; storage space: 0 unneeded,649 used
[4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[4] MatCheckCompressedRow(): Found the ratio (num_zerorows 929664)/(num_localrows 930313) > 0.6. Use CompressedRow routines.
[3] MatSeqAIJCheckInode(): Found 186842 nodes of 930313. Limit used: 5. Using Inode routines
[4] MatSeqAIJCheckInode(): Found 186825 nodes of 930313. Limit used: 5. Using Inode routines
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 930314 X 186062; storage space: 0 unneeded,648 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 929666)/(num_localrows 930314) > 0.6. Use CompressedRow routines.
[0] MatSeqAIJCheckInode(): Found 186837 nodes of 930314. Limit used: 5. Using Inode routines

