[petsc-users] Various Questions Regarding PETSC

Mohammed Mostafa mo7ammedmostafa at gmail.com
Fri Jul 12 05:12:35 CDT 2019


Hello all,
I have a few questions regarding PETSc.


Question 1:
For profiling, is it possible to show only the user-defined log events
in the breakdown of each stage in the -log_view output?
I tried deactivating the MAT, VEC, KSP and PC class IDs with
PetscLogEventExcludeClass(MAT_CLASSID);
PetscLogEventExcludeClass(VEC_CLASSID);
PetscLogEventExcludeClass(KSP_CLASSID);
PetscLogEventExcludeClass(PC_CLASSID);
which, according to the manual, "Deactivates event logging for a PETSc
object class in every stage".
However, I still see their events in the stage breakdown:
--- Event Stage 1: Matrix Construction

BuildTwoSidedF         4 1.0 2.7364e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  18  0  0  0  0     0
VecSet                 1 1.0 4.5300e-06 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyBegin       2 1.0 2.7344e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  18  0  0  0  0     0
VecAssemblyEnd         2 1.0 8.3447e-06 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        2 1.0 7.5102e-05 1.7 0.00e+00 0.0 3.6e+01 2.1e+03 0.0e+00  0  0  3  0  0   0  0 50 80  0     0
VecScatterEnd          2 1.0 3.5286e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       2 1.0 8.8930e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 1.3566e-02 1.1 0.00e+00 0.0 3.6e+01 5.3e+02 8.0e+00  0  0  3  0  6  10  0 50 20100     0
AssembleMats           2 1.0 3.9774e-02 1.7 0.00e+00 0.0 7.2e+01 1.3e+03 8.0e+00  0  0  7  0  6  28  0100100100     0  # USER EVENT
myMatSetValues         2 1.0 2.6931e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  19  0  0  0  0     0   # USER EVENT
setNativeMat           1 1.0 3.5613e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  24  0  0  0  0     0   # USER EVENT
setNativeMatII         1 1.0 4.7023e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  28  0  0  0  0     0   # USER EVENT
callScheme             1 1.0 2.2333e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0   # USER EVENT

Also, is it possible to clear the logs so that I can write a separate
profiling output file for each timestep? I am solving a transient
problem and want to see how performance changes as time goes by.
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Question 2:
Regarding MatSetValues:
I am writing a finite volume code. Due to an algorithm requirement, I
have to write the matrix into a local native format (array of arrays) and
then loop through the rows, using MatSetValues to set the elements in "Mat A":
MatSetValues(A, 1, &row, nj, j_index, coefvalues, INSERT_VALUES);
However, this is very slow and is killing my performance,
even though the matrix was preallocated using
MatCreateAIJ(PETSC_COMM_WORLD, this->local_size, this->local_size,
             PETSC_DETERMINE, PETSC_DETERMINE, -1, d_nnz, -1, o_nnz, &A);
with d_nnz and o_nnz properly assigned, so no mallocs occur during
MatSetValues. All inserted values are local, so there are no off-processor
values.
So my question is: is it possible to set multiple rows at once, ideally all
of them? I checked the manual, and MatSetValues can only set a dense matrix
block, while setting values row by row seems expensive.
Or is it perhaps possible to copy all rows directly into the underlying
matrix data? As I mentioned, all values are local and there are no
off-processor values (the stash has 0 entries), as the output below shows:
[0] VecAssemblyBegin_MPI_BTS(): Stash has 0 entries, uses 0 mallocs.
[0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs.
[0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[4] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[5] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 186064 X 186064; storage space: 0 unneeded,743028 used
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 186062; storage space: 0 unneeded,742972 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
[1] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 186062) < 0.6. Do not use CompressedRow routines.
[2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
[2] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 186064) < 0.6. Do not use CompressedRow routines.
[4] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 186063; storage space: 0 unneeded,743093 used
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 186062; storage space: 0 unneeded,743036 used
[4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
[4] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 186063) < 0.6. Do not use CompressedRow routines.
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[5] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 186062; storage space: 0 unneeded,742938 used
[5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
[5] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 186062) < 0.6. Do not use CompressedRow routines.
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 186062) < 0.6. Do not use CompressedRow routines.
[3] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 186063; storage space: 0 unneeded,743049 used
[3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
[3] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 186063) < 0.6. Do not use CompressedRow routines.
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 186064 X 685; storage space: 0 unneeded,685 used
[4] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 649; storage space: 0 unneeded,649 used
[4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[4] MatCheckCompressedRow(): Found the ratio (num_zerorows 185414)/(num_localrows 186063) > 0.6. Use CompressedRow routines.
[2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[2] MatCheckCompressedRow(): Found the ratio (num_zerorows 185379)/(num_localrows 186064) > 0.6. Use CompressedRow routines.
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 1011; storage space: 0 unneeded,1011 used
[5] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 1137; storage space: 0 unneeded,1137 used
[5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[5] MatCheckCompressedRow(): Found the ratio (num_zerorows 184925)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[3] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 658; storage space: 0 unneeded,658 used
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 648; storage space: 0 unneeded,648 used
[1] MatCheckCompressedRow(): Found the ratio (num_zerorows 185051)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 185414)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
[3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[3] MatCheckCompressedRow(): Found the ratio (num_zerorows 185405)/(num_localrows 186063) > 0.6. Use CompressedRow routines.
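
To make the "copy all rows" idea in Question 2 concrete, here is a minimal
sketch in plain C (no PETSc calls) of flattening an array-of-arrays storage
into CSR-style arrays. The names NativeRows and native_to_csr are
hypothetical and not taken from my actual code, and plain int/double stand in
for PetscInt/PetscScalar. I am only assuming that arrays in this layout (row
offsets, global column indices, values) could then be handed to a routine
such as MatCreateMPIAIJWithArrays. I have not verified that this would be
faster than calling MatSetValues row by row.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical native storage (an assumption, not the real data structure):
 * one array of global column indices and one array of coefficients per
 * local row. Plain int/double stand in for PetscInt/PetscScalar. */
typedef struct {
    int      nrows;    /* number of local rows */
    int     *row_len;  /* row_len[r]: number of nonzeros in local row r */
    int    **cols;     /* cols[r][k]: global column index of entry k of row r */
    double **vals;     /* vals[r][k]: coefficient of entry k of row r */
} NativeRows;

/* Flatten the array-of-arrays into CSR arrays:
 *   positions ia[r] .. ia[r+1]-1 of ja and a hold the entries of local row r.
 * Returns 0 on success, -1 if an allocation fails.
 * On success the caller owns ia, ja and a and must free() them. */
int native_to_csr(const NativeRows *m, int **ia, int **ja, double **a)
{
    int r, k, nnz = 0;

    assert(m != NULL && ia != NULL && ja != NULL && a != NULL);

    *ia = malloc(((size_t)m->nrows + 1) * sizeof **ia);
    if (*ia == NULL) return -1;

    /* Row offsets are the running sum of the row lengths. */
    (*ia)[0] = 0;
    for (r = 0; r < m->nrows; ++r) {
        nnz += m->row_len[r];
        (*ia)[r + 1] = nnz;
    }

    /* Allocate at least one element so an empty matrix never calls malloc(0). */
    *ja = malloc((size_t)(nnz > 0 ? nnz : 1) * sizeof **ja);
    *a  = malloc((size_t)(nnz > 0 ? nnz : 1) * sizeof **a);
    if (*ja == NULL || *a == NULL) {
        free(*ia); free(*ja); free(*a);
        *ia = NULL; *ja = NULL; *a = NULL;
        return -1;
    }

    /* Copy each row's entries into its contiguous slot. */
    for (r = 0; r < m->nrows; ++r) {
        for (k = 0; k < m->row_len[r]; ++k) {
            (*ja)[(*ia)[r] + k] = m->cols[r][k];
            (*a)[(*ia)[r] + k]  = m->vals[r][k];
        }
    }
    return 0;
}
```

With at most 4 nonzeros per row, as in the output above, this copy is a single
pass over the data, so the open question is whether PETSc can consume the
result more cheaply than one MatSetValues call per row.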

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Question 3:
If all inserted matrix and vector data are local, which part of the Vec/Mat
assembly consumes the time? I ask because MatSetValues and MatAssembly
together take more time than the matrix builder itself.
Also, this is not limited to the first MAT_FINAL_ASSEMBLY.


For context, the matrix above is roughly 1.1M x 1.1M (about 186,000 rows on
each of six processes), and it was NOT built using DM.

Finally, the configure options are:

PETSC_ARCH=release3 -with-debugging=0 COPTFLAGS="-O3 -march=native
-mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3
-march=native -mtune=native" --with-cc=mpicc --with-cxx=mpicxx
--with-fc=mpif90 --download-metis --download-hypre

Sorry for such a long question, and thanks in advance.
M. Kamra