[petsc-users] Various Questions Regarding PETSC

Matthew Knepley knepley at gmail.com
Fri Jul 12 07:23:19 CDT 2019


On Fri, Jul 12, 2019 at 5:19 AM Mohammed Mostafa via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> Hello all,
> I have a few questions regarding PETSc.
>

Please send the entire output of a run with all the logging turned on,
using -log_view and -info.

  Thanks,

    Matt


> Question 1:
> For profiling, is it possible to show only the user-defined log events
> in the breakdown of each stage in -log_view?
> I tried deactivating all the class IDs (MAT, VEC, KSP, PC):
>   PetscLogEventExcludeClass(MAT_CLASSID);
>   PetscLogEventExcludeClass(VEC_CLASSID);
>   PetscLogEventExcludeClass(KSP_CLASSID);
>   PetscLogEventExcludeClass(PC_CLASSID);
> which, according to the manual, should "deactivate event logging for a
> PETSc object class in every stage".
> However, I still see them in the stage breakdown:
> --- Event Stage 1: Matrix Construction
>
> BuildTwoSidedF         4 1.0 2.7364e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  18  0  0  0  0     0
> VecSet                 1 1.0 4.5300e-06 2.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyBegin       2 1.0 2.7344e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  18  0  0  0  0     0
> VecAssemblyEnd         2 1.0 8.3447e-06 1.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecScatterBegin        2 1.0 7.5102e-05 1.7 0.00e+00 0.0 3.6e+01 2.1e+03
> 0.0e+00  0  0  3  0  0   0  0 50 80  0     0
> VecScatterEnd          2 1.0 3.5286e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin       2 1.0 8.8930e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd         2 1.0 1.3566e-02 1.1 0.00e+00 0.0 3.6e+01 5.3e+02
> 8.0e+00  0  0  3  0  6  10  0 50 20100     0
> AssembleMats           2 1.0 3.9774e-02 1.7 0.00e+00 0.0 7.2e+01 1.3e+03
> 8.0e+00  0  0  7  0  6  28  0100100100     0  # USER EVENT
> myMatSetValues         2 1.0 2.6931e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  19  0  0  0  0     0   # USER EVENT
> setNativeMat           1 1.0 3.5613e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  24  0  0  0  0     0   # USER EVENT
> setNativeMatII         1 1.0 4.7023e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  28  0  0  0  0     0   # USER EVENT
> callScheme             1 1.0 2.2333e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   2  0  0  0  0     0   # USER EVENT
>
> Also, is it possible to clear the logs so that I can write a separate
> profiling output file for each timestep? (I am solving a transient
> problem and want to see how the performance changes as time goes by.)
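A per-timestep breakdown can also be obtained without clearing the logs, by giving each timestep its own logging stage so that -log_view reports each step separately. A minimal sketch (the function name and loop structure are illustrative; assumes PETSc has already been initialized):

```c
#include <petscsys.h>

/* Sketch: register one logging stage per timestep so that -log_view
   prints a separate event breakdown for each step. */
PetscErrorCode ProfileTimesteps(PetscInt nsteps)
{
  PetscErrorCode ierr;
  PetscLogStage  stage;
  char           name[64];
  PetscInt       step;

  for (step = 0; step < nsteps; ++step) {
    ierr = PetscSNPrintf(name, sizeof(name), "Timestep %D", step);CHKERRQ(ierr);
    ierr = PetscLogStageRegister(name, &stage);CHKERRQ(ierr);
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
    /* ... assemble and solve this timestep ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);
  }
  return 0;
}
```

Events that run inside the pushed stage are then attributed to that stage in the -log_view summary, giving one "--- Event Stage N: Timestep N" block per step.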
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Question 2:
> Regarding MatSetValues:
> I am writing a finite volume code. Due to an algorithm requirement, I
> first build the matrix in a local native format (an array of arrays) and
> then loop through the rows, using MatSetValues to set the elements in "Mat A":
> MatSetValues(A, 1, &row, nj, j_index, coefvalues, INSERT_VALUES);
> but this is very slow and is killing my performance,
> even though the matrix was preallocated properly using
> MatCreateAIJ(PETSC_COMM_WORLD, this->local_size, this->local_size,
> PETSC_DETERMINE, PETSC_DETERMINE, -1, d_nnz, -1, o_nnz, &A);
> with d_nnz and o_nnz assigned so that no mallocs occur during
> MatSetValues, and all inserted values are local, so there are no
> off-process values.
> So my question is: is it possible to set multiple rows at once (hopefully
> all of them), since setting them row by row seems expensive? I checked
> the manual, and MatSetValues can only set a dense matrix block.
> Or perhaps is it possible to copy all the rows directly into the
> underlying matrix data? As I mentioned, all values are local and there
> are no off-process values (the stash is 0):
> [0] VecAssemblyBegin_MPI_BTS(): Stash has 0 entries, uses 0 mallocs.
> [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs.
> [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [4] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [5] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 186064 X 186064; storage space:
> 0 unneeded,743028 used
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 186062; storage space:
> 0 unneeded,742972 used
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
> [1] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 186062) < 0.6. Do not use CompressedRow routines.
> [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
> [2] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 186064) < 0.6. Do not use CompressedRow routines.
> [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 186063; storage space:
> 0 unneeded,743093 used
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 186062; storage space:
> 0 unneeded,743036 used
> [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
> [4] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 186063) < 0.6. Do not use CompressedRow routines.
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 186062; storage space:
> 0 unneeded,742938 used
> [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
> [5] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 186062) < 0.6. Do not use CompressedRow routines.
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 186062) < 0.6. Do not use CompressedRow routines.
> [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 186063; storage space:
> 0 unneeded,743049 used
> [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 4
> [3] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 186063) < 0.6. Do not use CompressedRow routines.
> [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 186064 X 685; storage space: 0
> unneeded,685 used
> [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 649; storage space: 0
> unneeded,649 used
> [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [4] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 185414)/(num_localrows 186063) > 0.6. Use CompressedRow routines.
> [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [2] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 185379)/(num_localrows 186064) > 0.6. Use CompressedRow routines.
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 1011; storage space: 0
> unneeded,1011 used
> [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 1137; storage space: 0
> unneeded,1137 used
> [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [5] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 184925)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 186063 X 658; storage space: 0
> unneeded,658 used
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 186062 X 648; storage space: 0
> unneeded,648 used
> [1] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 185051)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 185414)/(num_localrows 186062) > 0.6. Use CompressedRow routines.
> [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [3] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 185405)/(num_localrows 186063) > 0.6. Use CompressedRow routines.
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Question 3:
> If all inserted matrix and vector data are local, which part of the
> vec/mat assembly consumes the time? MatSetValues and the assembly take
> more time than building the matrix in my native format.
> Also, this happens every time, not just at the first MAT_FINAL_ASSEMBLY.
>
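One knob relevant here, offered as a hedged sketch rather than a diagnosis: when truly no off-process entries are generated, PETSc can be told so explicitly, which lets assembly skip the stash exchange (the BuildTwoSidedF time visible in the Stage 1 log above). The function name below is illustrative; A and b stand for the assembled matrix and a right-hand-side vector:

```c
#include <petscmat.h>
#include <petscvec.h>

/* Sketch: declare that this process inserts only locally owned
   entries, so MatAssemblyBegin/VecAssemblyBegin can skip the
   off-process stash communication. */
PetscErrorCode DeclareLocalOnly(Mat A, Vec b)
{
  PetscErrorCode ierr;
  ierr = MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);
  ierr = VecSetOption(b, VEC_IGNORE_OFF_PROC_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);
  return 0;
}
```

With these options set, entries destined for another process are silently dropped, so they are only safe under exactly the all-local insertion pattern described above.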
>
> For context, the matrix above is nearly 1M x 1M, partitioned over six
> processes, and it was NOT built using a DM.
>
> Finally the configure options are:
>
> Configure options:
> PETSC_ARCH=release3 -with-debugging=0 COPTFLAGS="-O3 -march=native
> -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3
> -march=native -mtune=native" --with-cc=mpicc --with-cxx=mpicxx
> --with-fc=mpif90 --download-metis --download-hypre
>
> Sorry for such a long question, and thanks in advance.
> Thanks,
> M. Kamra
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>

