[petsc-users] Various Questions Regarding PETSC
Smith, Barry F.
bsmith at mcs.anl.gov
Thu Jul 18 19:48:12 CDT 2019
1) Could you please send MatSetValues2_MPIAIJ()?
2) What time do you get if you use MatSetValues2_MPIAIJ() with nrow_buffer = 1?
Thanks
Barry
> On Jul 18, 2019, at 2:01 AM, Mohammed Mostafa via petsc-users <petsc-users at mcs.anl.gov> wrote:
>
> Hello everyone,
> I have already established a baseline for the cost of inserting values into the PETSc matrix.
> Based on the hint in your question about the number of values inserted into the matrix each time,
> 2) Can you tell me how many values are inserted?
> I took a look at the source code of MatSetValues_MPIAIJ and found that it seems to be designed for
> FEM assembly of the global matrix from element matrices (from what I remember from undergrad), since setting multiple rows in a single call requires that all rows share the same column indices.
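> As a reminder of what that interface looks like (a minimal sketch; the row/column indices and values below are made up purely for illustration), a single MatSetValues() call covers a logically dense m x n block, so all m rows share the same n column indices:
>
>     PetscInt    rows[3]    = {10, 11, 12};       /* m = 3 rows of an element matrix          */
>     PetscInt    cols[3]    = {10, 11, 12};       /* n = 3 column indices shared by all rows  */
>     PetscScalar elemMat[9] = { 4.0, -1.0, -1.0,  /* 3 x 3 element matrix, stored row-major   */
>                               -1.0,  4.0, -1.0,
>                               -1.0, -1.0,  4.0 };
>     MatSetValues(A, 3, rows, 3, cols, elemMat, ADD_VALUES);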
>
> So, to increase the number of values inserted per call, I needed to modify the implementation of MatSetValues_MPIAIJ.
> I made a copy of the function MatSetValues_MPIAIJ in src/mat/impls/aij/mpi/mpiaij.c, named it MatSetValues2_MPIAIJ,
> and made some minor changes to allow inserting multiple rows regardless of whether they share the same column indices.
>
> What I do now is buffer the data for multiple rows and then insert them all together, to see how the performance changes.
> I tried different buffer sizes, i.e. nrow_buffer = [2, 5, 10, 20, 50, 100].
> So instead of calling MatSetValues for every row of the matrix, I call MatSetValues2_MPIAIJ once every nrow_buffer rows, which should reduce some of the per-call overhead.
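> Roughly, the buffering loop looks like the sketch below. The exact signature of my MatSetValues2_MPIAIJ is not reproduced here; the one shown, taking per-row nonzero counts plus the concatenated column indices and values of all buffered rows, is only illustrative, and NROW_BUFFER / MAX_NNZ are placeholder sizes:
>
>     PetscInt    row_buf[NROW_BUFFER];            /* global row indices of buffered rows      */
>     PetscInt    nnz_buf[NROW_BUFFER];            /* number of nonzeros in each buffered row  */
>     PetscInt    col_buf[NROW_BUFFER * MAX_NNZ];  /* concatenated column indices              */
>     PetscScalar val_buf[NROW_BUFFER * MAX_NNZ];  /* concatenated coefficient values          */
>     PetscInt    nbuf = 0, off = 0;
>
>     for (i = 0; i < nRows; i++) {
>         /* ... compute j_index, coefvalues, nnz_per_row for row i ... */
>         row_buf[nbuf] = cell_global_index;
>         nnz_buf[nbuf] = nnz_per_row;
>         for (int k = 0; k < nnz_per_row; k++) {
>             col_buf[off + k] = j_index[k];
>             val_buf[off + k] = coefvalues[k];
>         }
>         off += nnz_per_row;
>         nbuf++;
>         if (nbuf == NROW_BUFFER || i == nRows - 1) {  /* flush once every NROW_BUFFER rows */
>             MatSetValues2_MPIAIJ(A, nbuf, row_buf, nnz_buf, col_buf, val_buf, INSERT_VALUES);
>             nbuf = 0;
>             off  = 0;
>         }
>     }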
> The results are as follows.
> First, recall the earlier timings:
> 1 - computation and insertion into the PETSc matrix
> FillPetscMat_with_MatSetValues 100 1.0 3.8820e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23 0 0 0 0 96 0 0 0 0 0
> 2 - computation and insertion into the Eigen matrix
> FilEigenMat 100 1.0 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18 0 0 0 0 88 0 0 0 0 0
>
> Now
> nrow_buffer = 2
> FillPetscMat_with_MatSetValues2 100 1.0 3.3321e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 20 0 0 0 0 95 0 0 0 0 0
> nrow_buffer = 5
> FillPetscMat_with_MatSetValues2 100 1.0 2.8842e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17 0 0 0 0 94 0 0 0 0 0
> nrow_buffer = 10
> FillPetscMat_with_MatSetValues2 100 1.0 2.7669e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17 0 0 0 0 93 0 0 0 0 0
> nrow_buffer = 20
> FillPetscMat_with_MatSetValues2 100 1.0 2.6834e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 93 0 0 0 0 0
> nrow_buffer = 50
> FillPetscMat_with_MatSetValues2 100 1.0 2.6862e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17 0 0 0 0 93 0 0 0 0 0
> nrow_buffer = 100
> FillPetscMat_with_MatSetValues2 100 1.0 2.6170e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 93 0 0 0 0 0
>
> As expected, increasing the number of rows inserted per call reduces the overhead, until it essentially plateaus somewhere between nrow_buffer = 20 and 50.
> The modifications I made to MatSetValues_MPIAIJ are very small, but the effect is significant (roughly a 33% drop in insertion cost), and insertion is now even faster than my naive use of Eigen (the baseline).
> For now I am quite satisfied with the outcome. There is probably still some room for improvement, but this is enough for the moment.
>
> Thanks,
> Kamra
>
> On Thu, Jul 18, 2019 at 12:34 AM Mohammed Mostafa <mo7ammedmostafa at gmail.com> wrote:
> Regarding the first point
> 1) Are you timing only the insertion of values, or computation and insertion?
> I am timing both the computation and the insertion of values, but as I said, I timed three scenarios:
> 1 - computation only, no insertion
> Computation_no_insertion 100 1.0 1.6747e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 22 0 0 0 0 0
> 2 - computation and insertion into the PETSc matrix
> FillPetscMat_with_MatSetValues 100 1.0 3.8820e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23 0 0 0 0 96 0 0 0 0 0
> 3 - computation and insertion into the Eigen matrix
> FilEigenMat 100 1.0 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18 0 0 0 0 88 0 0 0 0 0
> I repeated each run 100 times to get reasonably accurate timings.
>
> As for the second point:
> 2) Can you tell me how many values are inserted?
> There are nearly 186062 rows per process (with 6 processes in total, the global matrix size is 1116376).
> Most rows (about 99.35%) have 4 non-zeros per row; the remaining 0.35% have 2 or 3 non-zeros per row.
> The total number of off-diagonal non-zeros (onnz) is 648.
> So I insert nearly 4 values at a time, 186062 times, i.e. roughly 744248 values per MPI process.
>
>
> Thanks,
> Kamra
>
> On Wed, Jul 17, 2019 at 11:59 PM Matthew Knepley <knepley at gmail.com> wrote:
> On Wed, Jul 17, 2019 at 8:51 AM Mohammed Mostafa <mo7ammedmostafa at gmail.com> wrote:
> Sorry for the confusion
> First, I fully acknowledge that setting matrix non-zeros, or copying in general, is not cheap, and that the memory access pattern can play an important role.
> So, to establish a baseline for comparison, I tried assembling the same matrix in an Eigen sparse matrix; the timings are as follows:
> FillPetscMat_with_MatSetValues 100 1.0 3.8820e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23 0 0 0 0 96 0 0 0 0 0
> FilEigenMat 100 1.0 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18 0 0 0 0 88 0 0 0 0 0
>
> Great. This helps. Two things would help me narrow down what is happening.
>
> 1) Are you timing only the insertion of values, or computation and insertion?
>
> 2) Can you tell me how many values are inserted?
>
> Thanks,
>
> Matt
>
> I used the same code but simply filled a different matrix, something like:
>
> for (i = 0; i < nRows; i++)
> {
>     // ... some code to compute j_index, coefvalues, nnz_per_row for row i ...
>
>     // Method 1: insert the whole row into the PETSc matrix
>     MatSetValues(A, 1, &cell_global_index, nnz_per_row, j_index, coefvalues, INSERT_VALUES);
>
>     // Method 2: insert the coefficients one by one into the Eigen matrix
>     for (int k = 0; k < nnz_per_row; k++)
>         EigenMat.coeffRef(i, j_index[k]) = coefvalues[k];
> }
> Please note that only one of the two methods is used at a time. I also separately timed the code section that computes j_index and coefvalues, by simply disabling both Method 1 and Method 2,
> and found its cost to be trivial compared to when either method is enabled.
> I used Eigen out of convenience, since I already use it for some vector and tensor arithmetic elsewhere in the code; it may not be the best choice.
> Since a PETSc MPIAIJ matrix technically consists of two matrices, a diagonal block and an off-diagonal block, I expected some difference, but is this much overhead normal or am I missing something?
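> (A rough sketch of what I mean, assuming each inserted entry is routed by its column index into either the local diagonal block or the off-diagonal block; this is only my illustration, not the actual PETSc source:)
>
>     PetscInt cstart, cend;
>     MatGetOwnershipRangeColumn(A, &cstart, &cend);   /* locally owned column range */
>     for (int k = 0; k < nnz_per_row; k++) {
>         if (j_index[k] >= cstart && j_index[k] < cend) {
>             /* column owned locally: entry goes into the diagonal SeqAIJ block */
>         } else {
>             /* column owned by another process: entry goes into the off-diagonal SeqAIJ block */
>         }
>     }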
> Maybe there is some setting or MatOption I should be using; so far this is what I have:
>
> MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, local_size, local_size, PETSC_DETERMINE,
>                           PETSC_DETERMINE, ptr, j, v, &A);
> MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);
> MatSetOption(A, MAT_IGNORE_OFF_PROC_ENTRIES, PETSC_TRUE);
> MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);
> MatSetOption(A, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_TRUE);
> MatSetOption(A, MAT_NEW_NONZERO_LOCATIONS, PETSC_FALSE);
> MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE);
> MatSetUp(A);
> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
>
> Thanks,
> Kamra
>
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/