[petsc-users] Various Questions Regarding PETSC

Smith, Barry F. bsmith at mcs.anl.gov
Thu Jul 18 19:48:12 CDT 2019


 
  1) Could you please send MatSetValues2_MPIAIJ()?

   2) What time do you get if you use MatSetValues2_MPIAIJ() with nrow_buffer = 1?

   Thanks

   Barry


> On Jul 18, 2019, at 2:01 AM, Mohammed Mostafa via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> Hello everyone,
> I already established a baseline for comparing the cost of inserting values into the PETSc matrix.
> Based on your hint about the number of values inserted into the matrix each time,
>  2) Can you tell me how many values are inserted? 
> I took a look at the source code of "MatSetValues_MPIAIJ". It seems to be designed for FEM assembly of the global matrix from element matrices (from what I remember from undergrad), since inserting several rows in one call requires that they all share the same column indices.
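> 
> For reference, this is the kind of usage MatSetValues() is built around (just a sketch on my part; the names K, gidx and elem are only for illustration, for a 3x3 element matrix added into a global matrix K):
> 
> PetscInt    gidx[3];   // global indices of the element's three dofs
> PetscScalar elem[9];   // 3x3 element matrix, stored row-major
> // ... fill gidx and elem ...
> // one call inserts a logically dense 3x3 block: all three rows share the
> // same column-index array gidx, which is exactly the FEM assembly pattern
> MatSetValues(K, 3, gidx, 3, gidx, elem, ADD_VALUES);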
> 
> So, to increase the number of values inserted per call, I needed to modify the implementation of MatSetValues_MPIAIJ.
> I made a copy of the function "MatSetValues_MPIAIJ" in "src/mat/impls/aij/mpi/mpiaij.c", named it "MatSetValues2_MPIAIJ", and made some minor changes so that it can insert multiple rows even when they do not share the same column indices.
> 
> What I do now is buffer the data for several rows and then insert them all at once, to see how that affects performance.
> I tried different numbers of buffered rows, i.e. nrow_buffer = [2, 5, 10, 20, 50, 100].
> So now, instead of calling "MatSetValues" once per row of the matrix, I call "MatSetValues2_MPIAIJ" once every nrow_buffer rows, which should cut down the per-call overhead; the calling pattern is sketched below.
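> 
> Roughly, my loop now looks like this (just a sketch: the signature I show for MatSetValues2_MPIAIJ is only schematic, taking a per-row count of column indices plus packed column/value arrays, and NROW_BUFFER / MAX_NNZ_PER_ROW stand for whatever buffer sizes are used):
> 
> PetscInt    rows[NROW_BUFFER], ncols[NROW_BUFFER];   // buffered row indices and per-row nnz
> PetscInt    cols[NROW_BUFFER * MAX_NNZ_PER_ROW];     // packed column indices
> PetscScalar vals[NROW_BUFFER * MAX_NNZ_PER_ROW];     // packed values
> PetscInt    nbuf = 0, off = 0;
> for (PetscInt i = 0; i < nRows; i++)
> {
>     // ... compute cell_global_index, j_index, coefvalues and nnz_per_row for row i ...
>     rows[nbuf]  = cell_global_index;
>     ncols[nbuf] = nnz_per_row;
>     for (PetscInt k = 0; k < nnz_per_row; k++)
>     {
>         cols[off + k] = j_index[k];
>         vals[off + k] = coefvalues[k];
>     }
>     off += nnz_per_row;
>     nbuf++;
>     if (nbuf == NROW_BUFFER || i == nRows - 1)
>     {
>         // one call inserts all buffered rows, each with its own column indices
>         MatSetValues2_MPIAIJ(A, nbuf, rows, ncols, cols, vals, INSERT_VALUES);
>         nbuf = 0;
>         off  = 0;
>     }
> }
> 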
> The results are as follows.
> First, recall the earlier timings:
> 1-computation and insertion into the PETSc matrix
> FillPetscMat_with_MatSetValues              100 1.0 3.8820e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  96  0  0  0  0     0 
> 2-computation and insertion into the Eigen matrix
> FilEigenMat                                               100 1.0 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  88  0  0  0  0     0    
> 
> Now
> nrow_buffer = 2
> FillPetscMat_with_MatSetValues2                  100 1.0 3.3321e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 20  0  0  0  0  95  0  0  0  0     0
> nrow_buffer = 5
> FillPetscMat_with_MatSetValues2                  100 1.0 2.8842e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  94  0  0  0  0     0
> nrow_buffer = 10
> FillPetscMat_with_MatSetValues2                  100 1.0 2.7669e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  93  0  0  0  0     0
> nrow_buffer = 20
> FillPetscMat_with_MatSetValues2                  100 1.0 2.6834e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16  0  0  0  0  93  0  0  0  0     0
> nrow_buffer = 50
> FillPetscMat_with_MatSetValues2                  100 1.0 2.6862e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  93  0  0  0  0     0
> nrow_buffer = 100
> FillPetscMat_with_MatSetValues2                  100 1.0 2.6170e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16  0  0  0  0  93  0  0  0  0     0
> 
> As expected, increasing the number of rows inserted per call reduces the overhead, until it basically stagnates somewhere between nrow_buffer = 20 and 50.
> The modifications I made relative to MatSetValues_MPIAIJ are very small, but the effect is significant (the insertion cost drops by about 33%), and insertion is now even faster than my naive usage of Eigen (the baseline).
> For now I am quite satisfied with the outcome. There is probably still some room for improvement, but this is enough for now.
>  
> Thanks,
> Kamra
> 
> On Thu, Jul 18, 2019 at 12:34 AM Mohammed Mostafa <mo7ammedmostafa at gmail.com> wrote:
> Regarding the first point,
> 1) Are you timing only the insertion of values, or computation and insertion?  
> I am timing both the computation and the insertion of values, but as I said, I timed three scenarios:
> 1-computation only and no insertion
> Computation_no_insertion                            100 1.0 1.6747e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0  22  0  0  0  0     0
> 2-computation and insertion into the PETSc matrix
> FillPetscMat_with_MatSetValues              100 1.0 3.8820e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  96  0  0  0  0     0 
> 3-computation and insertion into the Eigen matrix
> FilEigenMat                                               100 1.0 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  88  0  0  0  0     0    
> I ran each case 100 times to get reasonably accurate timings.
> 
> As for the second point,
>  2) Can you tell me how many values are inserted? 
> There are about 186062 rows per process (with 6 processes in total; the matrix global size is 1116376).
> Most rows (about 99.35%) have 4 non-zeros per row, and the remaining rows have 2 or 3 non-zeros per row;
> the total number of off-diagonal non-zeros (onnz) is 648.
> So I insert about 4 values 186062 times, i.e. roughly 744248 values per MPI process.
> 
> 
> Thanks,
> Kamra
> 
> On Wed, Jul 17, 2019 at 11:59 PM Matthew Knepley <knepley at gmail.com> wrote:
> On Wed, Jul 17, 2019 at 8:51 AM Mohammed Mostafa <mo7ammedmostafa at gmail.com> wrote:
> Sorry for the confusion.
> First, I fully acknowledge that setting matrix non-zeros, or copying in general, is not cheap, and that the memory access pattern can play an important role.
> So, to establish a baseline to compare against, I tried filling the same matrix into an Eigen sparse matrix as well; the timings are as follows:
> FillPetscMat_with_MatSetValues              100 1.0 3.8820e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  96  0  0  0  0     0 
> FilEigenMat                                               100 1.0 2.8727e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  88  0  0  0  0     0  
> 
> Great. This helps. Two things would help me narrow down what is happening.
> 
>   1) Are you timing only the insertion of values, or computation and insertion?
> 
>   2) Can you tell me how many values are inserted?
> 
>   Thanks,
> 
>     Matt
>  
> I used the same code but simply filled a different matrix, something like this:
> 
> for (PetscInt i = 0; i < nRows; i++)
> {
>     // ... some code to compute cell_global_index, j_index, coefvalues and nnz_per_row for row i ...
> 
>     // Method1: insert one row into the PETSc matrix
>     MatSetValues(A, 1, &cell_global_index, nnz_per_row, j_index, coefvalues, INSERT_VALUES);
> 
>     // Method2: insert the same row into the Eigen sparse matrix
>     for (int k = 0; k < nnz_per_row; k++)
>         EigenMat.coeffRef(i, j_index[k]) = coefvalues[k];
> }
> Please note that only one of the two methods is used at a time. I also separately timed the code section that computes < j_index, coefvalues >, by simply disabling both Method1 and Method2.
> I found its cost to be trivial compared to when either of the two methods is enabled.
> I used Eigen out of convenience, since I already use it for some vector and tensor arithmetic elsewhere in the code, so it may not be the best choice.
> Since for the PETSc matrix we technically fill two matrices, the diagonal and the off-diagonal blocks, I expected some difference, but is a gap of this size normal, or am I missing something?
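> 
> (My rough mental model of why there are two, just a schematic on my part and not the actual PETSc source: each inserted entry is routed by its column index into one of two sequential AIJ matrices.)
> 
> // cstart/cend: the range of columns owned by this process
> if (col >= cstart && col < cend)
> {
>     // goes into the "diagonal" block, with local column index col - cstart
> }
> else
> {
>     // goes into the "off-diagonal" block (columns owned by other processes)
> }
> 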
> Maybe there is some setting or MatOption I should be using; so far this is what I have been using:
> 
> MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, local_size, local_size, PETSC_DETERMINE,
>                           PETSC_DETERMINE, ptr, j, v, &A);
> MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);
> MatSetOption(A, MAT_IGNORE_OFF_PROC_ENTRIES, PETSC_TRUE);
> MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);
> MatSetOption(A, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_TRUE);
> MatSetOption(A, MAT_NEW_NONZERO_LOCATIONS, PETSC_FALSE);
> MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE);
> MatSetUp(A);
> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> 
> Thanks,
> Kamra
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/


