[petsc-users] Efficiency of different choice of local rows for MPIAIJ * MPIDENSE

Ian C. Lin iancclin at umich.edu
Tue Jul 2 07:37:15 CDT 2019


Dear Hong,

Thanks for your suggestion. I have not implemented it yet, as the redistribution might require changes to other parts of the code, and I am not sure it is worth it. If the computational cost is dominated by the distribution of the dense matrices and the efficiency gain would be small, we might just avoid introducing this change. I printed out the number of nonzeros (nnz) owned by each processor: the most heavily loaded one owns 60000 nnz and the least loaded one owns 10000 nnz, where the dimension of the matrix is 350000*350000. Do you have suggestions on the best approach?
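
For reference, the kind of nnz-balanced row split under discussion could be computed roughly as below. This is only a minimal sketch in plain Python, not PETSc code: the function name `balanced_row_counts` and its inputs are hypothetical, and it assumes the number of nonzeros in each global row is already known.

```python
from bisect import bisect_left
from itertools import accumulate


def balanced_row_counts(nnz_per_row, nranks):
    """Split rows into `nranks` contiguous blocks with roughly equal nnz.

    Hypothetical helper: returns one local row count per rank, and the
    counts sum to len(nnz_per_row).
    """
    prefix = list(accumulate(nnz_per_row))  # prefix[i] = nnz in rows 0..i
    total = prefix[-1] if prefix else 0
    nrows = len(nnz_per_row)
    counts = []
    start = 0
    for rank in range(nranks):
        if rank == nranks - 1:
            end = nrows  # the last rank takes all remaining rows
        else:
            # cumulative nnz this rank should reach by its last row
            target = total * (rank + 1) / nranks
            # first row (at or after `start`) whose cumulative nnz hits it
            end = min(bisect_left(prefix, target, lo=start) + 1, nrows)
        counts.append(end - start)
        start = end
    return counts


if __name__ == "__main__":
    # Three heavy rows followed by six light rows, split over two ranks.
    print(balanced_row_counts([6, 6, 6, 1, 1, 1, 1, 1, 1], 2))
```

Each rank would then pass its own count as the local row size when creating A (for example through `MatSetSizes`), and, per the requirement quoted below, C would have to use the same local row sizes as A.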

Thanks,
Ian

> hong at aspiritech.org wrote on July 1, 2019, at 21:56:
> 
> Ian:
> The PETSc implementation of C = A*B requires that C have the same row ownership as A.
> I believe the distribution will be dominated by the dense matrices B and C, not the sparse matrix A. Have you implemented C = A*B and logged its performance?
> Hong
> 
> Hi,
> 
> I have recently been trying to compute the matrix product C = A*B, where A is a sparse MPIAIJ matrix and B and C are created as dense MPIDENSE matrices.
> 
> In matrix A, the nonzeros are not distributed evenly across the processors, meaning that with the default setting, where each processor owns a similar number of rows, the number of nonzeros owned by each processor differs significantly. So I want to use a different number of local rows for each processor. In this case, do the MPIDENSE matrices B and C need to have the same row layout as A?
> 
> I mean, is something like the following doable (on P0, A owns 3 rows while B and C own 2 rows)?
> 
> 
>            A                  B         C
> P0   o o o o | o           o o       o o
>      o o o o | o           o o       o o
>      o o o o | o     *     ---   =   ---
>      -------------         o o       o o
> P1   o o o o | o           o o       o o
> 
> In this case, the entries of B and C can be evenly distributed, which is more memory efficient.
> 
> But I am not sure whether this would make the communication more complicated and thus increase the overall wall time. Which of the following would you recommend?
> a) Let the rows of A and B both be evenly distributed
> b) Let A have a different row layout, with B and C evenly distributed
> c) Let A have a different row layout, with B and C following A
> 
> Or maybe there is a better way that I have not thought of.
> 
> Thanks a lot for your help, 
> Ian
