[petsc-users] Equivalent of all_reduce for sparse matrices

Fri May 9 07:19:24 CDT 2014

2014-05-09 13:14 GMT+0200, Matthew Knepley <knepley at gmail.com>:
> On Fri, May 9, 2014 at 3:15 AM, marco restelli <mrestelli at gmail.com> wrote:
>
>> 2014-05-09 5:47 GMT+0200, Jed Brown <jed at jedbrown.org>:
>> > marco restelli <mrestelli at gmail.com> writes:
>> >> Matt, thanks but this I don't understand. What I want is getting three
>> >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I
>> >> can send them around with MPI.
>> >>
>> >> MatGetSubmatrices would give me some PETSc objects, which I can not
>> >> pass to MPI, right?
>> >
>> > I'm not sure you want this, but you can use MatGetRowIJ and similar to
>> > access the representation you're asking for if you are dead set on
>> > depending on a specific data format rather than using generic
>> > interfaces.
>> >
>>
>> Jed, thank you. This is probably not the PETSc solution, but still it
>> might be a solution!
>>
>> I have found this example for MatGetRowIJ:
>>
>
> I really do not think you want to do this. It is complex, fragile and I
> believe the performance improvement to be non-existent. You can get the
> effect you want JUST by using one function. For example, suppose you want
> 2 procs to get rows [0,5] and two procs to get rows [1,3], then
>
> procs A, B
>
>   MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat)
>
> procs C, D
>
>   MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat)
>
> and it's done. No MPI, no extraction which depends on the Mat data
> structure.

Matt, I understand that the idea is to avoid using MPI, but I don't
see how getting a submatrix is related to my problem.

Probably a simpler version of my problem is the following:

one matrix is distributed on procs. 0,1
another matrix is distributed on procs. 2,3

The two matrices have the same size and I want to add them. For the
resulting matrix, I want two copies: one distributed among 0,1 and
the second one among 2,3.

A possibility that I see now is creating a third matrix, with the same
size, distributed among all the four processors: 0,1,2,3, setting it
to zero and then letting processors 0,1 add their matrix, and also 2,3
add their own. Then I could convert the result into two matrices,
making the two copies that I need.

This works provided that in MatAXPY I can use matrices distributed on
different processors: given that the function computes

Y = a*X + Y

in my case it would be
Y -> procs. 0,1,2,3
X -> procs. 0,1

Would this work?
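
To make the intended result concrete, here is a minimal sketch in plain
Python, with no PETSc or MPI involved. It assumes each matrix can be
pictured as a dictionary mapping (i,j) to a coefficient; the names axpy,
mat_01, mat_23 and the sample values are made up for illustration. It only
shows the arithmetic I am after, not how PETSc would treat matrices that
live on different processors.

```python
# Plain-Python sketch (no PETSc, no MPI): a sparse matrix is a dict
# mapping (row, col) -> coefficient. All names and values are hypothetical.

def axpy(y, a, x):
    """In-place Y = a*X + Y on sparse dicts keyed by (row, col)."""
    for key, coeff in x.items():
        y[key] = y.get(key, 0.0) + a * coeff
    return y

# Stand-ins for the matrix on procs. 0,1 and the matrix on procs. 2,3.
mat_01 = {(0, 0): 1.0, (0, 5): 2.0, (3, 3): 4.0}
mat_23 = {(0, 0): 10.0, (1, 3): 3.0, (3, 3): -4.0}

# Third matrix, set to zero (empty), standing in for the one that would
# be distributed among procs. 0,1,2,3.
total = {}
axpy(total, 1.0, mat_01)  # procs. 0,1 add their matrix
axpy(total, 1.0, mat_23)  # procs. 2,3 add their own

# Two independent copies of the sum, one per processor group.
copy_01 = dict(total)
copy_23 = dict(total)

# The (i, j, coeff) triplets of the result, sorted by row, then column.
triplets = sorted((i, j, c) for (i, j), c in total.items())
print(triplets)
```

Note that entries which cancel, such as (3,3) here, stay in the result as
explicit zeros; the sketch does not try to compress them away.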

Marco