[petsc-users] Scattering a vector to/from a subset of processors
Junchao Zhang
junchao.zhang at gmail.com
Wed Dec 6 11:12:04 CST 2023
Hi, Sreeram,
I made an example with your approach. It worked fine as you see the
output at the end
#include <petscvec.h>
int main(int argc, char **argv)
{
PetscInt i, j, rstart, rend, n, N, *indices;
PetscMPIInt size, rank;
IS ix;
VecScatter vscat;
Vec x, y;
PetscFunctionBeginUser;
PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL));
PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
PetscCall(VecCreate(PETSC_COMM_WORLD, &x));
PetscCall(VecSetFromOptions(x));
PetscCall(PetscObjectSetName((PetscObject)x, "Vec X"));
n = (rank < 4) ? 9 : 0;
PetscCall(VecSetSizes(x, n, PETSC_DECIDE));
PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i,
INSERT_VALUES));
PetscCall(VecAssemblyBegin(x));
PetscCall(VecAssemblyEnd(x));
PetscCall(VecGetSize(x, &N));
PetscCall(VecCreate(PETSC_COMM_WORLD, &y));
PetscCall(VecSetFromOptions(y));
PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y"));
PetscCall(VecSetSizes(y, PETSC_DECIDE, N));
PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
PetscCall(PetscMalloc1(rend - rstart, &indices));
for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j;
PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices,
PETSC_OWN_POINTER, &ix));
PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD));
PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD));
PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD));
PetscCall(VecScatterDestroy(&vscat));
PetscCall(ISDestroy(&ix));
PetscCall(VecDestroy(&x));
PetscCall(VecDestroy(&y));
PetscCall(PetscFinalize());
return 0;
}
$ mpirun -n 12 ./ex100
IS Object: 12 MPI processes
type: general
[0] Number of indices in set 3
[0] 0 0
[0] 1 12
[0] 2 24
[1] Number of indices in set 3
[1] 0 1
[1] 1 13
[1] 2 25
[2] Number of indices in set 3
[2] 0 2
[2] 1 14
[2] 2 26
[3] Number of indices in set 3
[3] 0 3
[3] 1 15
[3] 2 27
[4] Number of indices in set 3
[4] 0 4
[4] 1 16
[4] 2 28
[5] Number of indices in set 3
[5] 0 5
[5] 1 17
[5] 2 29
[6] Number of indices in set 3
[6] 0 6
[6] 1 18
[6] 2 30
[7] Number of indices in set 3
[7] 0 7
[7] 1 19
[7] 2 31
[8] Number of indices in set 3
[8] 0 8
[8] 1 20
[8] 2 32
[9] Number of indices in set 3
[9] 0 9
[9] 1 21
[9] 2 33
[10] Number of indices in set 3
[10] 0 10
[10] 1 22
[10] 2 34
[11] Number of indices in set 3
[11] 0 11
[11] 1 23
[11] 2 35
Vec Object: Vec X 12 MPI processes
type: mpi
Process [0]
0.
1.
2.
3.
4.
5.
6.
7.
8.
Process [1]
9.
10.
11.
12.
13.
14.
15.
16.
17.
Process [2]
18.
19.
20.
21.
22.
23.
24.
25.
26.
Process [3]
27.
28.
29.
30.
31.
32.
33.
34.
35.
Process [4]
Process [5]
Process [6]
Process [7]
Process [8]
Process [9]
Process [10]
Process [11]
Vec Object: Vec Y 12 MPI processes
type: mpi
Process [0]
0.
12.
24.
Process [1]
1.
13.
25.
Process [2]
2.
14.
26.
Process [3]
3.
15.
27.
Process [4]
4.
16.
28.
Process [5]
5.
17.
29.
Process [6]
6.
18.
30.
Process [7]
7.
19.
31.
Process [8]
8.
20.
32.
Process [9]
9.
21.
33.
Process [10]
10.
22.
34.
Process [11]
11.
23.
35.
--Junchao Zhang
On Tue, Dec 5, 2023 at 10:09 PM Sreeram R Venkat <srvenkat at utexas.edu>
wrote:
> Yes, I have an example code at github.com/s769/petsc-test. Only thing is,
> when I described the example before, I simplified the actual use case in
> the code to make things simpler. Here are the extra details relevant to
> this code:
>
> - We assume a 2D processor grid, given by the command-line args
> -proc_rows and -proc_cols
> - The total length of the vector is n_m*n_t given by command-line args
> -nm and -nt. n_m corresponds to a space index and n_t a time index.
> - In the "Start" phase, the vector is divided into n_m blocks each of
> size n_t (indexed space->time). The blocks are partitioned over the first
> row of processors. For example if -nm = 4 and -proc_cols = 4, each
> processor in the first row will get one block of size n_t. Each processor
> in the first row gets n_m_local blocks of size n_t, where the sum of all
> n_m_locals for the first row of processors is n_m.
> - In the "Finish" phase, the vector is divided into n_t blocks each of
> size n_m (indexed time->space; this is the reason for the permutation of
> indices). The blocks are partitioned over all processors. Each processor
> will get n_t_local blocks of size n_m, where the sum of all n_t_locals for
> all processors is n_t.
>
> I think the basic idea is similar to the previous example, but these
> details make things a bit more complicated. Please let me know if anything
> is unclear, and I can try to explain more.
>
> Thanks for your help,
> Sreeram
>
> On Tue, Dec 5, 2023 at 9:30 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> I think your approach is correct. Do you have an example code?
>>
>> --Junchao Zhang
>>
>>
>> On Tue, Dec 5, 2023 at 5:15 PM Sreeram R Venkat <srvenkat at utexas.edu>
>> wrote:
>>
>>> Hi, I have a follow up question on this.
>>>
>>> Now, I'm trying to do a scatter and permutation of the vector. Under the
>>> same setup as the original example, here are the new Start and Finish
>>> states I want to achieve:
>>> Start Finish
>>> Proc | Entries Proc | Entries
>>> 0 | 0,...,8 0 | 0, 12, 24
>>> 1 | 9,...,17 1 | 1, 13, 25
>>> 2 | 18,...,26 2 | 2, 14, 26
>>> 3 | 27,...,35 3 | 3, 15, 27
>>> 4 | None 4 | 4, 16, 28
>>> 5 | None 5 | 5, 17, 29
>>> 6 | None 6 | 6, 18, 30
>>> 7 | None 7 | 7, 19, 31
>>> 8 | None 8 | 8, 20, 32
>>> 9 | None 9 | 9, 21, 33
>>> 10 | None 10 | 10, 22, 34
>>> 11 | None 11 | 11, 23, 35
>>>
>>> So far, I've tried to use ISCreateGeneral(), with each process giving an
>>> idx array corresponding to the indices it wants (i.e. idx on P0 looks like
>>> [0,12,24] P1 -> [1,13, 25], and so on).
>>> Then I use this to create the VecScatter with VecScatterCreate(x, is, y,
>>> NULL, &scatter).
>>>
>>> However, when I try to do the scatter, I get some illegal memory access
>>> errors.
>>>
>>> Is there something wrong with how I define the index sets?
>>>
>>> Thanks,
>>> Sreeram
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Oct 5, 2023 at 12:57 PM Sreeram R Venkat <srvenkat at utexas.edu>
>>> wrote:
>>>
>>>> Thank you. This works for me.
>>>>
>>>> Sreeram
>>>>
>>>> On Wed, Oct 4, 2023 at 6:41 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi, Sreeram,
>>>>> You can try this code. Since x, y are both MPI vectors, we just need
>>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your
>>>>> example on the 12 processes would be [0..8], [9..17], [18..26], [27..35],
>>>>> [], ..., []. Actually, you can do it arbitrarily, say, with 12 index sets
>>>>> [0..17], [18..35], .., []. PETSc will figure out how to do the
>>>>> communication.
>>>>>
>>>>> PetscInt rstart, rend, N;
>>>>> IS ix;
>>>>> VecScatter vscat;
>>>>> Vec y;
>>>>> MPI_Comm comm;
>>>>> VecType type;
>>>>>
>>>>> PetscObjectGetComm((PetscObject)x, &comm);
>>>>> VecGetType(x, &type);
>>>>> VecGetSize(x, &N);
>>>>> VecGetOwnershipRange(x, &rstart, &rend);
>>>>>
>>>>> VecCreate(comm, &y);
>>>>> VecSetSizes(y, PETSC_DECIDE, N);
>>>>> VecSetType(y, type);
>>>>>
>>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart,
>>>>> 1, &ix);
>>>>> VecScatterCreate(x, ix, y, ix, &vscat);
>>>>>
>>>>> --Junchao Zhang
>>>>>
>>>>>
>>>>> On Wed, Oct 4, 2023 at 6:03 PM Sreeram R Venkat <srvenkat at utexas.edu>
>>>>> wrote:
>>>>>
>>>>>> Suppose I am running on 12 processors, and I have a vector "v" of
>>>>>> size 36 partitioned over the first 4. v still uses the PETSC_COMM_WORLD, so
>>>>>> it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to
>>>>>> repartition it over all 12 processors, so that the layout becomes (3, 3, 3,
>>>>>> ..., 3). I've been trying to use VecScatter to do this, but I'm not sure
>>>>>> what IndexSets to use for the sender and receiver.
>>>>>>
>>>>>> The result I am trying to achieve is this:
>>>>>>
>>>>>> Assume the vector is v = <0, 1, 2, ..., 35>
>>>>>>
>>>>>> Start Finish
>>>>>> Proc | Entries Proc | Entries
>>>>>> 0 | 0,...,8 0 | 0, 1, 2
>>>>>> 1 | 9,...,17 1 | 3, 4, 5
>>>>>> 2 | 18,...,26 2 | 6, 7, 8
>>>>>> 3 | 27,...,35 3 | 9, 10, 11
>>>>>> 4 | None 4 | 12, 13, 14
>>>>>> 5 | None 5 | 15, 16, 17
>>>>>> 6 | None 6 | 18, 19, 20
>>>>>> 7 | None 7 | 21, 22, 23
>>>>>> 8 | None 8 | 24, 25, 26
>>>>>> 9 | None 9 | 27, 28, 29
>>>>>> 10 | None 10 | 30, 31, 32
>>>>>> 11 | None 11 | 33, 34, 35
>>>>>>
>>>>>> Appreciate any help you can provide on this.
>>>>>>
>>>>>> Thanks,
>>>>>> Sreeram
>>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20231206/6ec83f20/attachment-0001.html>
More information about the petsc-users
mailing list