[petsc-users] Possibilities to run further computations based on intermediate results of VecScatter

Junchao Zhang junchao.zhang at gmail.com
Tue Oct 26 10:02:13 CDT 2021


Hi, Hannes,
  It looks like your concern is not having enough memory to store the vector
entries needed in SpMV (rather than the performance one might gain by doing
computation immediately upon arrival of data from neighbors). Please note that
in petsc's SpMV, a process does not need to store the whole vector locally; it
only needs some entries (e.g., those corresponding to its nonzero columns). If
storing these sparse entries causes memory-consumption problems for you, then I
wonder how you store your matrix, which presumably needs even more memory.

With that said, petsc's VecScatter does not have something like
VecScatterWaitAny(). For your experiment, you can leverage the info provided by
the VecScatter, since the communication-analysis part is the hard part. Let's
say you created a VecScatter *sf* to scatter an MPI Vec x to a sequential Vec y
(note that PetscSF and VecScatter are the same type).

Call PetscSFGetLeafRanks(PetscSF sf, PetscInt *niranks, const PetscMPIInt
**iranks, const PetscInt **ioffset, const PetscInt **irootloc) to get the send
info (see
https://petsc.org/release/docs/manualpages/PetscSF/PetscSFGetLeafRanks.html):

niranks: number of MPI ranks to which this rank wants to send entries of x

iranks[]: of length niranks, storing MPI ranks mentioned above

ioffset[]: of length niranks+1, storing offsets into irootloc[].

irootloc[]: irootloc[ioffset[i]..ioffset[i+1]-1] stores the (local) indices
of the entries of x that should be sent to iranks[i]
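
In code this could look like the following fragment (untested; it assumes it
sits inside a function that has the VecScatter sf and the parallel Vec x in
scope):

  PetscErrorCode     ierr;
  PetscInt           niranks, i;
  const PetscMPIInt *iranks;
  const PetscInt    *ioffset, *irootloc;

  ierr = PetscSFGetLeafRanks(sf, &niranks, &iranks, &ioffset, &irootloc);CHKERRQ(ierr);
  for (i = 0; i < niranks; i++) {
    /* the entries x[irootloc[j]] with ioffset[i] <= j < ioffset[i+1]
       are the ones this rank has to send to rank iranks[i] */
  }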


Call PetscSFGetRootRanks(PetscSF sf, PetscInt *nranks, const PetscMPIInt
**ranks, const PetscInt **roffset, const PetscInt **rmine, const PetscInt
**rremote) to get the receive info (see
https://petsc.org/release/docs/manualpages/PetscSF/PetscSFGetRootRanks.html):

nranks: number of MPI ranks from which this rank will receive entries of x

ranks[]: of length nranks, storing MPI ranks mentioned above

roffset[]: of length nranks+1, storing offsets into rmine[].

rmine[]: rmine[roffset[i]..roffset[i+1]-1] stores the (local) indices of
the entries of y that should receive data from ranks[i]
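
Again as an untested fragment (assuming sf and the sequential Vec y are in
scope):

  PetscErrorCode     ierr;
  PetscInt           nranks, i;
  const PetscMPIInt *ranks;
  const PetscInt    *roffset, *rmine, *rremote;

  ierr = PetscSFGetRootRanks(sf, &nranks, &ranks, &roffset, &rmine, &rremote);CHKERRQ(ierr);
  for (i = 0; i < nranks; i++) {
    /* the entries y[rmine[j]] with roffset[i] <= j < roffset[i+1]
       will be filled with data received from rank ranks[i] */
  }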


Using the above info, you can allocate send/recv buffers, post
MPI_Isend/MPI_Irecv, and call MPI_Waitany() on the MPI_Requests returned from
MPI_Irecv. You can use
PetscCommGetNewTag(PetscObjectComm((PetscObject)sf), &newtag) to get a
good MPI tag for your own MPI_Isend/MPI_Irecv, as sketched below.
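
Putting this together, here is a rough, untested sketch of the whole procedure
(the function name ScatterAndComputeEarly, the buffer layout, and the spot
marked for your partial computation are just illustrations, not PETSc API):

#include <petscvec.h>
#include <petscsf.h>

/* Rough, untested sketch: hand-rolled scatter of x into y that processes
   each block as soon as it arrives. sf is the VecScatter (a PetscSF). */
PetscErrorCode ScatterAndComputeEarly(VecScatter sf, Vec x, Vec y)
{
  PetscErrorCode     ierr;
  MPI_Comm           comm = PetscObjectComm((PetscObject)sf);
  PetscInt           niranks, nranks, i, j;
  const PetscMPIInt *iranks, *ranks;
  const PetscInt    *ioffset, *irootloc, *roffset, *rmine, *rremote;
  const PetscScalar *xarr;
  PetscScalar       *yarr, *sendbuf, *recvbuf;
  MPI_Request       *sreqs, *rreqs;
  PetscMPIInt        tag, widx;

  PetscFunctionBeginUser;
  ierr = PetscSFGetLeafRanks(sf, &niranks, &iranks, &ioffset, &irootloc);CHKERRQ(ierr);
  ierr = PetscSFGetRootRanks(sf, &nranks, &ranks, &roffset, &rmine, &rremote);CHKERRQ(ierr);
  ierr = PetscCommGetNewTag(comm, &tag);CHKERRQ(ierr);
  ierr = PetscMalloc4(ioffset[niranks], &sendbuf, roffset[nranks], &recvbuf,
                      niranks, &sreqs, nranks, &rreqs);CHKERRQ(ierr);

  /* post one receive per sending rank, into its slice of recvbuf */
  for (i = 0; i < nranks; i++) {
    ierr = MPI_Irecv(recvbuf + roffset[i], (PetscMPIInt)(roffset[i+1] - roffset[i]),
                     MPIU_SCALAR, ranks[i], tag, comm, &rreqs[i]);CHKERRMPI(ierr);
  }

  /* pack and send the entries of x that each destination rank needs */
  ierr = VecGetArrayRead(x, &xarr);CHKERRQ(ierr);
  for (j = 0; j < ioffset[niranks]; j++) sendbuf[j] = xarr[irootloc[j]];
  for (i = 0; i < niranks; i++) {
    ierr = MPI_Isend(sendbuf + ioffset[i], (PetscMPIInt)(ioffset[i+1] - ioffset[i]),
                     MPIU_SCALAR, iranks[i], tag, comm, &sreqs[i]);CHKERRMPI(ierr);
  }
  ierr = VecRestoreArrayRead(x, &xarr);CHKERRQ(ierr);

  /* consume blocks in whatever order they arrive */
  ierr = VecGetArray(y, &yarr);CHKERRQ(ierr);
  for (i = 0; i < nranks; i++) {
    ierr = MPI_Waitany((PetscMPIInt)nranks, rreqs, &widx, MPI_STATUS_IGNORE);CHKERRMPI(ierr);
    for (j = roffset[widx]; j < roffset[widx+1]; j++) yarr[rmine[j]] = recvbuf[j];
    /* ... do your partial computation here with the block
       y[rmine[roffset[widx]..roffset[widx+1]-1]] received from ranks[widx] ... */
  }
  ierr = VecRestoreArray(y, &yarr);CHKERRQ(ierr);

  ierr = MPI_Waitall((PetscMPIInt)niranks, sreqs, MPI_STATUSES_IGNORE);CHKERRMPI(ierr);
  ierr = PetscFree4(sendbuf, recvbuf, sreqs, rreqs);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

One thing to keep in mind: ranks[]/iranks[] may include this rank itself (for
locally owned entries), in which case the sketch above simply routes those
entries through MPI self-messages; you may want to handle them with a plain
local copy instead.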


--Junchao Zhang


On Tue, Oct 26, 2021 at 7:41 AM Hannes Phil Niklas Brandt <
s6hsbran at uni-bonn.de> wrote:

> Hello,
>
> I am interested in the non-blocking, collective communication of Petsc-Vecs.
> Right now I am using VecScatterBegin and VecScatterEnd to scatter different
> entries of a parallel distributed MPI-Vec to local sequential vectors on each
> process. After the call to VecScatterEnd I perform separate computations on
> each block of the sequential Vec corresponding to a process.
> However, I would prefer to use each block of the local sequential Vec for
> those further computations as soon as I receive it from the corresponding
> process (so I do not want to wait for the whole scattering to finish). Are
> there functionalities in Petsc capable of this?
>
> I am trying to compute the matrix-vector-product for a parallel distributed
> MPI-Vec and a parallel distributed sparse matrix format I implemented myself.
> Each process needs entries from the whole MPI-Vec for the product, but does
> not have enough storage capacity to store those entries all at once, not
> even in a sparse format. Therefore, I need to process the entries in small
> blocks and add the results onto a local result vector.
>
> Best
> Regards
> Hannes
>