<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Matt,<br>
      Here is the process as it currently stands:<br>
    <br>
    1) I have a PETSc Vec (sol), which comes from a KSPSolve.<br>
    <br>
    2) Each processor grabs its section of sol via VecGetOwnershipRange
    and VecGetArrayReadF90<br>
    and copies parts of that section into a local array (locarr)
    using a complex but easily computable mapping (sketched below, after
    the summary).<br>
    <br>
    3) The routine you are looking at then exchanges various parts of
    locarr between the processors.<br>
    <br>
    4) Each processor then does computations using its updated locarr.<br>
    <br>
    Typing it out this way, I guess the answer to your question is
    "yes."  I have a global Vec and I want its values<br>
    sent in a complex but computable way to local vectors on each
    process.<br>
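    <br>
    A minimal sketch of steps 1 and 2 (the mapping is reduced to a
    placeholder, and names other than sol and locarr are illustrative,
    not the actual code):<br>
    <pre>
subroutine extract_local(sol, nloc, locarr)
#include &lt;petsc/finclude/petscvec.h&gt;
  use petscvec
  implicit none
  Vec                  :: sol           ! solution from KSPSolve
  PetscInt             :: nloc          ! size of the local array
  PetscScalar          :: locarr(nloc)
  PetscScalar, pointer :: xx(:)
  PetscInt             :: istart, iend, i, k
  PetscErrorCode       :: ierr

  ! Steps 1-2: each rank reads only the entries of sol that it owns
  call VecGetOwnershipRange(sol, istart, iend, ierr)
  call VecGetArrayReadF90(sol, xx, ierr)
  do i = 1, iend - istart
     ! k = ... stands in for the complex but easily computable mapping
     ! from owned entry i to its slot in locarr (placeholder: identity)
     k = i
     locarr(k) = xx(i)
  end do
  call VecRestoreArrayReadF90(sol, xx, ierr)
end subroutine extract_local
</pre>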
    <br>
    -sanjay<br>
    <pre class="moz-signature" cols="72">
</pre>
    <div class="moz-cite-prefix">On 5/31/19 3:37 AM, Matthew Knepley
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAMYG4Gk_eccMW8e2k0DMZTxQcFcU+AqtUmM0UAgnaF=qFGCrdg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div dir="ltr">On Thu, May 30, 2019 at 11:55 PM Sanjay Govindjee
          via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov"
            moz-do-not-send="true">petsc-users@mcs.anl.gov</a>>
          wrote:<br>
        </div>
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div bgcolor="#FFFFFF"> Hi Juanchao,<br>
              Thanks for the hints below; they will take some time to
              absorb, as the vectors that are being moved around<br>
              are actually partly PETSc vectors and partly local process
              vectors.<br>
            </div>
          </blockquote>
          <div><br>
          </div>
          <div>Is this code just doing a global-to-local map? Meaning,
            does it just map all the local unknowns to some global</div>
          <div>unknown on some process? We have an even simpler
            interface for that, where we make the VecScatter</div>
          <div>automatically,</div>
          <div><br>
          </div>
          <div>  <a
href="https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISLocalToGlobalMappingCreate.html#ISLocalToGlobalMappingCreate"
              moz-do-not-send="true">https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISLocalToGlobalMappingCreate.html#ISLocalToGlobalMappingCreate</a></div>
          <div><br>
          </div>
          <div>Then you can use it with Vecs, Mats, etc.</div>
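          <div><br>
          </div>
          <div>For example, in Fortran (a hedged sketch; the routine and
            index-array names are assumptions, not taken from your code):</div>
          <pre>
subroutine attach_l2g(sol, nloc, globalidx)
#include &lt;petsc/finclude/petscis.h&gt;
#include &lt;petsc/finclude/petscvec.h&gt;
  use petscis
  use petscvec
  implicit none
  Vec                    :: sol
  PetscInt               :: nloc
  PetscInt               :: globalidx(nloc)  ! global index of local unknown i
  ISLocalToGlobalMapping :: l2g
  PetscErrorCode         :: ierr

  ! Build the local-to-global mapping once from the application's numbering
  call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, nloc, globalidx, &
       PETSC_COPY_VALUES, l2g, ierr)
  call VecSetLocalToGlobalMapping(sol, l2g, ierr)
  ! Afterwards, values can be set or added with local indices, e.g.
  !   call VecSetValuesLocal(sol, m, locidx, vals, INSERT_VALUES, ierr)
  call ISLocalToGlobalMappingDestroy(l2g, ierr)
end subroutine attach_l2g
</pre>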
          <div><br>
          </div>
          <div>  Thanks,</div>
          <div><br>
          </div>
          <div>     Matt</div>
          <div> </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div bgcolor="#FFFFFF"> Attached is the modified routine
              that now works (on leaking memory) with openmpi.<br>
              <br>
              -sanjay<br>
              <div class="gmail-m_-6089453002349408992moz-cite-prefix">On
                5/30/19 8:41 PM, Zhang, Junchao wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">
                  <div><br>
                    Hi, Sanjay,</div>
                  <div>  Could you send your modified data exchange code
                    (psetb.F) with MPI_Waitall? See other inlined
                    comments below. Thanks.</div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Thu, May 30,
                      2019 at 1:49 PM Sanjay Govindjee via petsc-users
                      <<a href="mailto:petsc-users@mcs.anl.gov"
                        target="_blank" moz-do-not-send="true">petsc-users@mcs.anl.gov</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"> Lawrence,<br>
                      Thanks for taking a look!  This is what I had been
                      wondering about -- my <br>
                      knowledge of MPI is pretty minimal, and<br>
                      the routine originated with a programmer we hired a decade+ <br>
                      back from NERSC.  I'll have to look into<br>
                      VecScatter.  It will be great to dispense with our
                      roll-your-own <br>
                      routines (we even have our own reduceALL scattered
                      around the code).<br>
                    </blockquote>
                    <div>PETSc VecScatter has a very simple interface,
                      and you should definitely go with it.  With
                      VecScatter, you can think in terms of familiar
                      vectors and indices instead of low-level
                      MPI_Send/Recv.  Besides that, PETSc has optimized
                      VecScatter so that communication is efficient.<br>
                    </div>
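                    <div>For instance, something along these lines (a
                      sketch only; locvec, globalidx, and nloc are
                      placeholder names for the sequential target vector
                      and the computed index list):</div>
                    <pre>
subroutine scatter_to_local(sol, nloc, globalidx, locvec)
#include &lt;petsc/finclude/petscis.h&gt;
#include &lt;petsc/finclude/petscvec.h&gt;
  use petscis
  use petscvec
  implicit none
  Vec            :: sol, locvec      ! locvec: Seq Vec of length nloc per rank
  PetscInt       :: nloc
  PetscInt       :: globalidx(nloc)  ! global entry of sol wanted in slot i
  IS             :: from
  VecScatter     :: ctx
  PetscErrorCode :: ierr

  call ISCreateGeneral(PETSC_COMM_SELF, nloc, globalidx, &
       PETSC_COPY_VALUES, from, ierr)
  ! Scatter the requested global entries of sol into locvec, in order
  call VecScatterCreate(sol, from, locvec, PETSC_NULL_IS, ctx, ierr)
  call VecScatterBegin(ctx, sol, locvec, INSERT_VALUES, SCATTER_FORWARD, ierr)
  call VecScatterEnd(ctx, sol, locvec, INSERT_VALUES, SCATTER_FORWARD, ierr)
  call VecScatterDestroy(ctx, ierr)
  call ISDestroy(from, ierr)
end subroutine scatter_to_local
</pre>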
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"> <br>
                      Interestingly, MPI_Waitall has solved the
                      problem when using OpenMPI, <br>
                      but it still persists with MPICH.  Graphs
                      attached.<br>
                      I'm going to run with OpenMPI for now (but I guess
                      I really still need <br>
                      to figure out what is wrong with MPICH and
                      MPI_Waitall;<br>
                      I'll try Barry's suggestion of <br>
--download-mpich-configure-arguments="--enable-error-messages=all <br>
                      --enable-g" later today and report back).<br>
                      <br>
                      Regarding MPI_Barrier, it was put in due to a problem
                      where some processes <br>
                      were finishing their sends and receives and
                      exiting the subroutine<br>
                      before the receiving processes had completed
                      (which resulted in data <br>
                      loss, as the buffers are freed after the call to
                      the routine). <br>
                      MPI_Barrier was the solution proposed<br>
                      to us.  I don't think I can dispense with it, but
                      will think about it some <br>
                      more.</blockquote>
                    <div>After MPI_Send(), or after MPI_Isend(..,req)
                      and MPI_Wait(req), you can safely free the send
                      buffer without worry that the receive has not
                      completed. MPI guarantees the receiver can get the
                      data, for example, through internal buffering.</div>
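                    <div>In code, the guarantee looks like this (a
                      minimal MPI-only sketch with placeholder names):</div>
                    <pre>
subroutine send_then_free(n, dest, comm)
  implicit none
  include 'mpif.h'
  integer :: n, dest, comm, req, ierr
  double precision, allocatable :: buf(:)

  allocate(buf(n))
  buf = 0.0d0       ! ... fill the send buffer ...
  call MPI_Isend(buf, n, MPI_DOUBLE_PRECISION, dest, 0, comm, req, ierr)
  call MPI_Wait(req, MPI_STATUS_IGNORE, ierr)
  ! Safe: once MPI_Wait (or a plain MPI_Send) returns, MPI no longer
  ! needs the send buffer, whether or not the receive has completed.
  deallocate(buf)
end subroutine send_then_free
</pre>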
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"> <br>
                      I'm not so sure about using MPI_Irecv, as it will
                      require a bit of <br>
                      rewriting: right now I process the received<br>
                      data sequentially after each blocking MPI_Recv --
                      clearly slower but <br>
                      easier to code.<br>
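                      <br>
                      (One possibility, sketched here with placeholder
                      names: post the MPI_Irecv calls up front and then
                      handle each message as it completes with
                      MPI_Waitany, which keeps the one-message-at-a-time
                      processing.)<br>
                      <pre>
subroutine recv_and_process(nrecv, source, rcount, maxlen, recvbuf, comm)
  implicit none
  include 'mpif.h'
  integer :: nrecv, maxlen, comm
  integer :: source(nrecv), rcount(nrecv)
  double precision :: recvbuf(maxlen, nrecv)
  integer :: reqs(nrecv), i, idx, ierr

  ! Post all the receives first ...
  do i = 1, nrecv
     call MPI_Irecv(recvbuf(1,i), rcount(i), MPI_DOUBLE_PRECISION, &
                    source(i), 0, comm, reqs(i), ierr)
  end do
  ! ... then handle each one as soon as it arrives (completion order)
  do i = 1, nrecv
     call MPI_Waitany(nrecv, reqs, idx, MPI_STATUS_IGNORE, ierr)
     ! ... process recvbuf(:, idx) here, just as after a blocking MPI_Recv
  end do
end subroutine recv_and_process
</pre>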
                      <br>
                      Thanks again for the help.<br>
                      <br>
                      -sanjay<br>
                      <br>
                      On 5/30/19 4:48 AM, Lawrence Mitchell wrote:<br>
                      > Hi Sanjay,<br>
                      ><br>
                      >> On 30 May 2019, at 08:58, Sanjay
                      Govindjee via petsc-users <<a
                        href="mailto:petsc-users@mcs.anl.gov"
                        target="_blank" moz-do-not-send="true">petsc-users@mcs.anl.gov</a>>
                      wrote:<br>
                      >><br>
                      >> The problem seems to persist but with a
                      different signature.  Graphs attached as before.<br>
                      >><br>
                      >> Totals with MPICH (NB: single run)<br>
                      >><br>
                      >> For the CG/Jacobi         
                      data_exchange_total = 41,385,984; kspsolve_total =
                      38,289,408<br>
                      >> For the GMRES/BJACOBI     
                      data_exchange_total = 41,324,544; kspsolve_total =
                      41,324,544<br>
                      >><br>
                      >> Just reading the MPI docs I am wondering
                      if I need some sort of MPI_Wait/MPI_Waitall before
                      my MPI_Barrier in the data exchange routine?<br>
                      >> I would have thought that with the blocking
                      receives and the MPI_Barrier everything would have
                      fully completed and been cleaned up before<br>
                      >> all processes exited the routine, but
                      perhaps I am wrong on that.<br>
                      ><br>
                      > Skimming the fortran code you sent you do:<br>
                      ><br>
                      > for i in ...:<br>
                      >     call MPI_Isend(..., req, ierr)<br>
                      ><br>
                      > for i in ...:<br>
                      >     call MPI_Recv(..., ierr)<br>
                      ><br>
                      > But you never call MPI_Wait on the request
                      you got back from the Isend. So the MPI library
                      will never free the data structures it created.<br>
                      ><br>
                      > The usual pattern for these non-blocking
                      communications is to allocate an array for the
                      requests of length nsend+nrecv and then do:<br>
                      ><br>
                      > for i in nsend:<br>
                      >     call MPI_Isend(..., req[i], ierr)<br>
                      > for j in nrecv:<br>
                      >     call MPI_Irecv(..., req[nsend+j], ierr)<br>
                      ><br>
                      > call MPI_Waitall(req, ..., ierr)<br>
                      ><br>
                      > I note also that there's no need for the Barrier
                      at the end of the routine; this kind of
                      communication does neighbourwise synchronisation,
                      so there is no need to add (unnecessary) global
                      synchronisation too.<br>
                      ><br>
                      > As an aside, is there a reason you don't use
                      PETSc's VecScatter to manage this global to local
                      exchange?<br>
                      ><br>
                      > Cheers,<br>
                      ><br>
                      > Lawrence<br>
                      <br>
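                      For reference, the Isend/Irecv/Waitall pattern
                      above, fleshed out as a sketch (the buffer layout
                      and neighbour lists here are illustrative
                      assumptions, not the layout in psetb.F):<br>
                      <pre>
subroutine exchange(nsend, nrecv, sendto, recvfrom, scount, rcount, &
                    maxlen, sendbuf, recvbuf, comm)
  implicit none
  include 'mpif.h'
  integer :: nsend, nrecv, maxlen, comm
  integer :: sendto(nsend), recvfrom(nrecv)
  integer :: scount(nsend), rcount(nrecv)
  double precision :: sendbuf(maxlen, nsend), recvbuf(maxlen, nrecv)
  integer :: reqs(nsend+nrecv), i, ierr

  ! Post every non-blocking send and receive, keeping all the requests
  do i = 1, nsend
     call MPI_Isend(sendbuf(1,i), scount(i), MPI_DOUBLE_PRECISION, &
                    sendto(i), 0, comm, reqs(i), ierr)
  end do
  do i = 1, nrecv
     call MPI_Irecv(recvbuf(1,i), rcount(i), MPI_DOUBLE_PRECISION, &
                    recvfrom(i), 0, comm, reqs(nsend+i), ierr)
  end do

  ! Completing every request lets MPI free its internal structures and
  ! makes the buffers safe to reuse; the exchange is synchronised
  ! neighbourwise, so no trailing MPI_Barrier is needed.
  call MPI_Waitall(nsend+nrecv, reqs, MPI_STATUSES_IGNORE, ierr)
end subroutine exchange
</pre>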
                    </blockquote>
                  </div>
                </div>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        <div dir="ltr" class="gmail_signature">
          <div dir="ltr">
            <div>
              <div dir="ltr">
                <div>
                  <div dir="ltr">
                    <div>What most experimenters take for granted before
                      they begin their experiments is infinitely more
                      interesting than any results to which their
                      experiments lead.<br>
                      -- Norbert Wiener</div>
                    <div><br>
                    </div>
                    <div><a href="http://www.cse.buffalo.edu/~knepley/"
                        target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>