<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On May 31, 2019, at 9:50 PM, Sanjay Govindjee via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" class="">petsc-users@mcs.anl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
  
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" class="">
  
  <div text="#000000" bgcolor="#FFFFFF" class="">
    Matt,<br class="">
      Here is the process as it currently stands:<br class="">
    <br class="">
    1) I have a PETSc Vec (sol), which comes from a KSPSolve<br class="">
    <br class="">
    2) Each processor grabs its section of sol via VecGetOwnershipRange
    and VecGetArrayReadF90<br class="">
    and inserts parts of its section of sol in a local array (locarr)
    using a complex but easily computable mapping.<br class="">
    <br class="">
    3) The routine you are looking at then exchanges various parts of
    the locarr between the processors.<br class="">
    <br class=""></div></div></blockquote><div><br class=""></div><div>You need a VecScatter object <a href="https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate" class="">https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate</a> </div><br class=""><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class="">
    4) Each processor then does computations using its updated locarr.<br class="">
    <br class="">
    Typing it out this way, I guess the answer to your question is
    "yes."  I have a global Vec and I want its values<br class="">
    sent in a complex but computable way to local vectors on each
    process.<br class="">
    <br class="">
    -sanjay<br class="">
    <pre class="moz-signature" cols="72"></pre>
    <div class="moz-cite-prefix">On 5/31/19 3:37 AM, Matthew Knepley
      wrote:<br class="">
    </div>
    <blockquote type="cite" cite="mid:CAMYG4Gk_eccMW8e2k0DMZTxQcFcU+AqtUmM0UAgnaF=qFGCrdg@mail.gmail.com" class="">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8" class="">
      <div dir="ltr" class="">
        <div dir="ltr" class="">On Thu, May 30, 2019 at 11:55 PM Sanjay Govindjee
          via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" moz-do-not-send="true" class="">petsc-users@mcs.anl.gov</a>>
          wrote:<br class="">
        </div>
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div bgcolor="#FFFFFF" class=""> Hi Junchao,<br class="">
              Thanks for the hints below; they will take some time to
              absorb, as the vectors that are being moved around<br class="">
              are actually partly petsc vectors and partly local process
              vectors.<br class="">
            </div>
          </blockquote>
          <div class=""><br class="">
          </div>
          <div class="">Is this code just doing a global-to-local map? Meaning,
            does it just map all the local unknowns to some global</div>
          <div class="">unknown on some process? We have an even simpler
            interface for that, where we make the VecScatter</div>
          <div class="">automatically,</div>
          <div class=""><br class="">
          </div>
          <div class="">  <a href="https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISLocalToGlobalMappingCreate.html#ISLocalToGlobalMappingCreate" moz-do-not-send="true" class="">https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISLocalToGlobalMappingCreate.html#ISLocalToGlobalMappingCreate</a></div>
          <div class=""><br class="">
          </div>
          <div class="">Then you can use it with Vecs, Mats, etc.</div>
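          <div class=""><br class="">
          </div>
          <div class="">A rough sketch of what such a mapping encodes, written in plain Python rather than against the PETSc API (the names apply_mapping, add_local_values, and local_to_global are made up for illustration): each process stores, for every local index, the matching global index, and local contributions are then added at the translated positions. In PETSc the global entries are distributed across processes; here a single list stands in for them.</div>
          <pre class="">
```python
# Conceptual sketch only: plain lists stand in for PETSc objects.
# Every name below is hypothetical and not part of the PETSc API.

def apply_mapping(local_to_global, local_indices):
    """Translate local indices into the global indices they refer to."""
    return [local_to_global[i] for i in local_indices]

def add_local_values(global_values, local_to_global, local_indices, values):
    """Add each value at the global position of its local index."""
    targets = apply_mapping(local_to_global, local_indices)
    for g, v in zip(targets, values):
        global_values[g] += v

# Assumed layout: this process numbers its unknowns 0..4 locally,
# and they correspond to global unknowns 3..7.
local_to_global = [3, 4, 5, 6, 7]

global_values = [0.0] * 8
add_local_values(global_values, local_to_global, [0, 2], [1.5, 2.5])

print(apply_mapping(local_to_global, [0, 2]))
print(global_values)
```
</pre>
          <div class="">Once the index list exists, the calling code only ever works with local numbering; the translation to global numbering happens in one place.</div>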
          <div class=""><br class="">
          </div>
          <div class="">  Thanks,</div>
          <div class=""><br class="">
          </div>
          <div class="">     Matt</div>
          <div class=""> </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div bgcolor="#FFFFFF" class=""> Attached is the modified routine
              that now works (no leaking memory) with OpenMPI.<br class="">
              <br class="">
              -sanjay<br class="">
              <div class="gmail-m_-6089453002349408992moz-cite-prefix">On
                5/30/19 8:41 PM, Zhang, Junchao wrote:<br class="">
              </div>
              <blockquote type="cite" class="">
                <div dir="ltr" class="">
                  <div class=""><br class="">
                    Hi, Sanjay,</div>
                  <div class="">  Could you send your modified data exchange code
                    (psetb.F) with MPI_Waitall? See other inlined
                    comments below. Thanks.</div>
                  <br class="">
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Thu, May 30,
                      2019 at 1:49 PM Sanjay Govindjee via petsc-users
                      <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank" moz-do-not-send="true" class="">petsc-users@mcs.anl.gov</a>>
                      wrote:<br class="">
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"> Lawrence,<br class="">
                      Thanks for taking a look!  This is what I had been
                      wondering about -- my <br class="">
                      knowledge of MPI is pretty minimal and<br class="">
                      the origins of the routine were from a programmer
                      we hired a decade+ <br class="">
                      back from NERSC.  I'll have to look into<br class="">
                      VecScatter.  It will be great to dispense with our
                      roll-your-own <br class="">
                      routines (we even have our own reduceALL scattered
                      around the code).<br class="">
                    </blockquote>
                    <div class="">PETSc VecScatter has a very simple interface
                      and you should definitely go with it.  With
                      VecScatter, you can think in familiar vectors and
                      indices instead of the low level MPI_Send/Recv.
                      Besides that, PETSc has optimized VecScatter so
                      that communication is efficient.<br class="">
                    </div>
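                    <div class="">To make "vectors and indices" concrete, here is a minimal model of what a forward scatter computes, in plain Python and without any communication. It is only a sketch: scatter_forward is a hypothetical helper, not a PETSc call, and the index values are invented. The point is that the whole exchange is described by two index lists, one into the source vector and one into the destination.</div>
                    <pre class="">
```python
# Conceptual sketch only: plain lists stand in for PETSc Vecs and index sets.
# scatter_forward is a hypothetical helper, not a PETSc call.

def scatter_forward(x, ix, y, iy, add=False):
    """Copy (or add) x[ix[k]] into y[iy[k]] for every k."""
    if len(ix) != len(iy):
        raise ValueError("index lists must have the same length")
    for src, dst in zip(ix, iy):
        if add:
            y[dst] += x[src]
        else:
            y[dst] = x[src]
    return y

# Stand-in for the global solution vector (values are arbitrary).
sol = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]

# Assumed mapping: this process wants global entries 6, 2, 4
# in slots 0, 1, 2 of its local array.
ix = [6, 2, 4]
iy = [0, 1, 2]
locarr = scatter_forward(sol, ix, [0.0] * 3, iy)
print(locarr)
```
</pre>
                    <div class="">In a real run the entries of sol live on different processes, and the library works out which messages are needed from the two index lists.</div>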
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"> <br class="">
                      Interestingly, the MPI_WaitALL has solved the
                      problem when using OpenMPI <br class="">
                      but it still persists with MPICH.  Graphs
                      attached.<br class="">
                      I'm going to run with openmpi for now (but I guess
                      I really still need <br class="">
                      to figure out what is wrong with MPICH and
                      WaitALL;<br class="">
                      I'll try Barry's suggestion of <br class="">
--download-mpich-configure-arguments="--enable-error-messages=all <br class="">
                      --enable-g" later today and report back).<br class="">
                      <br class="">
                      Regarding MPI_Barrier, it was put in due to a problem
                      that some processes <br class="">
                      were finishing up sending and receiving and
                      exiting the subroutine<br class="">
                      before the receiving processes had completed
                      (which resulted in data <br class="">
                      loss as the buffers are freed after the call to
                      the routine). <br class="">
                      MPI_Barrier was the solution proposed<br class="">
                      to us.  I don't think I can dispense with it, but
                      will think about it some <br class="">
                      more.</blockquote>
                    <div class="">After MPI_Send(), or after MPI_Isend(..,req)
                      and MPI_Wait(req), you can safely free the send
                      buffer without worrying that the receive has not
                      completed. MPI guarantees the receiver can get the
                      data, for example, through internal buffering.</div>
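                    <div class="">A toy model of the internal-buffering case, in plain Python (ToyChannel is invented for this sketch and is not an MPI binding): the send copies the data, so the sender can clear its buffer before the receiver has run. A real MPI library may instead hold the sender until the matching receive is posted and copy directly; either way, the buffer is safe to reuse once the send has returned or the wait has completed.</div>
                    <pre class="">
```python
from collections import deque

# Toy model only: ToyChannel is a hypothetical stand-in, not an MPI API.
class ToyChannel:
    """Message channel that keeps its own copy of every sent buffer."""

    def __init__(self):
        self._queue = deque()

    def send(self, buffer):
        # Copy on send: after this returns, the caller owns `buffer` again.
        self._queue.append(list(buffer))

    def recv(self):
        # Hand out the oldest buffered message.
        return self._queue.popleft()

channel = ToyChannel()
send_buffer = [1.0, 2.0, 3.0]

channel.send(send_buffer)
send_buffer.clear()          # "free" the buffer before anyone has received

received = channel.recv()    # the receiver still gets the original data
print(received)
```
</pre>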
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"> <br class="">
                      I'm not so sure about using MPI_IRecv as it will
                      require a bit of <br class="">
                      rewriting since right now I process the received<br class="">
                      data sequentially after each blocking MPI_Recv --
                      clearly slower but <br class="">
                      easier to code.<br class="">
                      <br class="">
                      Thanks again for the help.<br class="">
                      <br class="">
                      -sanjay<br class="">
                      <br class="">
                      On 5/30/19 4:48 AM, Lawrence Mitchell wrote:<br class="">
                      > Hi Sanjay,<br class="">
                      ><br class="">
                      >> On 30 May 2019, at 08:58, Sanjay
                      Govindjee via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank" moz-do-not-send="true" class="">petsc-users@mcs.anl.gov</a>>
                      wrote:<br class="">
                      >><br class="">
                      >> The problem seems to persist but with a
                      different signature.  Graphs attached as before.<br class="">
                      >><br class="">
                      >> Totals with MPICH (NB: single run)<br class="">
                      >><br class="">
                      >> For the CG/Jacobi         
                      data_exchange_total = 41,385,984; kspsolve_total =
                      38,289,408<br class="">
                      >> For the GMRES/BJACOBI     
                      data_exchange_total = 41,324,544; kspsolve_total =
                      41,324,544<br class="">
                      >><br class="">
                      >> Just reading the MPI docs I am wondering
                      if I need some sort of MPI_Wait/MPI_Waitall before
                      my MPI_Barrier in the data exchange routine?<br class="">
                      >> I would have thought that with the
                      blocking receives and the MPI_Barrier that
                      everything will have fully completed and cleaned
                      up before<br class="">
                      >> all processes exited the routine, but
                      perhaps I am wrong on that.<br class="">
                      ><br class="">
                      > Skimming the fortran code you sent you do:<br class="">
                      ><br class="">
                      > for i in ...:<br class="">
                      >     call MPI_Isend(..., req, ierr)<br class="">
                      ><br class="">
                      > for i in ...:<br class="">
                      >     call MPI_Recv(..., ierr)<br class="">
                      ><br class="">
                      > But you never call MPI_Wait on the request
                      you got back from the Isend. So the MPI library
                      will never free the data structures it created.<br class="">
                      ><br class="">
                      > The usual pattern for these non-blocking
                      communications is to allocate an array for the
                      requests of length nsend+nrecv and then do:<br class="">
                      ><br class="">
                      > for i in nsend:<br class="">
                      >     call MPI_Isend(..., req[i], ierr)<br class="">
                      > for j in nrecv:<br class="">
                      >     call MPI_Irecv(..., req[nsend+j], ierr)<br class="">
                      ><br class="">
                      > call MPI_Waitall(req, ..., ierr)<br class="">
                      ><br class="">
                      > I note also that there's no need for the Barrier
                      at the end of the routine: this kind of
                      communication does neighbourwise synchronisation,
                      so there is no need to add (unnecessary) global
                      synchronisation too.<br class="">
                      ><br class="">
                      > As an aside, is there a reason you don't use
                      PETSc's VecScatter to manage this global to local
                      exchange?<br class="">
                      ><br class="">
                      > Cheers,<br class="">
                      ><br class="">
                      > Lawrence<br class="">
                      <br class="">
                    </blockquote>
                  </div>
                </div>
              </blockquote>
              <br class="">
            </div>
          </blockquote>
        </div>
        <br clear="all" class="">
        <div class=""><br class="">
        </div>
        -- <br class="">
        <div dir="ltr" class="gmail_signature">
          <div dir="ltr" class="">
            <div class="">
              <div dir="ltr" class="">
                <div class="">
                  <div dir="ltr" class="">
                    <div class="">What most experimenters take for granted before
                      they begin their experiments is infinitely more
                      interesting than any results to which their
                      experiments lead.<br class="">
                      -- Norbert Wiener</div>
                    <div class=""><br class="">
                    </div>
                    <div class=""><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank" moz-do-not-send="true" class="">https://www.cse.buffalo.edu/~knepley/</a><br class="">
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br class="">
  </div>

</div></blockquote></div><br class=""></body></html>