[petsc-users] Memory growth issue
Sanjay Govindjee
s_g at berkeley.edu
Thu May 30 22:54:04 CDT 2019
Hi Junchao,
Thanks for the hints below; they will take some time to absorb, as the
vectors that are being moved around are actually partly PETSc vectors
and partly local process vectors.
Attached is the modified routine that now works (no longer leaks
memory) with OpenMPI.
-sanjay
On 5/30/19 8:41 PM, Zhang, Junchao wrote:
>
> Hi, Sanjay,
> Could you send your modified data exchange code (psetb.F) with
> MPI_Waitall? See other inlined comments below. Thanks.
>
> On Thu, May 30, 2019 at 1:49 PM Sanjay Govindjee via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
>
>     Lawrence,
>   Thanks for taking a look! This is what I had been wondering about --
> my knowledge of MPI is pretty minimal, and the routine originally came
> from a programmer we hired a decade+ back from NERSC. I'll have to
> look into VecScatter. It will be great to dispense with our
> roll-your-own routines (we even have our own reduceALL scattered
> around the code).
>
> PETSc's VecScatter has a very simple interface and you should
> definitely go with it. With VecScatter, you can think in terms of
> familiar vectors and indices instead of low-level MPI_Send/Recv.
> Besides that, PETSc has optimized VecScatter so that communication
> is efficient.
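> A minimal sketch of that pattern in Fortran (illustrative only; the
> vector, index-set, and context names below are placeholders, not FEAP
> variables):
>
>       ! assumes petsc/finclude/petscvec.h and "use petscvec"
>       Vec            xglobal, xlocal
>       IS             ixglobal, ixlocal
>       VecScatter     ctx
>       PetscErrorCode ierr
>
>       ! build the scatter once from index sets that say which global
>       ! equations each process needs locally
>       call VecScatterCreate(xglobal, ixglobal, xlocal, ixlocal,
>      &                      ctx, ierr)
>       ! move the data (global -> ghosted local)
>       call VecScatterBegin(ctx, xglobal, xlocal, INSERT_VALUES,
>      &                     SCATTER_FORWARD, ierr)
>       call VecScatterEnd(ctx, xglobal, xlocal, INSERT_VALUES,
>      &                   SCATTER_FORWARD, ierr)
>       call VecScatterDestroy(ctx, ierr)
>
> The scatter context can be created once and reused for every solve.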
>
>
> Interestingly, MPI_Waitall has solved the problem when using OpenMPI,
> but it still persists with MPICH. Graphs attached.
> I'm going to run with OpenMPI for now (but I guess I really still
> need to figure out what is wrong with MPICH and Waitall; I'll try
> Barry's suggestion of
> --download-mpich-configure-arguments="--enable-error-messages=all
> --enable-g" later today and report back).
>
> Regarding MPI_Barrier, it was put in due to a problem where some
> processes were finishing their sends and receives and exiting the
> subroutine before the receiving processes had completed (which
> resulted in data loss, as the buffers are freed after the call to
> the routine). MPI_Barrier was the solution proposed to us. I don't
> think I can dispense with it, but I will think about it some more.
>
> After MPI_Send(), or after MPI_Isend(..,req) and MPI_Wait(req), you
> can safely free the send buffer without worrying that the receive has
> not yet completed. MPI guarantees the receiver will get the data, for
> example through internal buffering.
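> For example (a minimal sketch; sdata, n, dest, and tag are
> placeholders):
>
>       call MPI_Isend(sdata, n, MPI_DOUBLE_PRECISION, dest, tag,
>      &               MPI_COMM_WORLD, req, ierr)
>       ! ... other work can overlap with the send ...
>       call MPI_Wait(req, MPI_STATUS_IGNORE, ierr)
>       ! sdata may now be reused or deallocated; the message is either
>       ! delivered or buffered inside MPI
>       deallocate(sdata)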
>
>
> I'm not so sure about using MPI_Irecv, as it will require a bit of
> rewriting since right now I process the received data sequentially
> after each blocking MPI_Recv -- clearly slower but easier to code.
>
> Thanks again for the help.
>
> -sanjay
>
> On 5/30/19 4:48 AM, Lawrence Mitchell wrote:
> > Hi Sanjay,
> >
> >> On 30 May 2019, at 08:58, Sanjay Govindjee via petsc-users
> >> <petsc-users at mcs.anl.gov> wrote:
> >>
> >> The problem seems to persist but with a different signature.
> >> Graphs attached as before.
> >>
> >> Totals with MPICH (NB: single run)
> >>
> >> For the CG/Jacobi      data_exchange_total = 41,385,984;  kspsolve_total = 38,289,408
> >> For the GMRES/BJACOBI  data_exchange_total = 41,324,544;  kspsolve_total = 41,324,544
> >>
> >> Just reading the MPI docs I am wondering if I need some sort of
> >> MPI_Wait/MPI_Waitall before my MPI_Barrier in the data exchange
> >> routine?
> >> I would have thought that with the blocking receives and the
> >> MPI_Barrier everything would have fully completed and been cleaned
> >> up before all processes exited the routine, but perhaps I am wrong
> >> on that.
> >
> > Skimming the Fortran code you sent, you do:
> >
> >     for i in ...:
> >         call MPI_Isend(..., req, ierr)
> >
> >     for i in ...:
> >         call MPI_Recv(..., ierr)
> >
> > But you never call MPI_Wait on the request you got back from the
> > Isend. So the MPI library will never free the data structures it
> > created.
> >
> > The usual pattern for these non-blocking communications is to
> > allocate an array for the requests of length nsend+nrecv and then do:
> >
> >     for i in nsend:
> >         call MPI_Isend(..., req[i], ierr)
> >     for j in nrecv:
> >         call MPI_Irecv(..., req[nsend+j], ierr)
> >
> >     call MPI_Waitall(req, ..., ierr)
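> >
> > In Fortran that looks roughly like the following sketch (the counts,
> > destinations, and buffer layout are placeholders):
> >
> >       integer req(nsend+nrecv)
> >
> >       do i = 1, nsend
> >         call MPI_Isend(sbuf(1,i), scount(i), MPI_DOUBLE_PRECISION,
> >      &                 sdest(i), tag, MPI_COMM_WORLD, req(i), ierr)
> >       end do ! i
> >       do j = 1, nrecv
> >         call MPI_Irecv(rbuf(1,j), rcount(j), MPI_DOUBLE_PRECISION,
> >      &                 rsrc(j), tag, MPI_COMM_WORLD, req(nsend+j),
> >      &                 ierr)
> >       end do ! j
> >
> >       call MPI_Waitall(nsend+nrecv, req, MPI_STATUSES_IGNORE, ierr)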
> >
> > I note also that there's no need for the Barrier at the end of the
> > routine: this kind of communication does neighbourwise
> > synchronisation, so there is no need to add (unnecessary) global
> > synchronisation on top.
> >
> > As an aside, is there a reason you don't use PETSc's VecScatter
> > to manage this global-to-local exchange?
> >
> > Cheers,
> >
> > Lawrence
>
-------------- next part --------------
!$Id:$
      subroutine psetb(b,getp,getv,senp,senv,eq, ndf, rdatabuf,sdatabuf)
! * * F E A P * * A Finite Element Analysis Program
!.... Copyright (c) 1984-2017: Regents of the University of California
! All rights reserved
!-----[--.----+----.----+----.-----------------------------------------]
! Modification log Date (dd/mm/year)
! Original version 01/11/2006
! 1. Revise send/receive data add barrier 24/11/2006
! 2. Correct send/receive and add error messages 16/03/2007
! 3. Change 'include/finclude' to 'finclude' 23/01/2009
! 4. Remove common 'pfeapa' (values in 'setups') 05/02/2009
! 5. finclude -> petsc/finclude 12/05/2016
! 6. Update for PETSc 3.8.3 28/02/2018
! 7. Change 'id' to 'eq' 23/05/2019
!-----[--.----+----.----+----.-----------------------------------------]
! Purpose: Transfer PETSC vector to local arrays including ghost
! node data via MPI messaging
! Inputs:
! getp(ntasks) - Pointer array for getting ghost data
! getv(sum(getp)) - Local node numbers for getting ghost data
! senp(ntasks) - Pointer array for sending ghost data
! senv(sum(senp)) - Local node numbers to send out as ghost data
! eq(ndf,numnp) - Local equation numbers
! ndf - dofs per node
! rdatabuf(*) - receive communication array
! sdatabuf(*) - send communication array
! Outputs:
! b(neq) - Local solution vector
!-----[--.----+----.----+----.-----------------------------------------]
#     include <petsc/finclude/petscsys.h>
      use petscsys
      implicit none
#     include "cdata.h"
#     include "iofile.h"
#     include "pfeapb.h"
#     include "setups.h"

      integer ndf
      integer i, j, k, lnodei, eqi, soff, rbuf,sbuf,tbuf, idesp
      integer getp(*),getv(*),senp(*),senv(*)
      integer eq(ndf,*)
      real*8  b(*), rdatabuf(*), sdatabuf(*)
      integer usolve_msg, req(ntasks),reqcnt
      parameter (usolve_msg=10)

!     Petsc values

      PetscErrorCode ierr
      integer msg_stat(MPI_STATUS_SIZE)
!     Sending Data Asynchronously

      soff   = 0
      idesp  = 0
      req(:) = 0
      reqcnt = 0
      do i = 1, ntasks
        if (senp(i) .gt. 0) then
          sbuf = soff
          do j = 1, senp(i)
            lnodei = senv(j + idesp)
            do k = 1, ndf
              eqi = eq(k,lnodei)
              if (eqi .gt. 0) then
                sbuf = sbuf + 1
                sdatabuf(sbuf) = b(eqi)
              endif
            end do ! k
          end do ! j
          idesp  = idesp + senp(i)
          sbuf   = sbuf - soff
          reqcnt = reqcnt + 1
          call MPI_Isend( sdatabuf(soff+1), sbuf, MPI_DOUBLE_PRECISION,
     &                    i-1, usolve_msg, MPI_COMM_WORLD, req(reqcnt),
     &                    ierr)

!         Report send error
          if(ierr.ne.0) then
            write(iow,*) ' -->> Send Error[',rank+1,'->',i,']',ierr
            write( *,*) ' -->> Send Error[',rank+1,'->',i,']',ierr
          endif
          soff = soff + sbuf
        endif
      end do ! i
!     Receiving Data in blocking mode

      idesp = 0
      do i = 1, ntasks
        if (getp(i) .gt. 0) then

!         Determine receive length
          tbuf = getp(i)*ndf
          call MPI_Recv( rdatabuf, tbuf, MPI_DOUBLE_PRECISION, i-1,
     &                   usolve_msg, MPI_COMM_WORLD, msg_stat, ierr)
          if(ierr.ne.0) then
            write(iow,*) 'Recv Error[',i,'->',rank+1,']',ierr
            write( *,*) 'Recv Error[',i,'->',rank+1,']',ierr
          endif
          rbuf = 0
          do j = 1, getp(i)
            lnodei = getv(j + idesp)
            do k = 1, ndf
              eqi = eq(k,lnodei)
              if (eqi .gt. 0 ) then
                rbuf = rbuf + 1
                b( eqi ) = rdatabuf( rbuf )
              endif
            end do ! k
          end do ! j
          idesp = idesp + getp(i)
        endif
      end do ! i
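!     Wait for all non-blocking sends to complete so that sdatabuf can
!     be safely reused or freed after this routine returns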
      call MPI_Waitall(reqcnt, req, MPI_STATUSES_IGNORE, ierr)

      call MPI_Barrier(MPI_COMM_WORLD, ierr)

      end