[petsc-users] Memory growth issue
Sanjay Govindjee
s_g at berkeley.edu
Fri May 31 15:10:00 CDT 2019
Yes, the issue is running out of memory on long runs.
Perhaps some clean-up happens later when the memory pressure builds, but
that is a bit non-ideal.
-sanjay
On 5/31/19 12:53 PM, Zhang, Junchao wrote:
> Sanjay,
> I tried PETSc with MPICH and OpenMPI on my MacBook. I
> inserted PetscMemoryGetCurrentUsage/PetscMallocGetCurrentUsage at the
> beginning and end of KSPSolve, then computed the delta and summed it
> over processes. Then I tested
> with src/ts/examples/tutorials/advection-diffusion-reaction/ex5.c
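>
> The instrumentation was roughly equivalent to wrapping each KSPSolve
> call like the sketch below (illustrative only, not the exact code;
> error checking omitted):
>
>   #include <petscksp.h>
>
>   /* Report how much the resident set size and PetscMalloc usage grew
>      during one KSPSolve, summed over all ranks. */
>   static PetscErrorCode KSPSolveWithMemoryDelta(KSP ksp, Vec b, Vec x)
>   {
>     PetscLogDouble rss0, rss1, mal0, mal1, loc[2], sum[2] = {0, 0};
>
>     PetscMemoryGetCurrentUsage(&rss0);
>     PetscMallocGetCurrentUsage(&mal0);
>     KSPSolve(ksp, b, x);
>     PetscMemoryGetCurrentUsage(&rss1);
>     PetscMallocGetCurrentUsage(&mal1);
>     loc[0] = rss1 - rss0;  /* RSS growth on this rank */
>     loc[1] = mal1 - mal0;  /* PetscMalloc growth on this rank */
>     MPI_Reduce(loc, sum, 2, MPI_DOUBLE, MPI_SUM, 0, PETSC_COMM_WORLD);
>     PetscPrintf(PETSC_COMM_WORLD, "RSS Delta=%g, Malloc Delta=%g\n",
>                 sum[0], sum[1]);
>     return 0;
>   }
>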
> With OpenMPI,
> mpirun -n 4 ./ex5 -da_grid_x 128 -da_grid_y 128 -ts_type beuler
> -ts_max_steps 500 > 128.log
> grep -n -v "RSS Delta= 0, Malloc Delta= 0" 128.log
> 1:RSS Delta= 69632, Malloc Delta= 0
> 2:RSS Delta= 69632, Malloc Delta= 0
> 3:RSS Delta= 69632, Malloc Delta= 0
> 4:RSS Delta= 69632, Malloc Delta= 0
> 9:RSS Delta=9.25286e+06, Malloc Delta= 0
> 22:RSS Delta= 49152, Malloc Delta= 0
> 44:RSS Delta= 20480, Malloc Delta= 0
> 53:RSS Delta= 49152, Malloc Delta= 0
> 66:RSS Delta= 4096, Malloc Delta= 0
> 97:RSS Delta= 16384, Malloc Delta= 0
> 119:RSS Delta= 20480, Malloc Delta= 0
> 141:RSS Delta= 53248, Malloc Delta= 0
> 176:RSS Delta= 16384, Malloc Delta= 0
> 308:RSS Delta= 16384, Malloc Delta= 0
> 352:RSS Delta= 16384, Malloc Delta= 0
> 550:RSS Delta= 16384, Malloc Delta= 0
> 572:RSS Delta= 16384, Malloc Delta= 0
> 669:RSS Delta= 40960, Malloc Delta= 0
> 924:RSS Delta= 32768, Malloc Delta= 0
> 1694:RSS Delta= 20480, Malloc Delta= 0
> 2099:RSS Delta= 16384, Malloc Delta= 0
> 2244:RSS Delta= 20480, Malloc Delta= 0
> 3001:RSS Delta= 16384, Malloc Delta= 0
> 5883:RSS Delta= 16384, Malloc Delta= 0
>
> If I increased the grid to 512x512:
> mpirun -n 4 ./ex5 -da_grid_x 512 -da_grid_y 512 -ts_type beuler
> -ts_max_steps 500 -malloc_test >512.log
> grep -n -v "RSS Delta= 0, Malloc Delta= 0" 512.log
> 1:RSS Delta=1.05267e+06, Malloc Delta= 0
> 2:RSS Delta=1.05267e+06, Malloc Delta= 0
> 3:RSS Delta=1.05267e+06, Malloc Delta= 0
> 4:RSS Delta=1.05267e+06, Malloc Delta= 0
> 13:RSS Delta=1.24932e+08, Malloc Delta= 0
>
> So we did see RSS increase in 4k-page sizes after KSPSolve. As long as
> there are no memory leaks, why do you care about it? Is it because you
> run out of memory?
>
> On Thu, May 30, 2019 at 1:59 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
>
> Thanks for the update. So the current conclusions are that
> using the Waitall in your code
>
> 1) solves the memory issue with OpenMPI in your code
>
> 2) does not solve the memory issue with PETSc KSPSolve
>
>     3) MPICH has memory issues both for your code and PETSc KSPSolve,
>     despite the Waitall fix?
>
>     If you literally just comment out the call to KSPSolve() with
>     OpenMPI, is there no growth in memory usage?
>
>
>     Both 2 and 3 are concerning; they indicate possible memory leak bugs
>     in MPICH and that not all MPI resources are freed in KSPSolve().
>
> Junchao, can you please investigate 2 and 3 with, for example, a
> TS example that uses the linear solver (like with -ts_type
> beuler)? Thanks
>
>
> Barry
>
>
>
> > On May 30, 2019, at 1:47 PM, Sanjay Govindjee <s_g at berkeley.edu> wrote:
> >
> > Lawrence,
> > Thanks for taking a look! This is what I had been wondering
> about -- my knowledge of MPI is pretty minimal and
> the origins of the routine were from a programmer we hired a
> decade+ back from NERSC. I'll have to look into
> > VecScatter. It will be great to dispense with our roll-your-own
> routines (we even have our own reduceALL scattered around the code).
> >
> > Interestingly, the MPI_Waitall fix has solved the problem when using
> OpenMPI, but it still persists with MPICH. Graphs attached.
> > I'm going to run with OpenMPI for now (but I guess I really
> still need to figure out what is wrong with MPICH and MPI_Waitall;
> > I'll try Barry's suggestion of
> --download-mpich-configure-arguments="--enable-error-messages=all
> --enable-g" later today and report back).
> >
> > Regarding MPI_Barrier, it was put in due to a problem where some
> processes were finishing up sending and receiving and exiting the
> subroutine
> > before the receiving processes had completed (which resulted in
> data loss, as the buffers are freed after the call to the routine).
> MPI_Barrier was the solution proposed
> > to us. I don't think I can dispense with it, but I will think
> about it some more.
> >
> > I'm not so sure about using MPI_Irecv, as it will require a bit
> of rewriting since right now I process the received
> > data sequentially after each blocking MPI_Recv -- clearly slower
> but easier to code.
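> >
> > If I did switch, I imagine something like the sketch below (placeholder
> > names and buffers, not my actual routine): post all the receives up
> > front and use MPI_Waitany to process each message as it completes, so
> > I can still handle them one at a time. (The sends would then be
> > MPI_Isend followed by a matching MPI_Waitall, as in Lawrence's pattern
> > below.)
> >
> >   #include <mpi.h>
> >   #include <stdlib.h>
> >
> >   /* Sketch only: post all receives, then handle each message as it
> >      completes.  Buffers, counts and source ranks are placeholders. */
> >   void recv_and_process(double **recvbuf, int *counts, int *sources,
> >                         int nrecv, MPI_Comm comm)
> >   {
> >     MPI_Request *reqs = malloc(nrecv * sizeof(MPI_Request));
> >     int i, idx;
> >
> >     for (i = 0; i < nrecv; i++)
> >       MPI_Irecv(recvbuf[i], counts[i], MPI_DOUBLE, sources[i], 0,
> >                 comm, &reqs[i]);
> >
> >     for (i = 0; i < nrecv; i++) {
> >       MPI_Waitany(nrecv, reqs, &idx, MPI_STATUS_IGNORE);
> >       /* process recvbuf[idx] here, just as after the old blocking
> >          MPI_Recv, but now in completion order */
> >     }
> >     free(reqs);
> >   }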
> >
> > Thanks again for the help.
> >
> > -sanjay
> >
> > On 5/30/19 4:48 AM, Lawrence Mitchell wrote:
> >> Hi Sanjay,
> >>
> >>> On 30 May 2019, at 08:58, Sanjay Govindjee via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
> >>>
> >>> The problem seems to persist but with a different signature.
> Graphs attached as before.
> >>>
> >>> Totals with MPICH (NB: single run)
> >>>
> >>> For the CG/Jacobi data_exchange_total = 41,385,984;
> kspsolve_total = 38,289,408
> >>> For the GMRES/BJACOBI data_exchange_total = 41,324,544;
> kspsolve_total = 41,324,544
> >>>
> >>> Just reading the MPI docs I am wondering if I need some sort
> of MPI_Wait/MPI_Waitall before my MPI_Barrier in the data exchange
> routine?
> >>> I would have thought that with the blocking receives and the
> MPI_Barrier everything would have fully completed and been cleaned
> up before
> >>> all processes exited the routine, but perhaps I am wrong on that.
> >>
> >> Skimming the Fortran code you sent, you do:
> >>
> >> for i in ...:
> >> call MPI_Isend(..., req, ierr)
> >>
> >> for i in ...:
> >> call MPI_Recv(..., ierr)
> >>
> >> But you never call MPI_Wait on the request you got back from
> the Isend. So the MPI library will never free the data structures
> it created.
> >>
> >> The usual pattern for these non-blocking communications is to
> allocate an array for the requests of length nsend+nrecv and then do:
> >>
> >> for i in nsend:
> >> call MPI_Isend(..., req[i], ierr)
> >> for j in nrecv:
> >> call MPI_Irecv(..., req[nsend+j], ierr)
> >>
> >> call MPI_Waitall(req, ..., ierr)
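> >>
> >> Filled in concretely, a sketch of that pattern (made-up names, and
> >> send/recv counts assumed equal per neighbour just for brevity) would
> >> look something like:
> >>
> >>   #include <mpi.h>
> >>   #include <stdlib.h>
> >>
> >>   /* Every request created by Isend/Irecv is waited on, so the MPI
> >>      library can release its internal data structures. */
> >>   void exchange(double **sendbuf, double **recvbuf, int *counts,
> >>                 int *neighbours, int n, MPI_Comm comm)
> >>   {
> >>     MPI_Request *reqs = malloc(2 * n * sizeof(MPI_Request));
> >>     int i;
> >>
> >>     for (i = 0; i < n; i++)
> >>       MPI_Isend(sendbuf[i], counts[i], MPI_DOUBLE, neighbours[i], 0,
> >>                 comm, &reqs[i]);
> >>     for (i = 0; i < n; i++)
> >>       MPI_Irecv(recvbuf[i], counts[i], MPI_DOUBLE, neighbours[i], 0,
> >>                 comm, &reqs[n + i]);
> >>
> >>     MPI_Waitall(2 * n, reqs, MPI_STATUSES_IGNORE);
> >>     free(reqs);
> >>   }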
> >>
> >> I note also that there's no need for the Barrier at the end of the
> routine: this kind of communication does neighbourwise
> synchronisation, so there is no need to add (unnecessary) global
> synchronisation too.
> >>
> >> As an aside, is there a reason you don't use PETSc's VecScatter
> to manage this global to local exchange?
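> >>
> >> Roughly, a sketch of the VecScatter version (made-up names; the real
> >> setup depends on how the ghost indices are stored, here assumed to be
> >> an IS of global indices):
> >>
> >>   #include <petscvec.h>
> >>
> >>   /* Build the scatter once, then reuse it every time the local ghost
> >>      values are needed; PETSc handles all the MPI requests. */
> >>   PetscErrorCode setup_and_gather(Vec gvec, IS ghosts, Vec *lvec,
> >>                                   VecScatter *ctx)
> >>   {
> >>     PetscInt n;
> >>
> >>     ISGetLocalSize(ghosts, &n);
> >>     VecCreateSeq(PETSC_COMM_SELF, n, lvec);
> >>     VecScatterCreate(gvec, ghosts, *lvec, NULL, ctx);
> >>     VecScatterBegin(*ctx, gvec, *lvec, INSERT_VALUES, SCATTER_FORWARD);
> >>     VecScatterEnd(*ctx, gvec, *lvec, INSERT_VALUES, SCATTER_FORWARD);
> >>     return 0;
> >>   }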
> >>
> >> Cheers,
> >>
> >> Lawrence
> >
> > <cg_mpichwall.png><cg_wall.png><gmres_mpichwall.png><gmres_wall.png>
>