[petsc-users] Memory growth issue

Sanjay Govindjee s_g at berkeley.edu
Thu May 30 02:01:18 CDT 2019


I put calls to PetscMemoryGetCurrentUsage() around KSPSolve and around
my data exchange routine.  The problem is clearly mostly in my data
exchange routine.
Attached are graphs of the change in memory for each call.  Most calls
show zero change, but at regular intervals the memory grows after the
data exchange; much less so with the KSPSolve calls (and there mostly
on the first few calls).
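
For reference, the measurement is just a before/after sample of the
resident set size around each call.  Below is a minimal, self-contained
sketch of that pattern (not the FEAP code: the program name, the array
size, and the allocatable array standing in for the instrumented
KSPSolve/data-exchange call are all placeholders).  It also illustrates
the demand-paging point Barry makes in the quoted message below: the
resident size grows when pages are first written, not when the memory
is allocated.

      program    memprobe
#     include   <petsc/finclude/petscsys.h>
      use        petscsys
      implicit   none

      PetscLogDouble        mem0, mem1, mem2
      PetscErrorCode        ierr
      real*8, allocatable:: work(:)

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)

!     Baseline resident set size (bytes)

      call PetscMemoryGetCurrentUsage(mem0, ierr)

!     Allocate ~80 MB but do not touch it yet

      allocate(work(10000000))
      call PetscMemoryGetCurrentUsage(mem1, ierr)

!     Write every entry so the pages are actually mapped

      work = 0.0d0
      call PetscMemoryGetCurrentUsage(mem2, ierr)

      write(*,*) 'Change after allocate (bytes):', mem1 - mem0
      write(*,*) 'Change after touching (bytes):', mem2 - mem1

      deallocate(work)
      call PetscFinalize(ierr)

      end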

For the CG/JACOBI run:     data_exchange_total = 21,311,488; kspsolve_total = 2,625,536
For the GMRES/BJACOBI run: data_exchange_total = 6,619,136;  kspsolve_total = 54,403,072 (dominated by the initial calls)

I will try switching MPI implementations to see if anything changes;
right now my configure uses --download-openmpi.
I've also attached the data exchange routine in case there is something 
obviously wrong.

NB: Graphs/Data are from just one run each.

-sanjay

On 5/29/19 10:17 PM, Smith, Barry F. wrote:
>     This is indeed worrisome.
>
>      Would it be possible to put PetscMemoryGetCurrentUsage() around each call to KSPSolve() and each call to your data exchange? See if at each step they increase?
>
>      One thing to be aware of with "max resident set size" is that it measures the number of pages that have been set up, not the amount of memory allocated. So if, for example, you allocate a very large array but don't actually read or write that memory until later in the code, it won't appear in the "resident set size" until you read or write it (because Unix doesn't set up pages until it needs to).
>
>     You should also try another MPI. Both OpenMPI and MPICH can be installed with brew, or you can use --download-mpich or --download-openmpi, to see if the MPI implementation is making a difference.
>
>      For now I would focus on the PETSc-only solvers to eliminate one variable from the equation; once that is understood you can go back to the question of the memory management of the other solvers.
>
>    Barry
>
>
>> On May 29, 2019, at 11:51 PM, Sanjay Govindjee via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>
>> I am trying to track down a memory issue with my code; apologies in advance for the longish message.
>>
>> I am solving an FEA problem with a number of load steps involving about 3000
>> right-hand-side and tangent assemblies and solves.  The program is mainly Fortran, with a C memory allocator.
>>
>> When I run my code in strictly serial mode (no PETSc or MPI routines) the memory stays constant over the whole run.
>>
>> When I run it in parallel mode with the PETSc solvers and num_processes=1, the memory (max resident set size) also stays constant:
>>
>> PetscMalloc = 28,976, ProgramNativeMalloc = constant, Resident Size = 24,854,528 (constant) [CG/JACOBI]
>>
>> [PetscMalloc and Resident Size as reported by PetscMallocGetCurrentUsage and PetscMemoryGetCurrentUsage (and summed across processes as needed);
>> ProgramNativeMalloc as reported by the program's memory allocator.]
>>
>> When I run it in parallel mode with the PETSc solvers and num_processes=2, the resident memory grows steadily during the run:
>>
>> PetscMalloc = 3,039,072 (constant), ProgramNativeMalloc = constant, Resident Size = 24,698,880 (start) -> 31,313,920 (finish) [CG/JACOBI]
>>
>> When I run it in parallel mode with the PETSc solvers and num_processes=4, the resident memory grows steadily during the run:
>>
>> PetscMalloc = 3,307,888 (constant), ProgramNativeMalloc = 1,427,584 (constant), Resident Size = 45,801,472 (start) -> 70,787,072 (finish)  [CG/JACOBI]
>> PetscMalloc = 5,903,808 (constant), ProgramNativeMalloc = 1,427,584 (constant), Resident Size = 52,076,544 (start) -> 112,410,624 (finish) [GMRES/BJACOBI]
>> PetscMalloc = 3,188,944 (constant), ProgramNativeMalloc = 1,427,584 (constant), Resident Size = 381,480,960 (start) -> 712,798,208 (finish) [SUPERLU]
>> PetscMalloc = 6,539,408 (constant), ProgramNativeMalloc = 1,427,584 (constant), Resident Size = 278,671,360 (start) -> 591,048,704 (finish) [MUMPS]
>>
>> The memory growth feels alarming but maybe I do not understand the values in ru_maxrss from getrusage().
>>
>> My box (MacBook Pro) has a broken Valgrind, so I need to get to a system with a functional one; notwithstanding, the code has always been Valgrind clean.
>> There are no Fortran pointers or allocatable arrays in the part of the code being used.  The program's C memory allocator keeps track of
>> itself, so I do not see that the problem is there.  The PETSc malloc is also steady.
>>
>> Other random hints:
>>
>> 1) If I comment out the call to KSPSolve and the call to my MPI data-exchange routine (which passes solution values between processes after each solve
>> using MPI_Isend, MPI_Recv, and MPI_BARRIER), the memory growth essentially goes away.
>>
>> 2) If I comment out the call to my MPI data-exchange routine but leave the call to KSPSolve the problem remains but is substantially reduced
>> for CG/JACOBI, and is marginally reduced for the GMRES/BJACOBI, SUPERLU, and MUMPS runs.
>>
>> 3) If I comment out the call to KSPSolve but leave the call to my MPI data-exchange routine, the problem remains.
>>
>> Any suggestions/hints on where to look would be great.
>>
>> -sanjay
>>
>>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cg.png
Type: image/png
Size: 52635 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190530/a320faa1/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gmres.png
Type: image/png
Size: 51046 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190530/a320faa1/attachment-0003.png>
-------------- next part --------------
!$Id:$
      subroutine psetb(b,getp,getv,senp,senv,eq, ndf, rdatabuf,sdatabuf)

!      * * F E A P * * A Finite Element Analysis Program

!....  Copyright (c) 1984-2017: Regents of the University of California
!                               All rights reserved

!-----[--.----+----.----+----.-----------------------------------------]
!     Modification log                                Date (dd/mm/year)
!       Original version                                    01/11/2006
!       1. Revise send/receive data add barrier             24/11/2006
!       2. Correct send/receive and add error messages      16/03/2007
!       3. Change 'include/finclude' to 'finclude'          23/01/2009
!       4. Remove common 'pfeapa' (values in 'setups')      05/02/2009
!       5. finclude -> petsc/finclude                       12/05/2016
!       6. Update for PETSc 3.8.3                           28/02/2018
!       7. Change 'id' to 'eq'                              23/05/2019
!-----[--.----+----.----+----.-----------------------------------------]
!     Purpose:  Transfer PETSC vector to local arrays including ghost
!               node data via MPI messaging

!     Inputs:
!        getp(ntasks)    - Pointer array for getting ghost data
!        getv(sum(getp)) - Local node numbers for getting ghost data
!        senp(ntasks)    - Pointer array for sending ghost data
!        senv(sum(senp)) - Local node numbers to send out as ghost data
!        eq(ndf,numnp)   - Local equation numbers
!        ndf             - dofs per node
!        rdatabuf(*)     - receive communication array
!        sdatabuf(*)     - send    communication array

!     Outputs:

!        b(neq) - Local solution vector
!-----[--.----+----.----+----.-----------------------------------------]
#     include   <petsc/finclude/petscsys.h>
      use                       petscsys
      implicit   none


#     include   "cdata.h"
#     include   "iofile.h"
#     include   "pfeapb.h"
#     include   "setups.h"

      integer    ndf
      integer    i, j, k, lnodei, eqi, soff, rbuf,sbuf,tbuf, idesp
      integer    getp(*),getv(*),senp(*),senv(*)
      integer    eq(ndf,*)
      real*8     b(*), rdatabuf(*), sdatabuf(*)

      integer    usolve_msg, req
      parameter (usolve_msg=10)

!     Petsc values

      PetscErrorCode ierr

      integer        msg_stat(MPI_STATUS_SIZE)

!     Sending Data Asynchronously

      soff  = 0
      idesp = 0
      do i = 1, ntasks

        if (senp(i) .gt. 0) then
          sbuf   = soff
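
!         Pack solution values at the active equations of the local
!         nodes that task i needs as ghost data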
          do j = 1, senp(i)
            lnodei = senv(j + idesp)
            do k = 1, ndf
              eqi = eq(k,lnodei)
              if (eqi .gt. 0) then
                sbuf           = sbuf + 1
                sdatabuf(sbuf) = b(eqi)
              endif
            end do ! k
          end do ! j
          idesp = idesp + senp(i)
          sbuf  = sbuf  - soff
          call MPI_Isend( sdatabuf(soff+1), sbuf, MPI_DOUBLE_PRECISION,
     &                    i-1, usolve_msg, MPI_COMM_WORLD, req, ierr )

!         Report send error

          if(ierr.ne.0) then
            write(iow,*) ' -->> Send Error[',rank+1,'->',i,']',ierr
            write(  *,*) ' -->> Send Error[',rank+1,'->',i,']',ierr
          endif

          soff = soff + sbuf
        endif

      end do ! i

!     Receiving Data in blocking mode

      idesp = 0
      do i = 1, ntasks
        if (getp(i) .gt. 0) then

!         Determine receive length

          tbuf = getp(i)*ndf
          call MPI_Recv( rdatabuf, tbuf, MPI_DOUBLE_PRECISION, i-1,
     &                   usolve_msg, MPI_COMM_WORLD,msg_stat, ierr)
          if(ierr.ne.0) then
             write(iow,*) 'Recv Error[',i,'->',rank+1,']',ierr
             write(  *,*) 'Recv Error[',i,'->',rank+1,']',ierr
          endif
          rbuf = 0
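
!         Unpack the received ghost values into the local solution
!         vector b at the corresponding equation numbers
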
          do j = 1, getp(i)
            lnodei = getv(j + idesp)
            do k = 1, ndf
              eqi = eq(k,lnodei)
              if (eqi .gt. 0 ) then
                rbuf     = rbuf + 1
                b( eqi ) = rdatabuf( rbuf )
              endif
            end do ! k
          end do ! j
          idesp = idesp + getp(i)
        endif
      end do ! i

      call MPI_BARRIER(MPI_COMM_WORLD, ierr)


      end

