[petsc-dev] Valgrind defect: memory leak with PetscCommDuplicate?

Barry Smith bsmith at mcs.anl.gov
Fri Feb 6 15:35:53 CST 2015


   I ran PETSc master with OpenMPI 1.6 using your test example with MPI_COMM_WORLD and got only the memory leaks after my name below.
If I change it to do an MPI_Comm_dup() I get an additional one:

1,192 (112 direct, 1,080 indirect) bytes in 1 blocks are definitely lost in loss record 38 of 38
==11995==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11995==    by 0x91F9BFA: opal_obj_new (opal_object.h:467)
==11995==    by 0x91F9C71: ompi_attr_hash_init (attribute.h:189)
==11995==    by 0x91FB446: set_value (attribute.c:1120)
==11995==    by 0x91FA381: ompi_attr_set_c (attribute.c:672)
==11995==    by 0x922D4E0: PMPI_Attr_put (pattr_put.c:58)
==11995==    by 0x4F59FDD: PetscCommDuplicate (tagm.c:162)
==11995==    by 0x4F5FA31: PetscHeaderCreate_Private (inherit.c:59)
==11995==    by 0x57B563E: MatCreate (gcreate.c:62)
==11995==    by 0x57C2C77: matcreate_ (gcreatef.c:58)
==11995==    by 0x401C79: MAIN__ (ex1f.F:124)
==11995==    by 0x4024F3: main (ex1f.F:320)

but if I properly call MPI_Comm_free() on that communicator before PetscFinalize(), this leak goes away.
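
For reference, a minimal sketch of the dup/free pattern being described (the matrix sizes and variable names here are illustrative, not taken from the test example):

      program main
      implicit none
#include <finclude/petscsys.h>
#include <finclude/petscmat.h>

      integer        myComm
      Mat            A
      PetscErrorCode ierr

      call PetscInitialize(PETSC_NULL_CHARACTER,ierr)

!  Duplicate a communicator and hand the duplicate to PETSc
      call MPI_Comm_dup(MPI_COMM_SELF,myComm,ierr)

      call MatCreate(myComm,A,ierr)
      call MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,3,3,ierr)
      call MatSetFromOptions(A,ierr)
      call MatSetUp(A,ierr)
      call MatDestroy(A,ierr)

!  Free the duplicated communicator before PetscFinalize() so the
!  attribute storage that PetscCommDuplicate() attached to it with
!  MPI_Attr_put() is released; omitting this call is what produced
!  the extra valgrind record above
      call MPI_Comm_free(myComm,ierr)

      call PetscFinalize(ierr)
      end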

When I run any version under MPICH 3.1.3 there are no memory leaks.




==11423== 1 bytes in 1 blocks are definitely lost in loss record 1 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0xC325AC4: ???
==11423==    by 0x99B1BFE: opal_db_base_store (db_base_fns.c:49)
==11423==    by 0x9297DA4: ompi_rte_db_store (rte_orte_module.c:213)
==11423==    by 0x9218AB5: ompi_modex_send_string (ompi_module_exchange.c:119)
==11423==    by 0x921511F: ompi_mpi_init (ompi_mpi_init.c:511)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)
==11423== 
==11423== 3 bytes in 1 blocks are definitely lost in loss record 2 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0x6C86D81: strdup (strdup.c:43)
==11423==    by 0xC32582D: ???
==11423==    by 0x99B1BFE: opal_db_base_store (db_base_fns.c:49)
==11423==    by 0xA09A90E: orte_util_decode_nodemap (nidmap.c:446)
==11423==    by 0xA099C8C: orte_util_nidmap_init (nidmap.c:127)
==11423==    by 0xBB17403: ???
==11423==    by 0xA07BEAC: orte_init (orte_init.c:148)
==11423==    by 0x9214F75: ompi_mpi_init (ompi_mpi_init.c:464)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)
==11423== 
==11423== 3 bytes in 1 blocks are definitely lost in loss record 3 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0x6C86D81: strdup (strdup.c:43)
==11423==    by 0xC32582D: ???
==11423==    by 0x99B1BFE: opal_db_base_store (db_base_fns.c:49)
==11423==    by 0xA09AA31: orte_util_decode_nodemap (nidmap.c:469)
==11423==    by 0xA099C8C: orte_util_nidmap_init (nidmap.c:127)
==11423==    by 0xBB17403: ???
==11423==    by 0xA07BEAC: orte_init (orte_init.c:148)
==11423==    by 0x9214F75: ompi_mpi_init (ompi_mpi_init.c:464)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)
==11423== 
==11423== 5 bytes in 1 blocks are definitely lost in loss record 5 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0x6C86D81: strdup (strdup.c:43)
==11423==    by 0xC32582D: ???
==11423==    by 0x99B1BFE: opal_db_base_store (db_base_fns.c:49)
==11423==    by 0xA09C53D: orte_util_decode_pidmap (nidmap.c:1034)
==11423==    by 0xA099D28: orte_util_nidmap_init (nidmap.c:141)
==11423==    by 0xBB17403: ???
==11423==    by 0xA07BEAC: orte_init (orte_init.c:148)
==11423==    by 0x9214F75: ompi_mpi_init (ompi_mpi_init.c:464)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)
==11423== 
==11423== 5 bytes in 1 blocks are definitely lost in loss record 6 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0x6C86D81: strdup (strdup.c:43)
==11423==    by 0xC32582D: ???
==11423==    by 0x99B1BFE: opal_db_base_store (db_base_fns.c:49)
==11423==    by 0xA0AA624: orte_ess_base_proc_binding (ess_base_fns.c:298)
==11423==    by 0xBB1743F: ???
==11423==    by 0xA07BEAC: orte_init (orte_init.c:148)
==11423==    by 0x9214F75: ompi_mpi_init (ompi_mpi_init.c:464)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)
==11423== 
==11423== 24 bytes in 1 blocks are definitely lost in loss record 9 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0xC325AC4: ???
==11423==    by 0x99B1BFE: opal_db_base_store (db_base_fns.c:49)
==11423==    by 0x9297DA4: ompi_rte_db_store (rte_orte_module.c:213)
==11423==    by 0x9218953: ompi_modex_send (ompi_module_exchange.c:49)
==11423==    by 0xEDC7ECD: ???
==11423==    by 0xEDC8107: ???
==11423==    by 0x9279321: mca_btl_base_select (btl_base_select.c:111)
==11423==    by 0xE168357: ???
==11423==    by 0x9278796: mca_bml_base_init (bml_base_init.c:69)
==11423==    by 0xFA8E3C7: ???
==11423==    by 0x9295BAA: mca_pml_base_select (pml_base_select.c:128)
==11423==    by 0x9215359: ompi_mpi_init (ompi_mpi_init.c:604)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)
==11423== 
==11423== 25 bytes in 2 blocks are definitely lost in loss record 10 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0x996ECB7: opal_dss_copy_value (dss_copy.c:317)
==11423==    by 0x996E51F: opal_dss_copy (dss_copy.c:42)
==11423==    by 0xC32630E: ???
==11423==    by 0x99B203D: opal_db_base_fetch_multiple (db_base_fns.c:203)
==11423==    by 0xA0AD3CB: orte_grpcomm_base_pack_modex_entries (grpcomm_base_modex.c:308)
==11423==    by 0xA0AD0C0: orte_grpcomm_base_modex (grpcomm_base_modex.c:191)
==11423==    by 0x99B621D: event_process_active_single_queue (event.c:1367)
==11423==    by 0x99B6491: event_process_active (event.c:1437)
==11423==    by 0x99B6B0C: opal_libevent2021_event_base_loop (event.c:1645)
==11423==    by 0xA0A8119: orte_progress_thread_engine (ess_base_std_app.c:436)
==11423==    by 0x8480E99: start_thread (pthread_create.c:308)
==11423== 
==11423== 25 bytes in 2 blocks are definitely lost in loss record 11 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0x9976A34: opal_dss_unpack_value (dss_unpack.c:975)
==11423==    by 0x9972B07: opal_dss_unpack_buffer (dss_unpack.c:126)
==11423==    by 0x9972A29: opal_dss_unpack (dss_unpack.c:90)
==11423==    by 0xA0AD1D9: orte_grpcomm_base_store_modex (grpcomm_base_modex.c:248)
==11423==    by 0xA0ADD20: app_recv (grpcomm_base_receive.c:203)
==11423==    by 0xA0D2CF6: orte_rml_base_process_msg (rml_base_msg_handlers.c:172)
==11423==    by 0x99B621D: event_process_active_single_queue (event.c:1367)
==11423==    by 0x99B6491: event_process_active (event.c:1437)
==11423==    by 0x99B6B0C: opal_libevent2021_event_base_loop (event.c:1645)
==11423==    by 0xA0A8119: orte_progress_thread_engine (ess_base_std_app.c:436)
==11423==    by 0x8480E99: start_thread (pthread_create.c:308)
==11423== 
==11423== 376 (232 direct, 144 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 35
==11423==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11423==    by 0xE1661E8: ???
==11423==    by 0xE166A39: ???
==11423==    by 0xFA8CCD5: ???
==11423==    by 0x921581B: ompi_mpi_init (ompi_mpi_init.c:771)
==11423==    by 0x92454F0: PMPI_Init (pinit.c:84)
==11423==    by 0x69EC8F9: PMPI_INIT (pinit_f.c:82)
==11423==    by 0x4FA92E2: petscinitialize_ (zstart.c:297)
==11423==    by 0x401AFF: MAIN__ (ex1f.F:102)
==11423==    by 0x402481: main (ex1f.F:318)


> On Feb 6, 2015, at 1:17 PM, Brendan Kochunas <bkochuna at umich.edu> wrote:
> 
> Hi Barry/Boyce,
> 
> I did a bit more digging and was able to reproduce this defect using a slightly modified version of ex2f.F located at: http://www.mcs.anl.gov/petsc/petsc-3.3/src/ksp/ksp/examples/tutorials/ex2f.F
> 
> I have pasted this modified version below. Essentially the modification replaces PETSC_COMM_WORLD with a duplicate comm of MPI_COMM_SELF in each of the Mat/Vec/KSPCreate calls. (tested with 1 mpi proc)
> 
> I'll note that the defect was NOT reported if a duplicate of PETSC_COMM_WORLD was used, or if MPI_COMM_WORLD or a duplicate of it was used.
> 
> It would be worthwhile to test this modified example against MPICH to confirm that OpenMPI is the culprit in this case.
> 
> I have had similar experiences between MPICH and OpenMPI, so I would not be surprised in the least if OpenMPI is the source of the leak.
> 
> Thanks,
> -Brendan
> 
> !
> !  Description: Solves a linear system in parallel with KSP (Fortran code).
> !               Also shows how to set a user-defined monitoring routine.
> !
> !  Program usage: mpiexec -n <procs> ex2f [-help] [all PETSc options]
> !
> !/*T
> !  Concepts: KSP^basic parallel example
> !  Concepts: KSP^setting a user-defined monitoring routine
> !  Processors: n
> !T*/
> !
> ! -----------------------------------------------------------------------
> 
>      program main
>      implicit none
> 
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !                    Include files
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !
> !  This program uses CPP for preprocessing, as indicated by the use of
> !  PETSc include files in the directory petsc/include/finclude. This
> !  convention enables use of the CPP preprocessor, which allows the use
> !  of the #include statements that define PETSc objects and variables.
> !
> !  Use of the conventional Fortran include statements is also supported.
> !  In this case, the PETSc include files are located in the directory
> !  petsc/include/foldinclude.
> !
> !  Since one must be very careful to include each file no more than once
> !  in a Fortran routine, application programmers must explicitly list
> !  each file needed for the various PETSc components within their
> !  program (unlike the C/C++ interface).
> !
> !  See the Fortran section of the PETSc users manual for details.
> !
> !  The following include statements are required for KSP Fortran programs:
> !     petscsys.h       - base PETSc routines
> !     petscvec.h    - vectors
> !     petscmat.h    - matrices
> !     petscpc.h     - preconditioners
> !     petscksp.h    - Krylov subspace methods
> !  Additional include statements may be needed if using additional
> !  PETSc routines in a Fortran program, e.g.,
> !     petscviewer.h - viewers
> !     petscis.h     - index sets
> !
> #include <finclude/petscsys.h>
> #include <finclude/petscvec.h>
> #include <finclude/petscmat.h>
> #include <finclude/petscpc.h>
> #include <finclude/petscksp.h>
> !
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !                   Variable declarations
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !
> !  Variables:
> !     ksp      - linear solver (Krylov subspace method) context
> !     pc       - preconditioner context
> !     x, b, u  - approx solution, right-hand-side, exact solution vectors
> !     A        - matrix that defines linear system
> !     its      - iterations for convergence
> !     norm     - norm of error in solution
> !     rctx     - random number generator context
> !
> !  Note that vectors are declared as PETSc "Vec" objects.  These vectors
> !  are mathematical objects that contain more than just an array of
> !  double precision numbers. I.e., vectors in PETSc are not just
> !        double precision x(*).
> !  However, local vector data can be easily accessed via VecGetArray().
> !  See the Fortran section of the PETSc users manual for details.
> !
>      integer testComm
>      double precision  norm
>      PetscInt  i,j,II,JJ,m,n,its
>      PetscInt  Istart,Iend,ione
>      PetscErrorCode ierr
>      PetscMPIInt     rank,size
>      PetscBool   flg
>      PetscScalar v,one,neg_one
>      Vec         x,b,u
>      Mat         A
>      KSP         ksp
>      PetscRandom rctx
> 
> !  These variables are not currently used.
> !      PC          pc
> !      PCType      ptype
> !      double precision tol
> 
> 
> !  Note: Any user-defined Fortran routines (such as MyKSPMonitor)
> !  MUST be declared as external.
> 
>      external MyKSPMonitor,MyKSPConverged
> 
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !                 Beginning of program
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
>      call PetscInitialize(PETSC_NULL_CHARACTER,ierr)
>      m = 3
>      n = 3
>      one  = 1.0
>      neg_one = -1.0
>      ione    = 1
>      call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-m',m,flg,ierr)
>      call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-n',n,flg,ierr)
>      call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
>      call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr)
> !      call MPI_Comm_dup(MPI_COMM_SELF,testComm,ierr)
>      testComm=MPI_COMM_SELF
> 
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !      Compute the matrix and right-hand-side vector that define
> !      the linear system, Ax = b.
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
> !  Create parallel matrix, specifying only its global dimensions.
> !  When using MatCreate(), the matrix format can be specified at
> !  runtime. Also, the parallel partitioning of the matrix is
> !  determined by PETSc at runtime.
> 
>      call MatCreate(testComm,A,ierr)
>      call MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,m*n,m*n,ierr)
>      call MatSetFromOptions(A,ierr)
>      call MatSetUp(A,ierr)
> 
> !  Currently, all PETSc parallel matrix formats are partitioned by
> !  contiguous chunks of rows across the processors.  Determine which
> !  rows of the matrix are locally owned.
> 
>      call MatGetOwnershipRange(A,Istart,Iend,ierr)
> 
> !  Set matrix elements for the 2-D, five-point stencil in parallel.
> !   - Each processor needs to insert only elements that it owns
> !     locally (but any non-local elements will be sent to the
> !     appropriate processor during matrix assembly).
> !   - Always specify global row and columns of matrix entries.
> !   - Note that MatSetValues() uses 0-based row and column numbers
> !     in Fortran as well as in C.
> 
> !     Note: this uses the less common natural ordering that orders first
> !     all the unknowns for x = h then for x = 2h etc.; hence you see JJ = II +- n
> !     instead of JJ = II +- m as you might expect. The more standard ordering
> !     would first do all variables for y = h, then y = 2h etc.
> 
>      do 10, II=Istart,Iend-1
>        v = -1.0
>        i = II/n
>        j = II - i*n
>        if (i.gt.0) then
>          JJ = II - n
>          call MatSetValues(A,ione,II,ione,JJ,v,INSERT_VALUES,ierr)
>        endif
>        if (i.lt.m-1) then
>          JJ = II + n
>          call MatSetValues(A,ione,II,ione,JJ,v,INSERT_VALUES,ierr)
>        endif
>        if (j.gt.0) then
>          JJ = II - 1
>          call MatSetValues(A,ione,II,ione,JJ,v,INSERT_VALUES,ierr)
>        endif
>        if (j.lt.n-1) then
>          JJ = II + 1
>          call MatSetValues(A,ione,II,ione,JJ,v,INSERT_VALUES,ierr)
>        endif
>        v = 4.0
>        call  MatSetValues(A,ione,II,ione,II,v,INSERT_VALUES,ierr)
> 10   continue
> 
> !  Assemble matrix, using the 2-step process:
> !       MatAssemblyBegin(), MatAssemblyEnd()
> !  Computations can be done while messages are in transition,
> !  by placing code between these two statements.
> 
>      call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>      call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
> 
> !  Create parallel vectors.
> !   - Here, the parallel partitioning of the vector is determined by
> !     PETSc at runtime.  We could also specify the local dimensions
> !     if desired -- or use the more general routine VecCreate().
> !   - When solving a linear system, the vectors and matrices MUST
> !     be partitioned accordingly.  PETSc automatically generates
> !     appropriately partitioned matrices and vectors when MatCreate()
> !     and VecCreate() are used with the same communicator.
> !   - Note: We form 1 vector from scratch and then duplicate as needed.
> 
>      call VecCreateMPI(testComm,PETSC_DECIDE,m*n,u,ierr)
>      call VecSetFromOptions(u,ierr)
>      call VecDuplicate(u,b,ierr)
>      call VecDuplicate(b,x,ierr)
> 
> !  Set exact solution; then compute right-hand-side vector.
> !  By default we use an exact solution of a vector with all
> !  elements of 1.0;  Alternatively, using the runtime option
> !  -random_sol forms a solution vector with random components.
> 
>      call PetscOptionsHasName(PETSC_NULL_CHARACTER,                    &
>     &             "-random_exact_sol",flg,ierr)
>      if (flg) then
>         call PetscRandomCreate(testComm,rctx,ierr)
>         call PetscRandomSetFromOptions(rctx,ierr)
>         call VecSetRandom(u,rctx,ierr)
>         call PetscRandomDestroy(rctx,ierr)
>      else
>         call VecSet(u,one,ierr)
>      endif
>      call MatMult(A,u,b,ierr)
> 
> !  View the exact solution vector if desired
> 
>      call PetscOptionsHasName(PETSC_NULL_CHARACTER,                    &
>     &             "-view_exact_sol",flg,ierr)
>      if (flg) then
>         call VecView(u,PETSC_VIEWER_STDOUT_WORLD,ierr)
>      endif
> 
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !         Create the linear solver and set various options
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
> !  Create linear solver context
> 
>      call KSPCreate(testComm,ksp,ierr)
> 
> !  Set operators. Here the matrix that defines the linear system
> !  also serves as the preconditioning matrix.
> 
>      call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr)
> 
> !  Set linear solver defaults for this problem (optional).
> !   - By extracting the KSP and PC contexts from the KSP context,
> !     we can then directly call any KSP and PC routines
> !     to set various options.
> !   - The following four statements are optional; all of these
> !     parameters could alternatively be specified at runtime via
> !     KSPSetFromOptions(). All of these defaults can be
> !     overridden at runtime, as indicated below.
> 
> !     We comment out this section of code since the Jacobi
> !     preconditioner is not a good general default.
> 
> !      call KSPGetPC(ksp,pc,ierr)
> !      ptype = PCJACOBI
> !      call PCSetType(pc,ptype,ierr)
> !      tol = 1.e-7
> !      call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_DOUBLE_PRECISION,
> !     & PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr)
> 
> !  Set user-defined monitoring routine if desired
> 
>      call PetscOptionsHasName(PETSC_NULL_CHARACTER,'-my_ksp_monitor',  &
>     &                    flg,ierr)
>      if (flg) then
>        call KSPMonitorSet(ksp,MyKSPMonitor,PETSC_NULL_OBJECT,          &
>     &                     PETSC_NULL_FUNCTION,ierr)
>      endif
> 
> 
> !  Set runtime options, e.g.,
> !      -ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
> !  These options will override those specified above as long as
> !  KSPSetFromOptions() is called _after_ any other customization
> !  routines.
> 
>      call KSPSetFromOptions(ksp,ierr)
> 
> !  Set convergence test routine if desired
> 
>      call PetscOptionsHasName(PETSC_NULL_CHARACTER,                    &
>     &     '-my_ksp_convergence',flg,ierr)
>      if (flg) then
>        call KSPSetConvergenceTest(ksp,MyKSPConverged,                  &
>     &          PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr)
>      endif
> !
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !                      Solve the linear system
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
>      call KSPSolve(ksp,b,x,ierr)
> 
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !                     Check solution and clean up
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
> !  Check the error
>      call VecAXPY(x,neg_one,u,ierr)
>      call VecNorm(x,NORM_2,norm,ierr)
>      call KSPGetIterationNumber(ksp,its,ierr)
>      if (rank .eq. 0) then
>        if (norm .gt. 1.e-12) then
>           write(6,100) norm,its
>        else
>           write(6,110) its
>        endif
>      endif
>  100 format('Norm of error ',e11.4,' iterations ',i5)
>  110 format('Norm of error < 1.e-12,iterations ',i5)
> 
> !  Free work space.  All PETSc objects should be destroyed when they
> !  are no longer needed.
> 
>      call KSPDestroy(ksp,ierr)
>      call VecDestroy(u,ierr)
>      call VecDestroy(x,ierr)
>      call VecDestroy(b,ierr)
>      call MatDestroy(A,ierr)
> 
> !  Always call PetscFinalize() before exiting a program.  This routine
> !    - finalizes the PETSc libraries as well as MPI
> !    - provides summary and diagnostic information if certain runtime
> !      options are chosen (e.g., -log_summary).  See PetscFinalize()
> !      manpage for more information.
> 
>      call PetscFinalize(ierr)
>      end
> 
> ! --------------------------------------------------------------
> !
> !  MyKSPMonitor - This is a user-defined routine for monitoring
> !  the KSP iterative solvers.
> !
> !  Input Parameters:
> !    ksp   - iterative context
> !    n     - iteration number
> !    rnorm - 2-norm (preconditioned) residual value (may be estimated)
> !    dummy - optional user-defined monitor context (unused here)
> !
>      subroutine MyKSPMonitor(ksp,n,rnorm,dummy,ierr)
> 
>      implicit none
> 
> #include <finclude/petscsys.h>
> #include <finclude/petscvec.h>
> #include <finclude/petscksp.h>
> 
>      KSP              ksp
>      Vec              x
>      PetscErrorCode ierr
>      PetscInt n,dummy
>      PetscMPIInt rank
>      double precision rnorm
> 
> !  Build the solution vector
> 
>      call KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr)
> 
> !  Write the solution vector and residual norm to stdout
> !   - Note that the parallel viewer PETSC_VIEWER_STDOUT_WORLD
> !     handles data from multiple processors so that the
> !     output is not jumbled.
> 
>      call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
>      if (rank .eq. 0) write(6,100) n
>      call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr)
>      if (rank .eq. 0) write(6,200) n,rnorm
> 
> 100  format('iteration ',i5,' solution vector:')
> 200  format('iteration ',i5,' residual norm ',e11.4)
>      ierr = 0
>      end
> 
> ! --------------------------------------------------------------
> !
> !  MyKSPConverged - This is a user-defined routine for testing
> !  convergence of the KSP iterative solvers.
> !
> !  Input Parameters:
> !    ksp   - iterative context
> !    n     - iteration number
> !    rnorm - 2-norm (preconditioned) residual value (may be estimated)
> !    dummy - optional user-defined monitor context (unused here)
> !
>      subroutine MyKSPConverged(ksp,n,rnorm,flag,dummy,ierr)
> 
>      implicit none
> 
> #include <finclude/petscsys.h>
> #include <finclude/petscvec.h>
> #include <finclude/petscksp.h>
> 
>      KSP              ksp
>      PetscErrorCode ierr
>      PetscInt n,dummy
>      KSPConvergedReason flag
>      double precision rnorm
> 
>      if (rnorm .le. .05) then
>        flag = 1
>      else
>        flag = 0
>      endif
>      ierr = 0
> 
>      end
> 
> 
> 
> On 2/6/2015 2:13 PM, Boyce Griffith wrote:
>>> On Feb 6, 2015, at 2:04 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>> 
>>> 
>>>  If you don't see any memory leaks directly from PETSc routines then it is a problem in OpenMPI.
>> OpenMPI intentionally does stuff that is not Valgrind clean:
>> 
>> http://www.open-mpi.org/faq/?category=debugging#valgrind_clean
>> 
>> I've personally had mixed success with the OpenMPI-provided suppression file in the past, but I haven't tried it in a while.
>> 
>> -- Boyce
>> 
>>>  We've found that MPICH seems to have fewer memory leaks than OpenMPI.
>>> 
>>>   Barry
>>> 
>>>> On Feb 6, 2015, at 8:52 AM, Brendan Kochunas <bkochuna at umich.edu> wrote:
>>>> 
>>>> Hi, we are trying to clear valgrind defects from our code, and presently valgrind is reporting memory leaks like the following:
>>>> 
>>>> ==2884==    at 0x4A07EB7: malloc (vg_replace_malloc.c:296)
>>>> ==2884==    by 0x8519874: set_value.isra.0.part.1 (in /gcc-4.6.1/toolset/openmpi-1.4.3/lib/libmpi.so.0.0.2)
>>>> ==2884==    by 0x8547E4D: PMPI_Attr_put (in /gcc-4.6.1/toolset/openmpi-1.4.3/lib/libmpi.so.0.0.2)
>>>> ==2884==    by 0x113A153: PetscCommDuplicate
>>>> ==2884==    by 0x113BFA3: PetscHeaderCreate_Private
>>>> ==2884==    by 0x129ADC6: MatCreate
>>>> ==2884==    by 0x123C355: MatMPIAIJSetPreallocation_MPIAIJ
>>>> ==2884==    by 0x124DAB4: MatMPIAIJSetPreallocation
>>>> ==2884==    by 0x12549B0: MatSetUp_MPIAIJ
>>>> ==2884==    by 0x1191186: MatSetUp
>>>> ==2884==    by 0x10D9333: matsetup_
>>>> 
>>>> and...
>>>> 
>>>> ==2884==    at 0x4A07EB7: malloc (vg_replace_malloc.c:296)
>>>> ==2884==    by 0x8519874: set_value.isra.0.part.1 (in /gcc-4.6.1/toolset/openmpi-1.4.3/lib/libmpi.so.0.0.2)
>>>> ==2884==    by 0x8547E4D: PMPI_Attr_put (in /gcc-4.6.1/toolset/openmpi-1.4.3/lib/libmpi.so.0.0.2)
>>>> ==2884==    by 0x113A22D: PetscCommDuplicate
>>>> ==2884==    by 0x113BFA3: PetscHeaderCreate_Private
>>>> ==2884==    by 0x10ED42B: KSPCreate
>>>> ==2884==    by 0x10DA55C: kspcreate_
>>>> 
>>>> Is the development team aware of any memory leaks that may be coming from PetscCommDuplicate as it may be used in the call stack shown above?
>>>> 
>>>> The version of PETSc we are linking with is 3.3 patch 4. And this is built on OpenMPI v. 1.4.3.
>>>> 
>>>> We are trying to determine if the leak is due to:
>>>> 1. Our code's usage of PETSc
>>>> 2. The actual PETSc library
>>>> 3. PETSc's usage of MPI
>>>> 4. The OpenMPI library that PETSc was built against
>>>> 
>>>> Any assistance in pinpointing the culprit, or suggestions for particular tests (e.g., a PETSc example) worth trying to identify the root issue, would be appreciated.
>>>> 
>>>> Thanks in advance!
>>>> -Brendan
>>>> 
> 



