malloc(): memory corruption:

Dominik Szczerba dominik at itis.ethz.ch
Sun Nov 15 02:24:49 CST 2009


Yes, I have found an error in my matrix...
Thank you all for the useful hints!
Still, I wonder if there are more efficient ways to set up bug
traps to get the backtrace leading to the real problem and not to
the innocent parts... <sigh/>.
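
One cheap trap of this kind, as a rough sketch only: check the matrix
structure right after assembly, before any solver sees it. The thread does
not say what the error in the matrix was, so this assumes a structural
problem in CSR arrays (non-monotone row pointers or an out-of-range column
index). The function csr_check and its parameter names are hypothetical;
they are not part of PETSc or hypre.

```c
#include <stdio.h>

/* Hypothetical helper, not a PETSc or hypre routine.
 * Checks the CSR structure of an nrows x ncols matrix.
 * Assumes rowptr has nrows+1 entries and colidx has rowptr[nrows] entries.
 * Returns 0 if consistent; otherwise reports the first problem on stderr
 * and returns a nonzero code (1: bad header, 2: bad row pointer,
 * 3: column index out of range). */
int csr_check(int nrows, int ncols, const int *rowptr, const int *colidx)
{
    if (nrows < 0 || ncols < 0 || rowptr[0] != 0) {
        fprintf(stderr, "csr_check: bad sizes or rowptr[0] != 0\n");
        return 1;
    }
    for (int i = 0; i < nrows; ++i) {
        /* Row pointers must never decrease. */
        if (rowptr[i + 1] < rowptr[i]) {
            fprintf(stderr, "csr_check: rowptr decreases at row %d\n", i);
            return 2;
        }
        /* Every column index in this row must lie in [0, ncols). */
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k) {
            if (colidx[k] < 0 || colidx[k] >= ncols) {
                fprintf(stderr,
                        "csr_check: row %d has column index %d (ncols = %d)\n",
                        i, colidx[k], ncols);
                return 3;
            }
        }
    }
    return 0;
}
```

Calling a check like this on the raw arrays before handing them to the
solver should make a structural mistake fail at the faulty data, with the
row number in hand, rather than later inside an unrelated allocation.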

With regards,
Dominik

Barry Smith wrote:
>    If you run without the hypre preconditioner and use, say, bjacobi
> instead under valgrind, do you get any valgrind errors?
> 
>     The problem you are having could be due to (1) some memory
> corruption in your code that is messing up hypre or (2) some bug in
> hypre that we don't see with our simple test codes.
> 
>     Barry
> 
> On Nov 14, 2009, at 4:41 PM, Dominik Szczerba wrote:
> 
>> No I am using Hypre built automatically along with petsc...
>> I will try ex10, thanks...
>>
>> Matthew Knepley wrote:
>>> This is already bad. You had an Invalid Read and Invalid Write in your
>>> Hypre. Did you build it yourself? If so, let us build it. If not, please
>>> try your matrix on KSP ex10 and see if you get a crash on 2 procs.
>>>  Thanks,
>>>    Matt
>>> On Sat, Nov 14, 2009 at 3:51 PM, Dominik Szczerba
>>> <dominik at itis.ethz.ch> wrote:
>>>    run only in single, it says things like below - but does not crash.
>>>    Also, the program run with -np 1 does not crash. No clear idea
>>>    though about valgrind's output, please advise if this tells you
>>>    anything...
>>>    Call from NS3T10::createSolverContexts() referenced therein is:
>>>    ierr = KSPCreate(petsc_comm,&kspSchurVelocity);CHKERRQ(ierr);
>>>    ==2605== Conditional jump or move depends on uninitialised value(s)
>>>    ==2605==    at 0x8AE720F: hypre_BoomerAMGSetPlotFileName
>>>    (par_amg.c:2115)
>>>    ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>>    ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate  
>>> (HYPRE_parcsr_amg.c:31)
>>>    ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>>    ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>>    ==2605==    by 0x80E67BB: NS3T10::createSolverContexts()
>>>    (NS3T10mpi.cxx:1980)
>>>    ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>>    ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>>    ==2605==
>>>    ==2605== Conditional jump or move depends on uninitialised value(s)
>>>    ==2605==    at 0x8AE7244: hypre_BoomerAMGSetPlotFileName
>>>    (par_amg.c:2120)
>>>    ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>>    ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate  
>>> (HYPRE_parcsr_amg.c:31)
>>>    ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>>    ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>>    ==2605==    by 0x80E67BB: NS3T10::createSolverContexts()
>>>    (NS3T10mpi.cxx:1980)
>>>    ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>>    ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>>    ==2605==
>>>    ==2605== Conditional jump or move depends on uninitialised value(s)
>>>    ==2605==    at 0x4025C16: strcpy (mc_replace_strmem.c:303)
>>>    ==2605==    by 0x8AE727A: hypre_BoomerAMGSetPlotFileName
>>>    (par_amg.c:2123)
>>>    ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>>    ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate  
>>> (HYPRE_parcsr_amg.c:31)
>>>    ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>>    ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>>    ==2605==    by 0x80E67BB: NS3T10::createSolverContexts()
>>>    (NS3T10mpi.cxx:1980)
>>>    ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>>    ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>>    ==2605==
>>>    ==2605== Conditional jump or move depends on uninitialised value(s)
>>>    ==2605==    at 0x4025C35: strcpy (mc_replace_strmem.c:303)
>>>    ==2605==    by 0x8AE727A: hypre_BoomerAMGSetPlotFileName
>>>    (par_amg.c:2123)
>>>    ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>>    ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate  
>>> (HYPRE_parcsr_amg.c:31)
>>>    ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>>    ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>>    ==2605==    by 0x80E67BB: NS3T10::createSolverContexts()
>>>    (NS3T10mpi.cxx:1980)
>>>    ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>>    ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>>    ==2605==
>>>    Solver contexts created in 2.520000 s
>>>    Starting KSPSolve (0/1)
>>>      0 KSP Residual norm 8.368803253774e-06
>>>    ==2605== Invalid read of size 8
>>>    ==2605==    at 0x8B23B5A: hypre_BoomerAMGCreateS (par_strength.c:223)
>>>    ==2605==    by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630)
>>>    ==2605==    by 0x8AE4A4D: HYPRE_BoomerAMGSetup  
>>> (HYPRE_parcsr_amg.c:58)
>>>    ==2605==    by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134)
>>>    ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>>    ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>>    ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>>    ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*)
>>>    (NS3T10mpi.cxx:3741)
>>>    ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==    by 0x863AC4C: KSPInitialResidual (itres.c:64)
>>>    ==2605==    by 0x85EB09A: KSPSolve_GMRES (gmres.c:241)
>>>    ==2605==  Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd
>>>    ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>>    ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>>    ==2605==    by 0x8B4CA67: hypre_CSRMatrixInitialize  
>>> (csr_matrix.c:91)
>>>    ==2605==    by 0x8B32EC8: hypre_ParCSRMatrixInitialize
>>>    (par_csr_matrix.c:200)
>>>    ==2605==    by 0x8AE0C44: hypre_IJMatrixInitializeParCSR
>>>    (IJMatrix_parcsr.c:272)
>>>    ==2605==    by 0x8ADBE09: HYPRE_IJMatrixInitialize
>>>    (HYPRE_IJMatrix.c:302)
>>>    ==2605==    by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ  
>>> (mhyp.c:174)
>>>    ==2605==    by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131)
>>>    ==2605==    by 0x855A445: PCSetUp_HYPRE (hypre.c:130)
>>>    ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>>    ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>>    ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>>    ==2605==
>>>    ==2605== Invalid write of size 4
>>>    ==2605==    at 0x8B23E0C: hypre_BoomerAMGCreateS (par_strength.c:301)
>>>    ==2605==    by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630)
>>>    ==2605==    by 0x8AE4A4D: HYPRE_BoomerAMGSetup  
>>> (HYPRE_parcsr_amg.c:58)
>>>    ==2605==    by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134)
>>>    ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>>    ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>>    ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>>    ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*)
>>>    (NS3T10mpi.cxx:3741)
>>>    ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==    by 0x863AC4C: KSPInitialResidual (itres.c:64)
>>>    ==2605==    by 0x85EB09A: KSPSolve_GMRES (gmres.c:241)
>>>    ==2605==  Address 0xb12a050 is 0 bytes after a block of size 46,744 alloc'd
>>>    ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>>    ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>>    ==2605==    by 0x8B23980: hypre_BoomerAMGCreateS (par_strength.c:163)
>>>    ==2605==    by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630)
>>>    ==2605==    by 0x8AE4A4D: HYPRE_BoomerAMGSetup  
>>> (HYPRE_parcsr_amg.c:58)
>>>    ==2605==    by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134)
>>>    ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>>    ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>>    ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>>    ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*)
>>>    (NS3T10mpi.cxx:3741)
>>>    ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==
>>>    ...
>>>    ==2605== Invalid read of size 8
>>>    ==2605==    at 0x8B1ACE8: hypre_BoomerAMGRelax (par_relax.c:182)
>>>    ==2605==    by 0x8B1DFBF: hypre_BoomerAMGRelaxIF
>>>    (par_relax_interface.c:110)
>>>    ==2605==    by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386)
>>>    ==2605==    by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252)
>>>    ==2605==    by 0x8AE4A25: HYPRE_BoomerAMGSolve  
>>> (HYPRE_parcsr_amg.c:76)
>>>    ==2605==    by 0x855AAA4: PCApply_HYPRE (hypre.c:172)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==    by 0x8606095: KSPSolve_PREONLY (preonly.c:29)
>>>    ==2605==    by 0x85A85D3: KSPSolve (itfunc.c:385)
>>>    ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*)
>>>    (NS3T10mpi.cxx:3741)
>>>    ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==  Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd
>>>    ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>>    ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>>    ==2605==    by 0x8B4CA67: hypre_CSRMatrixInitialize  
>>> (csr_matrix.c:91)
>>>    ==2605==    by 0x8B32EC8: hypre_ParCSRMatrixInitialize
>>>    (par_csr_matrix.c:200)
>>>    ==2605==    by 0x8AE0C44: hypre_IJMatrixInitializeParCSR
>>>    (IJMatrix_parcsr.c:272)
>>>    ==2605==    by 0x8ADBE09: HYPRE_IJMatrixInitialize
>>>    (HYPRE_IJMatrix.c:302)
>>>    ==2605==    by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ  
>>> (mhyp.c:174)
>>>    ==2605==    by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131)
>>>    ==2605==    by 0x855A445: PCSetUp_HYPRE (hypre.c:130)
>>>    ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>>    ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>>    ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>>    ==2605==
>>>    ...
>>>      0 KSP Residual norm 8.368803253774e-06
>>>    ==2605== Invalid read of size 8
>>>    ==2605==    at 0x8B1ADC0: hypre_BoomerAMGRelax (par_relax.c:196)
>>>    ==2605==    by 0x8B1DFBF: hypre_BoomerAMGRelaxIF
>>>    (par_relax_interface.c:110)
>>>    ==2605==    by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386)
>>>    ==2605==    by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252)
>>>    ==2605==    by 0x8AE4A25: HYPRE_BoomerAMGSolve  
>>> (HYPRE_parcsr_amg.c:76)
>>>    ==2605==    by 0x855AAA4: PCApply_HYPRE (hypre.c:172)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==    by 0x8606095: KSPSolve_PREONLY (preonly.c:29)
>>>    ==2605==    by 0x85A85D3: KSPSolve (itfunc.c:385)
>>>    ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*)
>>>    (NS3T10mpi.cxx:3741)
>>>    ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>>    ==2605==    by 0x862074E: PCApply (precon.c:357)
>>>    ==2605==  Address 0xcded820 is 0 bytes after a block of size 93,488 alloc'd
>>>    ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>>    ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>>    ==2605==    by 0x8B4CA67: hypre_CSRMatrixInitialize  
>>> (csr_matrix.c:91)
>>>    ==2605==    by 0x8B32EC8: hypre_ParCSRMatrixInitialize
>>>    (par_csr_matrix.c:200)
>>>    ==2605==    by 0x8AE0C44: hypre_IJMatrixInitializeParCSR
>>>    (IJMatrix_parcsr.c:272)
>>>    ==2605==    by 0x8ADBE09: HYPRE_IJMatrixInitialize
>>>    (HYPRE_IJMatrix.c:302)
>>>    ==2605==    by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ  
>>> (mhyp.c:174)
>>>    ==2605==    by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131)
>>>    ==2605==    by 0x855A445: PCSetUp_HYPRE (hypre.c:130)
>>>    ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>>    ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>>    ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>>    ==2605==
>>>    Matthew Knepley wrote:
>>>        Try valgrind.
>>>         Matt
>>>        On Sat, Nov 14, 2009 at 3:32 PM, Dominik Szczerba
>>>        <dominik at itis.ethz.ch> wrote:
>>>           Now for something more serious: I get a crash like this one:
>>>           Starting KSPSolve (1/2)
>>>            0 KSP Residual norm 2.964538623545e-06
>>>           *** glibc detected *** /home/domel/build/solve-debug/ns3t10mpi:
>>>           malloc(): memory corruption: 0x09258008 ***
>>>           ======= Backtrace: =========
>>>           /lib/tls/i686/cmov/libc.so.6[0x5f9ff1]
>>>           /lib/tls/i686/cmov/libc.so.6[0x5fcbb3]
>>>           /lib/tls/i686/cmov/libc.so.6(__libc_calloc+0xa9)[0x5fe009]
>>>           /home/domel/build/solve-debug/ns3t10mpi(hypre_CAlloc+0x2c)[0x8b4ea28]
>>>           /home/domel/build/solve-debug/ns3t10mpi(hypre_BoomerAMGCoarsenRuge+0xb5)[0x8af2c7b]
>>>           (and so on)
>>>           gdb invoked as:
>>>           mpiexec -np 2 ..... -on_error_attach_debugger -display :0.0
>>>           does not display any backtrace after the crash.
>>>           Any hints how to debug are highly appreciated.
>>>           Dominik
>>>        --
>>>        What most experimenters take for granted before they begin their
>>>        experiments is infinitely more interesting than any results to
>>>        which their experiments lead.
>>>        -- Norbert Wiener
> 



More information about the petsc-users mailing list