[petsc-users] Bus Error

Matthew Knepley knepley at gmail.com
Mon Aug 24 10:25:40 CDT 2020


On Mon, Aug 24, 2020 at 11:15 AM Mark Lohry <mlohry at gmail.com> wrote:

>> Do you ever use regular malloc()? PETSc malloc aligns automatically, but
>> the system one does not.
>
>
> Indirectly via new, yes.
>

I would consider replacing those.
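
One way to check whether buffers that come from new actually sit on the boundary a numerical kernel expects is to test the pointer directly, and to request an explicit alignment where it matters. The following is only a minimal C++17 sketch: is_aligned, alloc_doubles, free_doubles and the 64-byte boundary in the usage note are illustrative assumptions, not part of PETSc or Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// True when p is a multiple of alignment (alignment must be a power of two).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: allocate n doubles on an explicit byte boundary
// using the C++17 aligned form of operator new.
inline double* alloc_doubles(std::size_t n, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return static_cast<double*>(
        ::operator new(n * sizeof(double), static_cast<std::align_val_t>(alignment)));
}

// Memory from alloc_doubles must be released with the matching aligned delete.
inline void free_doubles(double* p, std::size_t alignment) {
    ::operator delete(p, static_cast<std::align_val_t>(alignment));
}
```

For example, double* w = alloc_doubles(n, 64) followed by assert(is_aligned(w, 64)) before handing w to a solver, and free_doubles(w, 64) afterwards. The same is_aligned check can be dropped in front of any existing new[] buffer to see what alignment it really has.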

  Thanks,

     Matt


> On Mon, Aug 24, 2020 at 11:10 AM Matthew Knepley <knepley at gmail.com>
> wrote:
>
>> On Mon, Aug 24, 2020 at 10:56 AM Mark Lohry <mlohry at gmail.com> wrote:
>>
>>> Thanks Barry, I'll give -malloc_debug a shot.
>>>
>>>>   I know this is not necessarily a reasonable test but if you run the
>>>> exact same thing twice does it crash at the same location in terms of
>>>> iterations or does it seem to crash eventually "randomly" just after a long
>>>> time?
>>>>
>>>
>>> Crashes after a different number of iterations, seemingly random.
>>>
>>>
>>>>
>>>>   I understand the frustration with this kind of crash: it just
>>>> shouldn't happen, because the same BLAS calls have been made in the same way
>>>> thousands of times, and yet suddenly there is trouble that is very hard to debug.
>>>>
>>>
>>> Eventually makes for a good war story.
>>>
>>> Thinking back, I have seen some disturbing memory behavior that I think
>>> traces back to my use of Eigen, e.g. in the past when running my full test
>>> suite a particular case would fail with NaNs, but if I ran that case in
>>> isolation it would pass. I wonder if some object isn't getting properly aligned
>>> and at some point some kind of corruption occurs?
>>>
>>
>> Do you ever use regular malloc()? PETSc malloc aligns automatically, but
>> the system one does not.
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>> On Mon, Aug 24, 2020 at 10:35 AM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>>
>>>>   Mark,
>>>>
>>>>   OK, I'd generally trust the stock BLAS over OpenBLAS not to fail.
>>>>
>>>>   Since valgrind is not viable, have you tried running the bad case with
>>>> -malloc_debug? It will be a little slower, but not too bad, and it can
>>>> find some memory corruption issues.
>>>>
>>>>   It might be useful to get the stack trace inside the BLAS to see
>>>> exactly where it crashes. If you ./configure with debugging and use
>>>> --download-fblaslapack or --download-f2cblaslapack it will compile the BLAS
>>>> with debugging, but just running a batch job still won't display the stack
>>>> frames inside the BLAS call.
>>>>
>>>>   We have an option, -on_error_attach_debugger, that attaches the
>>>> debugger ONLY when the error is detected, which is useful for longer
>>>> many-rank runs, but it may not play well with batch systems. If you can
>>>> make your run on a non-batch system, this option, along with
>>>> --download-fblaslapack or --download-f2cblaslapack, might be able to get
>>>> the exact stack frames. In the debugger, look at the variables and
>>>> addresses to try to determine how it could have gone wrong.
>>>>
>>>>   I know this is not necessarily a reasonable test but if you run the
>>>> exact same thing twice does it crash at the same location in terms of
>>>> iterations or does it seem to crash eventually "randomly" just after a long
>>>> time?
>>>>
>>>>   I understand the frustration with this kind of crash: it just
>>>> shouldn't happen, because the same BLAS calls have been made in the same way
>>>> thousands of times, and yet suddenly there is trouble that is very hard to debug.
>>>>
>>>>   Barry
>>>>
>>>>
>>>>
>>>>
>>>> On Aug 24, 2020, at 9:15 AM, Mark Lohry <mlohry at gmail.com> wrote:
>>>>
>>>> valgrind: I ran a much smaller case and didn't see any issues in
>>>> valgrind. I'm only seeing this bus error on several hundred cores a few
>>>> hours wallclock in, so it might not be feasible to run that in valgrind.
>>>>
>>>> BLAS: I'm not entirely sure; it's the stock one in PUIAS Linux (a Red
>>>> Hat derivative), libblas.so.3.4.2. I'm going to try with Intel, and if that
>>>> fails, use the OpenBLAS downloaded via PETSc and see if the problem goes away.
>>>>
>>>>
>>>>
>>>> On Mon, Aug 24, 2020 at 9:48 AM Barry Smith <bsmith at petsc.dev> wrote:
>>>>
>>>>>
>>>>>   Mark,
>>>>>
>>>>>    Can you run in valgrind?
>>>>>
>>>>>    Exactly what BLAS are you using?
>>>>>
>>>>>    Barry
>>>>>
>>>>>
>>>>> On Aug 24, 2020, at 7:54 AM, Mark Lohry <mlohry at gmail.com> wrote:
>>>>>
>>>>> Reran with debug mode and got a stack trace for this bus error; it looks
>>>>> like it's happening in BLASgemv (trace pasted below). I did take care of the
>>>>> ISColoring leak mentioned previously, although that was a very small amount
>>>>> of data and I don't think it is relevant here.
>>>>>
>>>>> At this point it's happily run 222 timesteps prior to this, so I'm a
>>>>> little mystified. Any ideas?
>>>>>
>>>>> Thanks,
>>>>> Mark
>>>>>
>>>>>
>>>>> 222 TS dt 0.03 time 6.66
>>>>>     0 SNES Function norm 4.124287265556e+02
>>>>>       0 KSP Residual norm 4.124287265556e+02
>>>>>       1 KSP Residual norm 4.123248052318e+02
>>>>>       2 KSP Residual norm 4.123173350456e+02
>>>>>       3 KSP Residual norm 4.118769044110e+02
>>>>>       4 KSP Residual norm 4.094856150740e+02
>>>>>       5 KSP Residual norm 4.006000788078e+02
>>>>>       6 KSP Residual norm 3.787922969183e+02
>>>>> [clip]
>>>>>     Linear solve converged due to CONVERGED_RTOL iterations 9
>>>>>         Line search: Using full step: fnorm 4.015236590684e+01 gnorm 3.173434863784e+00
>>>>>     2 SNES Function norm 3.173434863784e+00
>>>>>   Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2
>>>>>     0 SNES Function norm 5.842010710080e+02
>>>>>       0 KSP Residual norm 5.842010710080e+02
>>>>>       1 KSP Residual norm 5.840526408234e+02
>>>>>       2 KSP Residual norm 5.840431857354e+02
>>>>>       3 KSP Residual norm 5.834351392302e+02
>>>>>       4 KSP Residual norm 5.800901047861e+02
>>>>>       5 KSP Residual norm 5.675562288567e+02
>>>>>       6 KSP Residual norm 5.366287895681e+02
>>>>>       7 KSP Residual norm 4.725811521866e+02
>>>>> [911]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [911]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
>>>>> [911]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>> [911]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>> [911]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>> [911]PETSC ERROR: likely location of problem given in stack below
>>>>> [911]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>>>>> [911]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>>>> [911]PETSC ERROR:       INSTEAD the line number of the start of the function
>>>>> [911]PETSC ERROR:       is given.
>>>>> [911]PETSC ERROR: [911] BLASgemv line 1393 /home/mlohry/build/external/petsc/src/mat/impls/baij/seq/baijfact.c
>>>>> [911]PETSC ERROR: [911] MatSolve_SeqBAIJ_N_NaturalOrdering line 1378 /home/mlohry/build/external/petsc/src/mat/impls/baij/seq/baijfact.c
>>>>> [911]PETSC ERROR: [911] MatSolve line 3354 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
>>>>> [911]PETSC ERROR: [911] PCApply_ILU line 201 /home/mlohry/build/external/petsc/src/ksp/pc/impls/factor/ilu/ilu.c
>>>>> [911]PETSC ERROR: [911] PCApply line 426 /home/mlohry/build/external/petsc/src/ksp/pc/interface/precon.c
>>>>> [911]PETSC ERROR: [911] KSP_PCApply line 279 /home/mlohry/build/external/petsc/include/petsc/private/kspimpl.h
>>>>> [911]PETSC ERROR: [911] KSPSolve_PREONLY line 16 /home/mlohry/build/external/petsc/src/ksp/ksp/impls/preonly/preonly.c
>>>>> [911]PETSC ERROR: [911] KSPSolve_Private line 590 /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
>>>>> [911]PETSC ERROR: [911] KSPSolve line 848 /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
>>>>> [911]PETSC ERROR: [911] PCApply_ASM line 441 /home/mlohry/build/external/petsc/src/ksp/pc/impls/asm/asm.c
>>>>> [911]PETSC ERROR: [911] PCApply line 426 /home/mlohry/build/external/petsc/src/ksp/pc/interface/precon.c
>>>>> [911]PETSC ERROR: [911] KSP_PCApply line 279 /home/mlohry/build/external/petsc/include/petsc/private/kspimpl.h
>>>>> [911]PETSC ERROR: [911] KSPFGMRESCycle line 108 /home/mlohry/build/external/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
>>>>> [911]PETSC ERROR: [911] KSPSolve_FGMRES line 274 /home/mlohry/build/external/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
>>>>> [911]PETSC ERROR: [911] KSPSolve_Private line 590 /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
>>>>> [911]PETSC ERROR: [911] KSPSolve line 848 /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
>>>>> [911]PETSC ERROR: [911] SNESSolve_NEWTONLS line 144 /home/mlohry/build/external/petsc/src/snes/impls/ls/ls.c
>>>>> [911]PETSC ERROR: [911] SNESSolve line 4403 /home/mlohry/build/external/petsc/src/snes/interface/snes.c
>>>>> [911]PETSC ERROR: [911] TSStep_ARKIMEX line 728 /home/mlohry/build/external/petsc/src/ts/impls/arkimex/arkimex.c
>>>>> [911]PETSC ERROR: [911] TSStep line 3682 /home/mlohry/build/external/petsc/src/ts/interface/ts.c
>>>>> [911]PETSC ERROR: [911] TSSolve line 4005 /home/mlohry/build/external/petsc/src/ts/interface/ts.c
>>>>> [911]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>>>> [911]PETSC ERROR: Signal received
>>>>> [911]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>>>> [911]PETSC ERROR: Petsc Release Version 3.13.3, Jul 01, 2020
>>>>> [911]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n20 by mlohry Sun Aug 23 19:54:21 2020
>>>>> [911]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/usr/local/openmpi/3.1.3/gcc/x8
>>>>> [911]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> MPI_ABORT was invoked on rank 911 in communicator MPI_COMM_WORLD
>>>>>
>>>>> On Wed, Aug 12, 2020 at 8:19 PM Mark Lohry <mlohry at gmail.com> wrote:
>>>>>
>>>>>>>    Perhaps you are calling ISColoringGetIS() and not calling
>>>>>>> ISColoringRestoreIS()?
>>>>>>>
>>>>>>
>>>>>> I have matching ISColoringGet/Restore here, and it's only used prior
>>>>>> to the first iteration so at least it doesn't seem to be growing. At the
>>>>>> bottom I pasted the malloc_view and malloc_debug output from running 1 time
>>>>>> step.
>>>>>>
>>>>>> I'm sort of thinking this might be a red herring -- is it possible
>>>>>> the rank 0 process is chewing up dramatically more memory than others, like
>>>>>> with logging or something? Like I mentioned earlier the total memory usage
>>>>>> is well under the machine limits. I'll spring in some
>>>>>> PetscMemoryGetMaximumUsage logging at every time step and try to get a big
>>>>>> job going again.
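
The per-time-step logging idea above can be sketched without PETSc by asking the operating system for the process's peak resident set size. This is a stand-in that uses POSIX getrusage() rather than PetscMemoryGetMaximumUsage(); peak_rss and log_peak_rss are hypothetical names, and the rank and step arguments are assumed to come from the application.

```cpp
#include <cstdio>
#include <sys/resource.h>

// Peak resident set size of the calling process as reported by getrusage().
// On Linux ru_maxrss is in kilobytes (macOS reports bytes). Returns -1 on failure.
inline long peak_rss() {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
    return ru.ru_maxrss;
}

// Hypothetical per-step logger: prints one line per rank per time step so that
// a rank whose memory grows faster than the others stands out in the job output.
inline void log_peak_rss(int rank, int step) {
    std::printf("[%d] step %d peak RSS %ld\n", rank, step, peak_rss());
}
```

Calling log_peak_rss(rank, step) once per time step on every rank keeps the output small while still showing whether one rank, such as rank 0, uses dramatically more memory than the rest.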
>>>>>>
>>>>>>
>>>>>>
>>>>>>>    Are you using Fortran?
>>>>>>>
>>>>>>
>>>>>> C++
>>>>>>
>>>>>>
>>>>>>
>>>>>> [ 0]1408 bytes PetscSplitReductionCreate() line 63 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>>>>>> [ 0]80 bytes PetscSplitReductionCreate() line 57 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>>>>>> [ 0]16 bytes PetscCommBuildTwoSided_Allreduce() line 169 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/mpits.c
>>>>>> [ 0]16 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]272 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]880 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]960 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]976 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]1040 bytes ISGeneralSetIndices_General() line 578 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>> [ 0]64 bytes ISColoringGetIS() line 266 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/utils/iscoloring.c
>>>>>> [ 0]32 bytes PetscCommDuplicate() line 129 in
>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/objects/tagm.c
>>>>>> [0] Maximum memory PetscMalloc()ed 610153776 maximum size of entire
>>>>>> process 719073280
>>>>>> [0] Memory usage sorted by function
>>>>>> [0] 6 192 DMCoarsenHookAdd()
>>>>>> [0] 2 9984 DMCreate()
>>>>>> [0] 2 128 DMCreate_Shell()
>>>>>> [0] 2 64 DMDSEnlarge_Static()
>>>>>> [0] 1 672 DMKSPCreate()
>>>>>> [0] 3 96 DMRefineHookAdd()
>>>>>> [0] 3 2064 DMSNESCreate()
>>>>>> [0] 4 128 DMSubDomainHookAdd()
>>>>>> [0] 1 768 DMTSCreate()
>>>>>> [0] 2 96 ISColoringCreate()
>>>>>> [0] 8 12608 ISColoringGetIS()
>>>>>> [0] 1 307200 ISConcatenate()
>>>>>> [0] 29 25984 ISCreate()
>>>>>> [0] 25 400 ISCreate_General()
>>>>>> [0] 4 64 ISCreate_Stride()
>>>>>> [0] 20 338016 ISGeneralSetIndices_General()
>>>>>> [0] 3 921600 ISGetIndices_Stride()
>>>>>> [0] 2 307232 ISGlobalToLocalMappingSetUp_Basic()
>>>>>> [0] 1 6144 ISInvertPermutation_General()
>>>>>> [0] 3 308576 ISLocalToGlobalMappingCreate()
>>>>>> [0] 2 32 KSPConvergedDefaultCreate()
>>>>>> [0] 2 2816 KSPCreate()
>>>>>> [0] 1 224 KSPCreate_FGMRES()
>>>>>> [0] 1 8016 KSPGMRESClassicalGramSchmidtOrthogonalization()
>>>>>> [0] 2 16032 KSPSetUp_FGMRES()
>>>>>> [0] 4 16084160 KSPSetUp_GMRES()
>>>>>> [0] 2 36864 MatColoringApply_SL()
>>>>>> [0] 1 656 MatColoringCreate()
>>>>>> [0] 6 17088 MatCreate()
>>>>>> [0] 1 16 MatCreateMFFD_WP()
>>>>>> [0] 1 16 MatCreateSubMatrices_SeqBAIJ()
>>>>>> [0] 1 12288 MatCreateSubMatrix_SeqBAIJ()
>>>>>> [0] 3 32320 MatCreateSubMatrix_SeqBAIJ_Private()
>>>>>> [0] 2 1472 MatCreate_MFFD()
>>>>>> [0] 1 416 MatCreate_SeqAIJ()
>>>>>> [0] 3 864 MatCreate_SeqBAIJ()
>>>>>> [0] 2 416 MatCreate_Shell()
>>>>>> [0] 1 784 MatFDColoringCreate()
>>>>>> [0] 2 12288 MatFDColoringDegreeSequence_Minpack()
>>>>>> [0] 6 30859392 MatFDColoringSetUp_SeqXAIJ()
>>>>>> [0] 3 42512 MatGetColumnIJ_SeqAIJ()
>>>>>> [0] 4 72720 MatGetColumnIJ_SeqBAIJ_Color()
>>>>>> [0] 1 6144 MatGetOrdering_Natural()
>>>>>> [0] 2 36384 MatGetRowIJ_SeqAIJ()
>>>>>> [0] 7 210626000 MatILUFactorSymbolic_SeqBAIJ()
>>>>>> [0] 2 313376 MatIncreaseOverlap_SeqBAIJ()
>>>>>> [0] 2 30740608 MatLUFactorNumeric_SeqBAIJ_N()
>>>>>> [0] 1 6144 MatMarkDiagonal_SeqAIJ()
>>>>>> [0] 1 6144 MatMarkDiagonal_SeqBAIJ()
>>>>>> [0] 8 256 MatRegisterRootName()
>>>>>> [0] 1 6160 MatSeqAIJCheckInode()
>>>>>> [0] 4 115216 MatSeqAIJSetPreallocation_SeqAIJ()
>>>>>> [0] 4 302779424 MatSeqBAIJSetPreallocation_SeqBAIJ()
>>>>>> [0] 13 576 MatSolverTypeRegister()
>>>>>> [0] 1 16 PCASMCreateSubdomains()
>>>>>> [0] 2 1664 PCCreate()
>>>>>> [0] 1 160 PCCreate_ASM()
>>>>>> [0] 1 192 PCCreate_ILU()
>>>>>> [0] 5 307264 PCSetUp_ASM()
>>>>>> [0] 2 416 PetscBTCreate()
>>>>>> [0] 2 3216 PetscClassPerfLogCreate()
>>>>>> [0] 2 1616 PetscClassRegLogCreate()
>>>>>> [0] 2 32 PetscCommBuildTwoSided_Allreduce()
>>>>>> [0] 2 64 PetscCommDuplicate()
>>>>>> [0] 2 1888 PetscDSCreate()
>>>>>> [0] 2 26416 PetscEventPerfLogCreate()
>>>>>> [0] 2 158400 PetscEventPerfLogEnsureSize()
>>>>>> [0] 2 1616 PetscEventRegLogCreate()
>>>>>> [0] 2 9600 PetscEventRegLogRegister()
>>>>>> [0] 8 102400 PetscFreeSpaceGet()
>>>>>> [0] 474 15168 PetscFunctionListAdd_Private()
>>>>>> [0] 2 528 PetscIntStackCreate()
>>>>>> [0] 142 11360 PetscLayoutCreate()
>>>>>> [0] 56 896 PetscLayoutSetUp()
>>>>>> [0] 59 9440 PetscObjectComposedDataIncreaseReal()
>>>>>> [0] 2 576 PetscObjectListAdd()
>>>>>> [0] 33 768 PetscOptionsGetEList()
>>>>>> [0] 1 16 PetscOptionsHelpPrintedCreate()
>>>>>> [0] 1 32 PetscPushSignalHandler()
>>>>>> [0] 7 6944 PetscSFCreate()
>>>>>> [0] 3 432 PetscSFCreate_Basic()
>>>>>> [0] 2 1472 PetscSFLinkCreate()
>>>>>> [0] 11 1229040 PetscSFSetUpRanks()
>>>>>> [0] 7 614512 PetscSFSetUp_Basic()
>>>>>> [0] 4 20096 PetscSegBufferCreate()
>>>>>> [0] 2 1488 PetscSplitReductionCreate()
>>>>>> [0] 2 3008 PetscStageLogCreate()
>>>>>> [0] 1148 23872 PetscStrallocpy()
>>>>>> [0] 6 13056 PetscStrreplace()
>>>>>> [0] 9 3456 PetscTableCreate()
>>>>>> [0] 1 16 PetscViewerASCIIOpen()
>>>>>> [0] 6 96 PetscViewerAndFormatCreate()
>>>>>> [0] 1 752 PetscViewerCreate()
>>>>>> [0] 1 96 PetscViewerCreate_ASCII()
>>>>>> [0] 2 1424 SNESCreate()
>>>>>> [0] 1 16 SNESCreate_NEWTONLS()
>>>>>> [0] 1 1008 SNESLineSearchCreate()
>>>>>> [0] 1 16 SNESLineSearchCreate_BT()
>>>>>> [0] 16 1824 SNESMSRegister()
>>>>>> [0] 46 9056 TSARKIMEXRegister()
>>>>>> [0] 1 1264 TSAdaptCreate()
>>>>>> [0] 8 384 TSBasicSymplecticRegister()
>>>>>> [0] 1 2160 TSCreate()
>>>>>> [0] 1 224 TSCreate_Theta()
>>>>>> [0] 48 5968 TSGLEERegister()
>>>>>> [0] 41 7728 TSRKRegister()
>>>>>> [0] 89 14736 TSRosWRegister()
>>>>>> [0] 71 110192 VecCreate()
>>>>>> [0] 1 307200 VecCreateGhostWithArray()
>>>>>> [0] 123 36874080 VecCreate_MPI_Private()
>>>>>> [0] 7 4300800 VecCreate_Seq()
>>>>>> [0] 8 256 VecCreate_Seq_Private()
>>>>>> [0] 6 400 VecDuplicateVecs_Default()
>>>>>> [0] 3 2352 VecScatterCreate()
>>>>>> [0] 7 1843296 VecScatterSetUp_SF()
>>>>>> [0] 126 2016 VecStashCreate_Private()
>>>>>> [0] 1 3072 mapBlockColoringToJacobian()
>>>>>>
>>>>>> On Wed, Aug 12, 2020 at 4:22 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>>>>
>>>>>>>
>>>>>>>    Yes, there are some PETSc objects or arrays that you are not
>>>>>>> freeing, so they are printed at the end of the run. For small runs this
>>>>>>> is harmless, but if new objects or memory are allocated at each iteration
>>>>>>> and not suitably freed, it will eventually add up.
>>>>>>>
>>>>>>>     Run with -malloc_view (on a small problem with, say, 2 iterations);
>>>>>>> it will print everything allocated and might be helpful.
>>>>>>>
>>>>>>>    Perhaps you are calling ISColoringGetIS() and not calling
>>>>>>> ISColoringRestoreIS()?
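
In C++ a Get/Restore pair like the one asked about above can be tied to scope, so the restore cannot be skipped by an early return or an exception. The following is a generic C++17 sketch: ScopeExit is a hypothetical helper, not a PETSc facility, and the PETSc calls appear only in comments because their exact arguments depend on the PETSc version.

```cpp
#include <utility>

// Runs a stored callable when the guard leaves scope.
template <typename F>
class ScopeExit {
public:
    explicit ScopeExit(F f) : f_(std::move(f)) {}
    ~ScopeExit() { f_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    F f_;
};

// Usage sketch (arguments elided on purpose; check the manual pages):
//   ISColoringGetIS(...);
//   ScopeExit restore([&] { ISColoringRestoreIS(...); });
//   ... use the index sets; the restore runs when `restore` goes out of scope ...
```

The guard is non-copyable so the restore runs exactly once per Get.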
>>>>>>>
>>>>>>>    It is also possible it is a leak in PETSc, but that is unlikely
>>>>>>> since we test for them.
>>>>>>>
>>>>>>>    Are you using Fortran?
>>>>>>>
>>>>>>>   Barry
>>>>>>>
>>>>>>>
>>>>>>> On Aug 12, 2020, at 1:29 PM, Mark Lohry <mlohry at gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks Matt and Barry. At Matt's suggestion I ran a smaller
>>>>>>> representative case with valgrind and didn't see anything alarming (apart
>>>>>>> from a small leak in an older boost version I was using:
>>>>>>> https://github.com/boostorg/serialization/issues/104  although I
>>>>>>> don't think this was causing the issue).
>>>>>>>
>>>>>>> -malloc_debug dumps quite a lot; this is supposed to be empty, right?
>>>>>>> Output pasted below. It looks like the same sequence of calls is repeated 8
>>>>>>> times, which is how many nonlinear solves occurred in this particular run.
>>>>>>> Thoughts?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [ 0]1408 bytes PetscSplitReductionCreate() line 63 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>>>>>>> [ 0]80 bytes PetscSplitReductionCreate() line 57 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>>>>>>> [ 0]16 bytes PetscCommBuildTwoSided_Allreduce() line 169 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/mpits.c
>>>>>>> [ 0]16 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]272 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]880 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]960 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]976 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]1040 bytes ISGeneralSetIndices_General() line 578 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>>>>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>>>>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>>>>>> [ 0]16 bytes ISCreate_General() line 647 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>>>>>> [ 0]896 bytes ISCreate() line 37 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>>>>>> [ 0]64 bytes ISColoringGetIS() line 266 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/utils/iscoloring.c
>>>>>>> [ 0]32 bytes PetscCommDuplicate() line 129 in
>>>>>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/objects/tagm.c
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 12, 2020 at 1:46 PM Barry Smith <bsmith at petsc.dev>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>    Mark,
>>>>>>>>
>>>>>>>>     When valgrind is not feasible (as on many centrally
>>>>>>>> controlled batch systems) you can run PETSc with an extra flag to do some
>>>>>>>> memory error checks:
>>>>>>>>  -malloc_debug
>>>>>>>>
>>>>>>>>  This
>>>>>>>>
>>>>>>>> 1) fills all malloced memory with NaN, so if the code is using
>>>>>>>> uninitialized memory it may be detected, and
>>>>>>>> 2) checks the beginning and end of each allocated memory region for
>>>>>>>> out-of-bounds writes at each malloc and free.
>>>>>>>>
>>>>>>>> It will slow the code down a little, but generally not by a huge
>>>>>>>> amount.
>>>>>>>>
>>>>>>>> It is nowhere near as good as valgrind or other memory corruption
>>>>>>>> tools, but it has the advantage that you can run it anywhere on any size job.
>>>>>>>>
>>>>>>>>
>>>>>>>>   Barry
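
The two checks described above (NaN fill and guard regions around each allocation) can be illustrated with a minimal sketch in C. This is not PETSc's implementation, only an illustration of the general technique; the names dbg_malloc, dbg_check, dbg_free, HEADER_BYTES, GUARD_BYTES and GUARD_PATTERN are hypothetical, and the sketch assumes malloc() returns memory aligned to at least 16 bytes.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_BYTES  16    /* room for the stored length (assumed size) */
#define GUARD_BYTES   16    /* size of each guard region (assumed size) */
#define GUARD_PATTERN 0xAB  /* byte written into the guards (assumed value) */

/* Layout of one allocation:
 *   [length header][front guard][n doubles][back guard]
 */

/* Allocate n doubles, fill them with NaN, and surround them with guards. */
static double *dbg_malloc(size_t n)
{
    size_t user_bytes = n * sizeof(double);
    unsigned char *raw = malloc(HEADER_BYTES + 2 * GUARD_BYTES + user_bytes);
    if (raw == NULL) return NULL;

    memcpy(raw, &n, sizeof n);                              /* remember the length */
    memset(raw + HEADER_BYTES, GUARD_PATTERN, GUARD_BYTES); /* front guard */

    double *user = (double *)(raw + HEADER_BYTES + GUARD_BYTES);
    for (size_t i = 0; i < n; i++) user[i] = NAN;           /* poison with NaN */

    memset((unsigned char *)user + user_bytes, GUARD_PATTERN, GUARD_BYTES); /* back guard */
    return user;
}

/* Return 0 if both guards are intact, -1 if the front guard was
 * overwritten, 1 if the back guard was overwritten. */
static int dbg_check(const double *user)
{
    const unsigned char *front = (const unsigned char *)user - GUARD_BYTES;
    size_t n;
    memcpy(&n, front - HEADER_BYTES, sizeof n);
    const unsigned char *back = (const unsigned char *)user + n * sizeof(double);

    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (front[i] != GUARD_PATTERN) return -1;
        if (back[i] != GUARD_PATTERN) return 1;
    }
    return 0;
}

/* Check the guards one last time, release the block, and report the result. */
static int dbg_free(double *user)
{
    assert(user != NULL);
    int status = dbg_check(user);
    free((unsigned char *)user - GUARD_BYTES - HEADER_BYTES);
    return status;
}
```

Reading an element before writing to it yields NaN, which then propagates through later arithmetic and tends to surface as a visible failure. A write one element past the end lands in the back guard, so the next dbg_check() or dbg_free() on that block returns nonzero. As noted above, this only detects corruption at the next check, not at the moment of the bad write, which is why it is weaker than valgrind.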
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Aug 12, 2020, at 7:46 AM, Matthew Knepley <knepley at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Wed, Aug 12, 2020 at 7:53 AM Mark Lohry <mlohry at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'm getting seemingly random failures of late:
>>>>>>>>> Caught signal number 7 BUS: Bus Error, possibly illegal memory
>>>>>>>>> access
>>>>>>>>>
>>>>>>>>
>>>>>>>> The first thing I would do is run valgrind on as wide an array of
>>>>>>>> tests as you can. This often finds problems
>>>>>>>> even in tests that appear to run completely fine.
>>>>>>>>
>>>>>>>>   Thanks,
>>>>>>>>
>>>>>>>>      Matt
>>>>>>>>
>>>>>>>>
>>>>>>>>> Symptoms:
>>>>>>>>> 1) Seems to only happen (so far) on larger cases, 400-2000 cores
>>>>>>>>> 2) It doesn't happen right away -- this was running happily for
>>>>>>>>> several hours over several hundred time steps with no indication of bad
>>>>>>>>> health in the numerics
>>>>>>>>> 3) At least the total memory consumption seems to be within
>>>>>>>>> bounds, though I'm not sure about individual processes. e.g. slurm here
>>>>>>>>> reported Memory Efficiency: 75.23% of 1.76 TB (180.00 GB/node)
>>>>>>>>> 4) Running the same setup twice, it fails at different points
>>>>>>>>>
>>>>>>>>> Any suggestions on what to look for? This is a bit painful to work
>>>>>>>>> on as I can only reproduce it on large runs and then it's seemingly random.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Mark
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>>> experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>>
>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>