[petsc-users] Bus Error
Mark Lohry
mlohry at gmail.com
Mon Aug 24 09:15:25 CDT 2020
valgrind: I ran a much smaller case and didn't see any issues in valgrind.
I'm only seeing this bus error on several hundred cores, a few hours of
wallclock time in, so it might not be feasible to run that under valgrind.
blas: I'm not entirely sure. It's the stock one in PUIAS Linux (a Red Hat
derivative), libblas.so.3.4.2. I'm going to try with Intel, and if that
fails, use the OpenBLAS downloaded via PETSc and see if the problem goes away.
On Mon, Aug 24, 2020 at 9:48 AM Barry Smith <bsmith at petsc.dev> wrote:
>
> Mark,
>
> Can you run in valgrind?
>
> Exactly what BLAS are you using?
>
> Barry
>
>
> On Aug 24, 2020, at 7:54 AM, Mark Lohry <mlohry at gmail.com> wrote:
>
> Reran in debug mode and got a stack trace for this bus error. It looks like
> it's happening in BLASgemv; see the trace pasted below. I did take care of the
> ISColoring leak mentioned previously, although that was a very small amount
> of data and I don't think it is relevant here.
>
> At this point it had happily run 222 timesteps before the error, so I'm a
> little mystified. Any ideas?
>
> Thanks,
> Mark
>
>
> 222 TS dt 0.03 time 6.66
> 0 SNES Function norm 4.124287265556e+02
> 0 KSP Residual norm 4.124287265556e+02
> 1 KSP Residual norm 4.123248052318e+02
> 2 KSP Residual norm 4.123173350456e+02
> 3 KSP Residual norm 4.118769044110e+02
> 4 KSP Residual norm 4.094856150740e+02
> 5 KSP Residual norm 4.006000788078e+02
> 6 KSP Residual norm 3.787922969183e+02
> [clip]
> Linear solve converged due to CONVERGED_RTOL iterations 9
> Line search: Using full step: fnorm 4.015236590684e+01 gnorm
> 3.173434863784e+00
> 2 SNES Function norm 3.173434863784e+00
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2
> 0 SNES Function norm 5.842010710080e+02
> 0 KSP Residual norm 5.842010710080e+02
> 1 KSP Residual norm 5.840526408234e+02
> 2 KSP Residual norm 5.840431857354e+02
> 3 KSP Residual norm 5.834351392302e+02
> 4 KSP Residual norm 5.800901047861e+02
> 5 KSP Residual norm 5.675562288567e+02
> 6 KSP Residual norm 5.366287895681e+02
> 7 KSP Residual norm 4.725811521866e+02
> [911]PETSC ERROR:
> ------------------------------------------------------------------------
> [911]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal
> memory access
> [911]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> [911]PETSC ERROR: or see
> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [911]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> OS X to find memory corruption errors
> [911]PETSC ERROR: likely location of problem given in stack below
> [911]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [911]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> [911]PETSC ERROR: INSTEAD the line number of the start of the
> function
> [911]PETSC ERROR: is given.
> [911]PETSC ERROR: [911] BLASgemv line 1393
> /home/mlohry/build/external/petsc/src/mat/impls/baij/seq/baijfact.c
> [911]PETSC ERROR: [911] MatSolve_SeqBAIJ_N_NaturalOrdering line 1378
> /home/mlohry/build/external/petsc/src/mat/impls/baij/seq/baijfact.c
> [911]PETSC ERROR: [911] MatSolve line 3354
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> [911]PETSC ERROR: [911] PCApply_ILU line 201
> /home/mlohry/build/external/petsc/src/ksp/pc/impls/factor/ilu/ilu.c
> [911]PETSC ERROR: [911] PCApply line 426
> /home/mlohry/build/external/petsc/src/ksp/pc/interface/precon.c
> [911]PETSC ERROR: [911] KSP_PCApply line 279
> /home/mlohry/build/external/petsc/include/petsc/private/kspimpl.h
> [911]PETSC ERROR: [911] KSPSolve_PREONLY line 16
> /home/mlohry/build/external/petsc/src/ksp/ksp/impls/preonly/preonly.c
> [911]PETSC ERROR: [911] KSPSolve_Private line 590
> /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
> [911]PETSC ERROR: [911] KSPSolve line 848
> /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
> [911]PETSC ERROR: [911] PCApply_ASM line 441
> /home/mlohry/build/external/petsc/src/ksp/pc/impls/asm/asm.c
> [911]PETSC ERROR: [911] PCApply line 426
> /home/mlohry/build/external/petsc/src/ksp/pc/interface/precon.c
> [911]PETSC ERROR: [911] KSP_PCApply line 279
> /home/mlohry/build/external/petsc/include/petsc/private/kspimpl.h
> [911]PETSC ERROR: [911] KSPFGMRESCycle line 108
> /home/mlohry/build/external/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> [911]PETSC ERROR: [911] KSPSolve_FGMRES line 274
> /home/mlohry/build/external/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> [911]PETSC ERROR: [911] KSPSolve_Private line 590
> /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
> [911]PETSC ERROR: [911] KSPSolve line 848
> /home/mlohry/build/external/petsc/src/ksp/ksp/interface/itfunc.c
> [911]PETSC ERROR: [911] SNESSolve_NEWTONLS line 144
> /home/mlohry/build/external/petsc/src/snes/impls/ls/ls.c
> [911]PETSC ERROR: [911] SNESSolve line 4403
> /home/mlohry/build/external/petsc/src/snes/interface/snes.c
> [911]PETSC ERROR: [911] TSStep_ARKIMEX line 728
> /home/mlohry/build/external/petsc/src/ts/impls/arkimex/arkimex.c
> [911]PETSC ERROR: [911] TSStep line 3682
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> [911]PETSC ERROR: [911] TSSolve line 4005
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> [911]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [911]PETSC ERROR: Signal received
> [911]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [911]PETSC ERROR: Petsc Release Version 3.13.3, Jul 01, 2020
> [911]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n20 by
> mlohry Sun Aug 23 19:54:21 2020
> [911]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/usr/local/openmpi/3.1.3/gcc/x8
> [911]PETSC ERROR: #1 User provided function() line 0 in unknown file
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 911 in communicator MPI_COMM_WORLD
>
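When a trace like the one above comes from one rank out of several hundred, it can help to pull the rank, signal, and frame list out of the combined job log mechanically. The following is a minimal Python sketch, not part of PETSc: the function name `summarize_petsc_crash` and both regular expressions are assumptions based only on the line shapes visible in this trace, and the sample lines are re-joined here because the mail client wrapped them.

```python
import re

# Patterns assume the line shapes seen in the trace above; other PETSc
# versions may format these lines differently.
SIGNAL_RE = re.compile(r"\[(\d+)\]PETSC ERROR: Caught signal number (\d+) (\w+)")
FRAME_RE = re.compile(r"\[(\d+)\]PETSC ERROR: \[\d+\] (\w+) line (\d+)")


def summarize_petsc_crash(log_text):
    """Return {rank: {"signal": (number, name), "frames": [(function, line), ...]}}."""
    crashes = {}
    for line in log_text.splitlines():
        sig = SIGNAL_RE.search(line)
        if sig:
            rank = int(sig.group(1))
            entry = crashes.setdefault(rank, {"signal": None, "frames": []})
            entry["signal"] = (int(sig.group(2)), sig.group(3))
            continue
        frame = FRAME_RE.search(line)
        if frame:
            rank = int(frame.group(1))
            entry = crashes.setdefault(rank, {"signal": None, "frames": []})
            entry["frames"].append((frame.group(2), int(frame.group(3))))
    return crashes


# Sample lines copied from the trace above, with wrapped lines re-joined.
sample = """\
[911]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
[911]PETSC ERROR: [911] BLASgemv line 1393 /home/mlohry/build/external/petsc/src/mat/impls/baij/seq/baijfact.c
[911]PETSC ERROR: [911] MatSolve_SeqBAIJ_N_NaturalOrdering line 1378 /home/mlohry/build/external/petsc/src/mat/impls/baij/seq/baijfact.c
[911]PETSC ERROR: [911] MatSolve line 3354 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
"""

crashes = summarize_petsc_crash(sample)
for rank, info in sorted(crashes.items()):
    print(rank, info["signal"], [name for name, _ in info["frames"]])
```

Run over the logs of several failed jobs, a summary like this would show whether the same rank or the same innermost frame recurs. As PETSc's own message notes, the line numbers are those of the start of each function, not the exact failing line.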
> On Wed, Aug 12, 2020 at 8:19 PM Mark Lohry <mlohry at gmail.com> wrote:
>
>>> Perhaps you are calling ISColoringGetIS() and not calling
>>> ISColoringRestoreIS()?
>>>
>>
>> I have matching ISColoringGet/Restore here, and it's only used prior to
>> the first iteration so at least it doesn't seem to be growing. At the
>> bottom I pasted the malloc_view and malloc_debug output from running 1 time
>> step.
>>
>> I'm sort of thinking this might be a red herring -- is it possible the
>> rank 0 process is chewing up dramatically more memory than others, like
>> with logging or something? Like I mentioned earlier the total memory usage
>> is well under the machine limits. I'll spring in some
>> PetscMemoryGetMaximumUsage logging at every time step and try to get a big
>> job going again.
>>
>>
>>
>>> Are you using Fortran?
>>>
>>
>> C++
>>
>>
>>
>> [ 0]1408 bytes PetscSplitReductionCreate() line 63 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>> [ 0]80 bytes PetscSplitReductionCreate() line 57 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>> [ 0]16 bytes PetscCommBuildTwoSided_Allreduce() line 169 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/mpits.c
>> [ 0]16 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]272 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]880 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]960 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]976 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]1040 bytes ISGeneralSetIndices_General() line 578 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>> [ 0]16 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]32 bytes PetscStrallocpy() line 187 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>> [ 0]16 bytes ISCreate_General() line 647 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>> [ 0]896 bytes ISCreate() line 37 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>> [ 0]64 bytes ISColoringGetIS() line 266 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/utils/iscoloring.c
>> [ 0]32 bytes PetscCommDuplicate() line 129 in
>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/objects/tagm.c
>> [0] Maximum memory PetscMalloc()ed 610153776 maximum size of entire
>> process 719073280
>> [0] Memory usage sorted by function
>> [0] 6 192 DMCoarsenHookAdd()
>> [0] 2 9984 DMCreate()
>> [0] 2 128 DMCreate_Shell()
>> [0] 2 64 DMDSEnlarge_Static()
>> [0] 1 672 DMKSPCreate()
>> [0] 3 96 DMRefineHookAdd()
>> [0] 3 2064 DMSNESCreate()
>> [0] 4 128 DMSubDomainHookAdd()
>> [0] 1 768 DMTSCreate()
>> [0] 2 96 ISColoringCreate()
>> [0] 8 12608 ISColoringGetIS()
>> [0] 1 307200 ISConcatenate()
>> [0] 29 25984 ISCreate()
>> [0] 25 400 ISCreate_General()
>> [0] 4 64 ISCreate_Stride()
>> [0] 20 338016 ISGeneralSetIndices_General()
>> [0] 3 921600 ISGetIndices_Stride()
>> [0] 2 307232 ISGlobalToLocalMappingSetUp_Basic()
>> [0] 1 6144 ISInvertPermutation_General()
>> [0] 3 308576 ISLocalToGlobalMappingCreate()
>> [0] 2 32 KSPConvergedDefaultCreate()
>> [0] 2 2816 KSPCreate()
>> [0] 1 224 KSPCreate_FGMRES()
>> [0] 1 8016 KSPGMRESClassicalGramSchmidtOrthogonalization()
>> [0] 2 16032 KSPSetUp_FGMRES()
>> [0] 4 16084160 KSPSetUp_GMRES()
>> [0] 2 36864 MatColoringApply_SL()
>> [0] 1 656 MatColoringCreate()
>> [0] 6 17088 MatCreate()
>> [0] 1 16 MatCreateMFFD_WP()
>> [0] 1 16 MatCreateSubMatrices_SeqBAIJ()
>> [0] 1 12288 MatCreateSubMatrix_SeqBAIJ()
>> [0] 3 32320 MatCreateSubMatrix_SeqBAIJ_Private()
>> [0] 2 1472 MatCreate_MFFD()
>> [0] 1 416 MatCreate_SeqAIJ()
>> [0] 3 864 MatCreate_SeqBAIJ()
>> [0] 2 416 MatCreate_Shell()
>> [0] 1 784 MatFDColoringCreate()
>> [0] 2 12288 MatFDColoringDegreeSequence_Minpack()
>> [0] 6 30859392 MatFDColoringSetUp_SeqXAIJ()
>> [0] 3 42512 MatGetColumnIJ_SeqAIJ()
>> [0] 4 72720 MatGetColumnIJ_SeqBAIJ_Color()
>> [0] 1 6144 MatGetOrdering_Natural()
>> [0] 2 36384 MatGetRowIJ_SeqAIJ()
>> [0] 7 210626000 MatILUFactorSymbolic_SeqBAIJ()
>> [0] 2 313376 MatIncreaseOverlap_SeqBAIJ()
>> [0] 2 30740608 MatLUFactorNumeric_SeqBAIJ_N()
>> [0] 1 6144 MatMarkDiagonal_SeqAIJ()
>> [0] 1 6144 MatMarkDiagonal_SeqBAIJ()
>> [0] 8 256 MatRegisterRootName()
>> [0] 1 6160 MatSeqAIJCheckInode()
>> [0] 4 115216 MatSeqAIJSetPreallocation_SeqAIJ()
>> [0] 4 302779424 MatSeqBAIJSetPreallocation_SeqBAIJ()
>> [0] 13 576 MatSolverTypeRegister()
>> [0] 1 16 PCASMCreateSubdomains()
>> [0] 2 1664 PCCreate()
>> [0] 1 160 PCCreate_ASM()
>> [0] 1 192 PCCreate_ILU()
>> [0] 5 307264 PCSetUp_ASM()
>> [0] 2 416 PetscBTCreate()
>> [0] 2 3216 PetscClassPerfLogCreate()
>> [0] 2 1616 PetscClassRegLogCreate()
>> [0] 2 32 PetscCommBuildTwoSided_Allreduce()
>> [0] 2 64 PetscCommDuplicate()
>> [0] 2 1888 PetscDSCreate()
>> [0] 2 26416 PetscEventPerfLogCreate()
>> [0] 2 158400 PetscEventPerfLogEnsureSize()
>> [0] 2 1616 PetscEventRegLogCreate()
>> [0] 2 9600 PetscEventRegLogRegister()
>> [0] 8 102400 PetscFreeSpaceGet()
>> [0] 474 15168 PetscFunctionListAdd_Private()
>> [0] 2 528 PetscIntStackCreate()
>> [0] 142 11360 PetscLayoutCreate()
>> [0] 56 896 PetscLayoutSetUp()
>> [0] 59 9440 PetscObjectComposedDataIncreaseReal()
>> [0] 2 576 PetscObjectListAdd()
>> [0] 33 768 PetscOptionsGetEList()
>> [0] 1 16 PetscOptionsHelpPrintedCreate()
>> [0] 1 32 PetscPushSignalHandler()
>> [0] 7 6944 PetscSFCreate()
>> [0] 3 432 PetscSFCreate_Basic()
>> [0] 2 1472 PetscSFLinkCreate()
>> [0] 11 1229040 PetscSFSetUpRanks()
>> [0] 7 614512 PetscSFSetUp_Basic()
>> [0] 4 20096 PetscSegBufferCreate()
>> [0] 2 1488 PetscSplitReductionCreate()
>> [0] 2 3008 PetscStageLogCreate()
>> [0] 1148 23872 PetscStrallocpy()
>> [0] 6 13056 PetscStrreplace()
>> [0] 9 3456 PetscTableCreate()
>> [0] 1 16 PetscViewerASCIIOpen()
>> [0] 6 96 PetscViewerAndFormatCreate()
>> [0] 1 752 PetscViewerCreate()
>> [0] 1 96 PetscViewerCreate_ASCII()
>> [0] 2 1424 SNESCreate()
>> [0] 1 16 SNESCreate_NEWTONLS()
>> [0] 1 1008 SNESLineSearchCreate()
>> [0] 1 16 SNESLineSearchCreate_BT()
>> [0] 16 1824 SNESMSRegister()
>> [0] 46 9056 TSARKIMEXRegister()
>> [0] 1 1264 TSAdaptCreate()
>> [0] 8 384 TSBasicSymplecticRegister()
>> [0] 1 2160 TSCreate()
>> [0] 1 224 TSCreate_Theta()
>> [0] 48 5968 TSGLEERegister()
>> [0] 41 7728 TSRKRegister()
>> [0] 89 14736 TSRosWRegister()
>> [0] 71 110192 VecCreate()
>> [0] 1 307200 VecCreateGhostWithArray()
>> [0] 123 36874080 VecCreate_MPI_Private()
>> [0] 7 4300800 VecCreate_Seq()
>> [0] 8 256 VecCreate_Seq_Private()
>> [0] 6 400 VecDuplicateVecs_Default()
>> [0] 3 2352 VecScatterCreate()
>> [0] 7 1843296 VecScatterSetUp_SF()
>> [0] 126 2016 VecStashCreate_Private()
>> [0] 1 3072 mapBlockColoringToJacobian()
>>
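The "Memory usage sorted by function" table above is easier to scan when ranked by bytes. A rough Python sketch follows, assuming the three columns after the rank are allocation count, total bytes, and function name (consistent with, for example, 29 ISCreate() entries at 896 bytes each giving 25984). `top_allocators` is a hypothetical helper, not a PETSc tool, and it expects the quote prefixes to have been stripped from the pasted lines.

```python
import re

# Assumed row shape: "[rank] count bytes Function()".
ROW_RE = re.compile(r"^\[(\d+)\]\s+(\d+)\s+(\d+)\s+(\w+)\(\)$")


def top_allocators(table_text, n=3):
    """Return the n rows with the most bytes as (function, count, bytes)."""
    rows = []
    for line in table_text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            rows.append((m.group(4), int(m.group(2)), int(m.group(3))))
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


# A few rows copied from the table above; the header line is skipped by the regex.
sample = """\
[0] Memory usage sorted by function
[0] 29 25984 ISCreate()
[0] 7 210626000 MatILUFactorSymbolic_SeqBAIJ()
[0] 4 302779424 MatSeqBAIJSetPreallocation_SeqBAIJ()
[0] 123 36874080 VecCreate_MPI_Private()
[0] 6 30859392 MatFDColoringSetUp_SeqXAIJ()
"""

top = top_allocators(sample)
for name, count, nbytes in top:
    print(f"{nbytes / 2**20:10.1f} MiB  {count:4d} allocs  {name}")
```

On these rows the matrix preallocation and the ILU symbolic factorization dominate, while the IS-related entries are comparatively tiny.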
>> On Wed, Aug 12, 2020 at 4:22 PM Barry Smith <bsmith at petsc.dev> wrote:
>>
>>>
>>> Yes, there are some PETSc objects or arrays that you are not freeing,
>>> so they are printed at the end of the run. For small runs this is harmless,
>>> but if new objects or memory are allocated at each iteration and not
>>> suitably freed, it will eventually add up.
>>>
>>> Run with -malloc_view (on a small problem with, say, 2 iterations); it
>>> will print everything allocated and might be helpful.
>>>
>>> Perhaps you are calling ISColoringGetIS() and not calling
>>> ISColoringRestoreIS()?
>>>
>>> It is also possible it is a leak in PETSc, but that is unlikely since
>>> we test for them.
>>>
>>> Are you using Fortran?
>>>
>>> Barry
>>>
>>>
>>> On Aug 12, 2020, at 1:29 PM, Mark Lohry <mlohry at gmail.com> wrote:
>>>
>>> Thanks Matt and Barry. At Matt's suggestion I ran a smaller
>>> representative case with valgrind and didn't see anything alarming (apart
>>> from a small leak in an older boost version I was using:
>>> https://github.com/boostorg/serialization/issues/104 although I don't
>>> think this was causing the issue).
>>>
>>> -malloc_debug dumps quite a lot; this is supposed to be empty, right?
>>> Output pasted below. It looks like the same sequence of calls is repeated 8
>>> times, which is how many nonlinear solves occurred in this particular run.
>>> Thoughts?
>>>
>>>
>>>
>>> [ 0]1408 bytes PetscSplitReductionCreate() line 63 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>>> [ 0]80 bytes PetscSplitReductionCreate() line 57 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
>>> [ 0]16 bytes PetscCommBuildTwoSided_Allreduce() line 169 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/mpits.c
>>> [ 0]16 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]272 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]880 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]960 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]976 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]1040 bytes ISGeneralSetIndices_General() line 578 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]16 bytes PetscLayoutSetUp() line 269 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]80 bytes PetscLayoutCreate() line 55 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
>>> [ 0]16 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]32 bytes PetscStrallocpy() line 187 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
>>> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
>>> [ 0]16 bytes ISCreate_General() line 647 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
>>> [ 0]896 bytes ISCreate() line 37 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
>>> [ 0]64 bytes ISColoringGetIS() line 266 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/utils/iscoloring.c
>>> [ 0]32 bytes PetscCommDuplicate() line 129 in
>>> /home/mlohry/dev/cmake-build/external/petsc/src/sys/objects/tagm.c
>>>
>>>
>>>
>>> On Wed, Aug 12, 2020 at 1:46 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>>
>>>> Mark.
>>>>
>>>> When valgrind is not feasible (like on many centrally controlled
>>>> batch systems) you can run PETSc with an extra flag to do some memory error
>>>> checks
>>>> -malloc_debug
>>>>
>>>> this
>>>>
>>>> 1) fills all malloced memory with NaN, so that use of
>>>> uninitialized memory may be detected, and
>>>> 2) checks the beginning and end of each allocated memory region for
>>>> out-of-bounds writes at each malloc and free.
>>>>
>>>> It will slow the code down a little bit but generally not a huge amount.
>>>>
>>>> It is nowhere near as good as valgrind or other memory corruption
>>>> tools, but it has the advantage that you can run it anywhere on any size job.
>>>>
>>>>
>>>> Barry
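
[Editor's sketch] The two checks described above (NaN fill and guard words at both ends of each allocation) can be illustrated with a small standalone C example. This is not PETSc's implementation; the names guard_malloc_doubles, guard_check, guard_free and the sentinel value GUARD_WORD are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel value; any unlikely bit pattern would do. */
#define GUARD_WORD 0xDEADBEEFCAFEF00DULL

/* Header stored in front of the payload: payload size plus the front guard. */
typedef struct {
  size_t   nbytes;
  uint64_t front;
} guard_header;

/* Allocate n doubles, fill them with NaN, and place guards on both sides. */
double *guard_malloc_doubles(size_t n)
{
  size_t        nbytes = n * sizeof(double);
  uint64_t      word   = GUARD_WORD;
  guard_header *h      = malloc(sizeof(*h) + nbytes + sizeof(word));
  double       *payload;
  size_t        i;

  if (!h) return NULL;
  h->nbytes = nbytes;
  h->front  = word;
  payload   = (double *)(h + 1);
  /* NaN fill: arithmetic on uninitialized entries propagates NaN. */
  for (i = 0; i < n; i++) payload[i] = NAN;
  /* Back guard sits immediately after the last payload byte. */
  memcpy((char *)payload + nbytes, &word, sizeof(word));
  return payload;
}

/* Return 0 if both guards are intact, 1 if the front guard was
   overwritten, 2 if the back guard was overwritten. */
int guard_check(const double *payload)
{
  const guard_header *h;
  uint64_t            back;

  assert(payload != NULL);
  h = (const guard_header *)payload - 1;
  if (h->front != GUARD_WORD) return 1;
  memcpy(&back, (const char *)payload + h->nbytes, sizeof(back));
  return back == GUARD_WORD ? 0 : 2;
}

/* Check the guards, release the block, and report the check result. */
int guard_free(double *payload)
{
  int status = guard_check(payload);
  free((guard_header *)payload - 1);
  return status;
}
```

In this sketch a write one element past the end of the array lands on the back guard, so the next guard_check or guard_free reports it. As with the option described above, corruption is only noticed at the next check, not at the moment of the bad write, which is why it is weaker than valgrind.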
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Aug 12, 2020, at 7:46 AM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>
>>>> On Wed, Aug 12, 2020 at 7:53 AM Mark Lohry <mlohry at gmail.com> wrote:
>>>>
>>>>> I'm getting seemingly random failures of late:
>>>>> Caught signal number 7 BUS: Bus Error, possibly illegal memory access
>>>>>
>>>>
>>>> The first thing I would do is run valgrind on as wide an array of tests
>>>> as you can. This will find problems
>>>> on things that run completely fine.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>>> Symptoms:
>>>>> 1) Seems to only happen (so far) on larger cases, 400-2000 cores
>>>>> 2) It doesn't happen right away -- this was running happily for
>>>>> several hours over several hundred time steps with no indication of bad
>>>>> health in the numerics
>>>>> 3) At least the total memory consumption seems to be within bounds,
>>>>> though I'm not sure about individual processes. e.g. slurm here reported
>>>>> Memory Efficiency: 75.23% of 1.76 TB (180.00 GB/node)
>>>>> 4) running the same setup twice it fails at different points
>>>>>
>>>>> Any suggestions on what to look for? This is a bit painful to work on
>>>>> as I can only reproduce it on large runs and then it's seemingly random.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Mark
>>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>
>>>>
>>>>
>>>
>