[petsc-users] Bus Error

Barry Smith bsmith at petsc.dev
Wed Aug 12 15:22:02 CDT 2020


   Yes, there are some PETSc objects or arrays that you are not freeing so they are printed at the end of the run. For small runs this harmless but if new objects/memory is allocated at each iteration and not suitably freed it will eventually add up.

    Run with -malloc_view (small problem with say 2 iterations) it will print everything allocated and might be helpful.

   Perhaps you are calling ISColoringGetIS() and not calling ISColoringRestoreIS()? 

   It is also possible it is a leak in PETSc, but that is unlikely since we test for them.

   Are you using Fortran? 

  Barry


> On Aug 12, 2020, at 1:29 PM, Mark Lohry <mlohry at gmail.com> wrote:
> 
> Thanks Matt and Barry. At Matt's suggestion I ran a smaller representative case with valgrind and didn't see anything alarming (apart from a small leak in an older boost version I was using: https://github.com/boostorg/serialization/issues/104 <https://github.com/boostorg/serialization/issues/104>  although I don't think this was causing the issue).
> 
> -malloc_debug dumps quite a lot, this is supposed to be empty right? Output pasted below. It looks like the same sequence of calls is repeated 8 times, which is how many nonlinear solves occurred in this particular run. Thoughts?
> 
> 
> 
> [ 0]1408 bytes PetscSplitReductionCreate() line 63 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
> [ 0]80 bytes PetscSplitReductionCreate() line 57 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
> [ 0]16 bytes PetscCommBuildTwoSided_Allreduce() line 169 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/mpits.c
> [ 0]16 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]272 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]880 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]960 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]976 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]1040 bytes ISGeneralSetIndices_General() line 578 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]64 bytes ISColoringGetIS() line 266 in /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/utils/iscoloring.c
> [ 0]32 bytes PetscCommDuplicate() line 129 in /home/mlohry/dev/cmake-build/external/petsc/src/sys/objects/tagm.c
> 
> 
> 
> On Wed, Aug 12, 2020 at 1:46 PM Barry Smith <bsmith at petsc.dev <mailto:bsmith at petsc.dev>> wrote:
> 
>    Mark.
> 
>     When valgrind is not feasible (like on many centrally controlled batch systems) you can run PETSc with an extra flag to do some memory error checks
>  -malloc_debug
> 
>  this 
> 
> 1) fills all malloced memory with Nan so if the code is using uninitialized memory it may be detected and 
> 2) checks the beginning and end of each alloced memory region for out-of-bounds writes at each malloc and free.
> 
> it will slow the code down a little bit but generally not a huge amount.
> 
> It is no where near as good as valgrind or other memory corruption tools but it has the advantage you can run it anywhere on any size job.
> 
> 
>   Barry
> 
> 
> 
> 
> 
>> On Aug 12, 2020, at 7:46 AM, Matthew Knepley <knepley at gmail.com <mailto:knepley at gmail.com>> wrote:
>> 
>> On Wed, Aug 12, 2020 at 7:53 AM Mark Lohry <mlohry at gmail.com <mailto:mlohry at gmail.com>> wrote:
>> I'm getting seemingly random failures of late:
>> Caught signal number 7 BUS: Bus Error, possibly illegal memory access
>> 
>> The first thing I would do is run valgrind on as wide an array of tests as you can. This will find problems
>> on things that run completely fine.
>> 
>>   Thanks,
>> 
>>      Matt
>>  
>> Symptoms:
>> 1) Seems to only happen (so far) on larger cases, 400-2000 cores
>> 2) It doesn't happen right away -- this was running happily for several hours over several hundred time steps with no indication of bad health in the numerics
>> 3) At least the total memory consumption seems to be within bounds, though I'm not sure about individual processes. e.g. slurm here reported Memory Efficiency: 75.23% of 1.76 TB (180.00 GB/node)
>> 4) running the same setup twice it fails at different points
>> 
>> Any suggestions on what to look for? This is a bit painful to work on as I can only reproduce it on large runs and then it's seemingly random.
>> 
>> 
>> Thanks,
>> Mark
>> 
>> 
>> -- 
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>> 
>> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200812/ec0b47ec/attachment.html>


More information about the petsc-users mailing list