[petsc-users] Optimized run crashes on one machine but not another

Garnet Vaz garnet.vaz at gmail.com
Wed Aug 28 16:49:49 CDT 2013


Hi Matt,

Within gdb, how can I view an IS? I tried 'call ISView(*partition,0)',
following the VecView() syntax, but it causes a segmentation fault
inside gdb.
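
For reference, this is roughly what I am typing (assuming 'partition' is
declared as 'IS *partition' in the frame I am stopped in; if it is a plain
'IS', I suppose the dereference should be dropped):

  (gdb) print partition
  (gdb) print *partition
  (gdb) call ISView(*partition, 0)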

-
Garnet


On Wed, Aug 28, 2013 at 2:02 PM, Matthew Knepley <knepley at gmail.com> wrote:

> On Wed, Aug 28, 2013 at 3:32 PM, Garnet Vaz <garnet.vaz at gmail.com> wrote:
>
>> Hi Matt,
>>
>> I just ran git clone https://bitbucket.org/petsc/petsc and built
>> the debug build. The code still crashes, now with a slightly
>> different backtrace. It looks like a request for a large (wrong)
>> amount of memory, which could come from some uninitialized value
>> I have lying about. I will look into this some more.
>>
>
> It would really help if you could track this down in the debugger. I am
> not getting that here. You would think I would get an uninitialized-variable
> report from the compiler.
>
>   Thanks,
>
>      Matt
>
>
>> Attached is the configure.log file for my current build.
>>
>> -
>> Garnet
>>
>>
>>
>> On Wed, Aug 28, 2013 at 1:08 PM, Matthew Knepley <knepley at gmail.com>wrote:
>>
>>> On Wed, Aug 28, 2013 at 3:04 PM, Garnet Vaz <garnet.vaz at gmail.com>wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> I just built the 3.4.2 release in the hope that it will work. It was
>>>> working fine with the 'next' branch until a recent update last night.
>>>> I updated my laptop and desktop about half an hour apart, which caused
>>>> crashes in one build but not in the other. Hence, I moved to the
>>>> 3.4.2 release.
>>>>
>>>> I will rebuild using the current 'next' and let you know if there are
>>>> any problems.
>>>>
>>>
>>> Can you send configure.log? I built against OpenMPI and it looks like I
>>> get a similar error which is not there with MPICH. Trying to confirm now.
>>>
>>>    Matt
>>>
>>>
>>>> Thanks.
>>>>
>>>> -
>>>> Garnet
>>>>
>>>>
>>>>
>>>> On Wed, Aug 28, 2013 at 12:51 PM, Matthew Knepley <knepley at gmail.com>wrote:
>>>>
>>>>> On Wed, Aug 28, 2013 at 1:58 PM, Garnet Vaz <garnet.vaz at gmail.com>wrote:
>>>>>
>>>>>> Hi Matt,
>>>>>>
>>>>>> Attached is a folder containing the code and a sample mesh.
>>>>>>
>>>>>
>>>>> I have built and run it here with the 'next' branch from today, and it
>>>>> does not crash.
>>>>> What branch are you using?
>>>>>
>>>>>     Matt
>>>>>
>>>>>
>>>>>> Thanks for the help.
>>>>>>
>>>>>> -
>>>>>> Garnet
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 28, 2013 at 11:43 AM, Matthew Knepley <knepley at gmail.com>wrote:
>>>>>>
>>>>>>> On Wed, Aug 28, 2013 at 12:52 PM, Garnet Vaz <garnet.vaz at gmail.com>wrote:
>>>>>>>
>>>>>>>> Thanks Jed. I did as you suggested and the code now crashes on both
>>>>>>>> builds. I have installed the 3.4.2 release.
>>>>>>>>
>>>>>>>> The problem now seems to come from DMPlexDistribute(). I have two
>>>>>>>> ways of loading the mesh: one creates a mesh using Triangle through
>>>>>>>> PETSc, and the other loads a mesh using
>>>>>>>> DMPlexCreateFromCellList().
>>>>>>>>
>>>>>>>> Is the following piece of code for creating a mesh using Triangle
>>>>>>>> right?
>>>>>>>>
>>>>>>>
>>>>>>> Okay, something is really very wrong here. It is calling
>>>>>>> EnlargePartition(), but for that path to be taken, you have to trip an
>>>>>>> earlier exception. It should not be possible to call it. So I think you
>>>>>>> have memory corruption somewhere.
>>>>>>>
>>>>>>> Can you send a sample code we can run?
>>>>>>>
>>>>>>>   Thanks,
>>>>>>>
>>>>>>>       Matt
>>>>>>>
>>>>>>>
>>>>>>>>   ierr = DMPlexCreateBoxMesh(comm,2,interpolate,&user->dm);CHKERRQ(ierr);
>>>>>>>>   if (user->dm) {
>>>>>>>>     DM refinedMesh     = NULL;
>>>>>>>>     DM distributedMesh = NULL;
>>>>>>>>     ierr = DMPlexSetRefinementLimit(user->dm,refinementLimit);CHKERRQ(ierr);
>>>>>>>>     ierr = DMRefine(user->dm,PETSC_COMM_WORLD,&refinedMesh);CHKERRQ(ierr);
>>>>>>>>     if (refinedMesh) {
>>>>>>>>       ierr     = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>>>>>>       user->dm = refinedMesh;
>>>>>>>>     }
>>>>>>>>     ierr = DMPlexDistribute(user->dm,"chaco",1,&distributedMesh);CHKERRQ(ierr);
>>>>>>>>     if (distributedMesh) {
>>>>>>>>       ierr     = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>>>>>>       user->dm = distributedMesh;
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>>
>>>>>>>> Under gdb, the code crashes with a SEGV during distribution. The
>>>>>>>> backtrace at the fault points to an invalid pointer passed to
>>>>>>>> ISGetIndices(). Attached is a screenshot of the gdb backtrace.
>>>>>>>> Do I need to set up some index set here?
>>>>>>>>
>>>>>>>> The same error occurs when distributing a mesh loaded with
>>>>>>>> DMPlexCreateFromCellList().
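>>>>>>>>
>>>>>>>> In case it helps, a rough sketch of that second path as I have it (the
>>>>>>>> exact DMPlexCreateFromCellList() argument order here is from memory, so
>>>>>>>> treat it as an assumption; numCells, numVertices, numCorners, cells and
>>>>>>>> coords come from my own mesh reader):
>>>>>>>>
>>>>>>>>   DM distributedMesh = NULL;
>>>>>>>>   /* Build the serial DMPlex from raw cell/vertex arrays */
>>>>>>>>   ierr = DMPlexCreateFromCellList(comm,2,numCells,numVertices,numCorners,
>>>>>>>>                                   interpolate,cells,2,coords,&user->dm);CHKERRQ(ierr);
>>>>>>>>   /* Distribute it, same as in the Triangle version above */
>>>>>>>>   ierr = DMPlexDistribute(user->dm,"chaco",1,&distributedMesh);CHKERRQ(ierr);
>>>>>>>>   if (distributedMesh) {
>>>>>>>>     ierr     = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>>>>>>     user->dm = distributedMesh;
>>>>>>>>   }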
>>>>>>>>
>>>>>>>> Thanks for the help.
>>>>>>>>
>>>>>>>>
>>>>>>>> -
>>>>>>>> Garnet
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 28, 2013 at 6:38 AM, Jed Brown <jedbrown at mcs.anl.gov>wrote:
>>>>>>>>
>>>>>>>>> Garnet Vaz <garnet.vaz at gmail.com> writes:
>>>>>>>>>
>>>>>>>>> > Hi,
>>>>>>>>> >
>>>>>>>>> > I just rebuilt PETSc on both my laptop and my desktop.
>>>>>>>>> > On both machines the output of >grep GIT configure.log
>>>>>>>>> >         Defined "VERSION_GIT" to
>>>>>>>>> > ""d8f7425765acda418e23a679c25fd616d9da8153""
>>>>>>>>> >         Defined "VERSION_DATE_GIT" to ""2013-08-27 10:05:35
>>>>>>>>> -0500""
>>>>>>>>>
>>>>>>>>> Thanks for the report.  Matt just merged a bunch of DMPlex-related
>>>>>>>>> branches (about 60 commits in total).  Can you 'git pull && make'
>>>>>>>>> to let
>>>>>>>>> us know if the problem is still there?  (It may not fix the issue,
>>>>>>>>> but
>>>>>>>>> at least we'll be debugging current code.)
>>>>>>>>>
>>>>>>>>> When dealing with debug vs. optimized issues, it's useful to
>>>>>>>>> configure
>>>>>>>>> --with-debugging=0 COPTFLAGS='-O2 -g'.  This allows valgrind to
>>>>>>>>> include
>>>>>>>>> line numbers, but it (usually!) does not affect whether the error
>>>>>>>>> occurs.
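>>>>>>>>>
>>>>>>>>> For example, something along these lines (keeping whatever other
>>>>>>>>> options you already pass to configure):
>>>>>>>>>
>>>>>>>>>   ./configure --with-debugging=0 COPTFLAGS='-O2 -g' [your other options]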
>>>>>>>>>
>>>>>>>>> > My code runs on both machines in the debug build without causing
>>>>>>>>> > any problems. When I try to run the optimized build, the code
>>>>>>>>> crashes
>>>>>>>>> > with a SEGV fault on my laptop but not on the desktop. I have
>>>>>>>>> built
>>>>>>>>> > PETSc using the same configure options.
>>>>>>>>> >
>>>>>>>>> > I have attached the outputs of valgrind for both my
>>>>>>>>> laptop/desktop for
>>>>>>>>> > both the debug/opt builds. How can I figure out what differences
>>>>>>>>> are
>>>>>>>>> > causing the errors in one case and not the other?
>>>>>>>>>
>>>>>>>>> It looks like an uninitialized variable.  Debug mode often ends up
>>>>>>>>> initializing local variables, whereas optimized builds leave junk in
>>>>>>>>> them.
>>>>>>>>> Stack allocation alignment/padding is also often different.
>>>>>>>>> Unfortunately, valgrind is less powerful for debugging stack
>>>>>>>>> corruption,
>>>>>>>>> so the uninitialized warning is usually the best you get.
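>>>>>>>>>
>>>>>>>>> (For what it's worth, a run along these lines, with --track-origins=yes,
>>>>>>>>> makes valgrind report where an uninitialized value came from, at the
>>>>>>>>> cost of some speed; the executable name and options below are just
>>>>>>>>> placeholders:)
>>>>>>>>>
>>>>>>>>>   mpiexec -n 2 valgrind --track-origins=yes ./yourapp -your_options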
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>> Garnet
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted before they begin their
>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>> experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Garnet
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Garnet
>>>>
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>
>>
>>
>> --
>> Regards,
>> Garnet
>>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>



-- 
Regards,
Garnet

