[petsc-users] Optimized run crashes on one machine but not another

Matthew Knepley knepley at gmail.com
Wed Aug 28 14:51:42 CDT 2013


On Wed, Aug 28, 2013 at 1:58 PM, Garnet Vaz <garnet.vaz at gmail.com> wrote:

> Hi Matt,
>
> Attached is a folder containing the code and a sample mesh.
>

I have built and run it here with the 'next' branch from today, and it does
not crash.
What branch are you using?

    Matt


> Thanks for the help.
>
> -
> Garnet
>
>
> On Wed, Aug 28, 2013 at 11:43 AM, Matthew Knepley <knepley at gmail.com>wrote:
>
>> On Wed, Aug 28, 2013 at 12:52 PM, Garnet Vaz <garnet.vaz at gmail.com>wrote:
>>
>>> Thanks, Jed. I did as you suggested, and the code now crashes on both
>>> builds. I have installed the 3.4.2 release.
>>>
>>> The problem now seems to come from DMPlexDistribute(). I have two
>>> ways of loading the mesh: one creates a mesh with Triangle through
>>> PETSc, and the other loads a mesh using DMPlexCreateFromCellList().
>>>
>>> Is the following piece of code for creating a mesh using Triangle right?
>>>
>>
>> Okay, something is really very wrong here. It is calling
>> EnlargePartition(), but for
>> that path to be taken, you have to trip an earlier exception. It should
>> not be possible
>> to call it. So I think you have memory corruption somewhere.
>>
>> Can you send a sample code we can run?
>>
>>   Thanks,
>>
>>       Matt
>>
>>
>>>   ierr = DMPlexCreateBoxMesh(comm,2,interpolate,&user->dm);CHKERRQ(ierr);
>>>   if (user->dm) {
>>>     DM        refinedMesh     = NULL;
>>>     DM        distributedMesh = NULL;
>>>     ierr = DMPlexSetRefinementLimit(user->dm,refinementLimit);CHKERRQ(ierr);
>>>     ierr = DMRefine(user->dm,PETSC_COMM_WORLD,&refinedMesh);CHKERRQ(ierr);
>>>     if (refinedMesh) {
>>>       ierr     = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>       user->dm = refinedMesh;
>>>     }
>>>     ierr = DMPlexDistribute(user->dm,"chaco",1,&distributedMesh);CHKERRQ(ierr);
>>>     if (distributedMesh) {
>>>       ierr     = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>       user->dm = distributedMesh;
>>>     }
>>>   }
>>>
>>> Using gdb, the code gives a SEGV during distribution. The backtrace when
>>> the fault
>>> occurs points to an invalid pointer for ISGetIndices(). Attached is a
>>> screenshot of the
>>> gdb backtrace.
>>> Do I need to set up some index set here?
>>>
>>> The same error occurs when trying to distribute a mesh using
>>> DMPlexCreateFromCellList().
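>>>
>>> For reference, that path looks roughly like this (a minimal sketch with
>>> two illustrative triangles; the real cells/coords arrays come from our
>>> mesh file, so the numbers here are just placeholders):
>>>
>>>   const int    cells[6]  = {0, 1, 2,  1, 3, 2};           /* 2 triangles */
>>>   const double coords[8] = {0.,0., 1.,0., 0.,1., 1.,1.};  /* 4 vertices  */
>>>   DM           distributedMesh = NULL;
>>>
>>>   /* dim = 2, numCells = 2, numVertices = 4, numCorners = 3, spaceDim = 2 */
>>>   ierr = DMPlexCreateFromCellList(comm, 2, 2, 4, 3, interpolate,
>>>                                   cells, 2, coords, &user->dm);CHKERRQ(ierr);
>>>   ierr = DMPlexDistribute(user->dm, "chaco", 1, &distributedMesh);CHKERRQ(ierr);
>>>   if (distributedMesh) {
>>>     ierr     = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>     user->dm = distributedMesh;
>>>   }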
>>>
>>> Thanks for the help.
>>>
>>>
>>> -
>>> Garnet
>>>
>>>
>>> On Wed, Aug 28, 2013 at 6:38 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>>>
>>>> Garnet Vaz <garnet.vaz at gmail.com> writes:
>>>>
>>>> > Hi,
>>>> >
>>>> > I just rebuilt PETSc on both my laptop and my desktop.
>>>> > On both machines the output of >grep GIT configure.log
>>>> >         Defined "VERSION_GIT" to
>>>> > ""d8f7425765acda418e23a679c25fd616d9da8153""
>>>> >         Defined "VERSION_DATE_GIT" to ""2013-08-27 10:05:35 -0500""
>>>>
>>>> Thanks for the report.  Matt just merged a bunch of DMPlex-related
>>>> branches (about 60 commits in total).  Can you 'git pull && make' to let
>>>> us know if the problem is still there?  (It may not fix the issue, but
>>>> at least we'll be debugging current code.)
>>>>
>>>> When dealing with debug vs. optimized issues, it's useful to configure
>>>> --with-debugging=0 COPTFLAGS='-O2 -g'.  This allows valgrind to include
>>>> line numbers, but it (usually!) does not affect whether the error
>>>> occurs.
>>>>
>>>> > My code runs on both machines in the debug build without causing
>>>> > any problems. When I try to run the optimized build, the code crashes
>>>> > with a SEGV fault on my laptop but not on the desktop. I have built
>>>> > PETSc using the same configure options.
>>>> >
>>>> > I have attached the outputs of valgrind for both my laptop/desktop for
>>>> > both the debug/opt builds. How can I figure out what differences are
>>>> > causing the errors in one case and not the other?
>>>>
>>>> It looks like an uninitialized variable.  Debug mode often ends up
>>>> initializing local variables, whereas optimized builds leave junk in them.
>>>> Stack allocation alignment/padding is also often different.
>>>> Unfortunately, valgrind is less powerful for debugging stack corruption,
>>>> so the uninitialized warning is usually the best you get.
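>>>>
>>>> Schematically (this is an illustration, not your code), the classic
>>>> pattern looks like:
>>>>
>>>>   static double sum_array(const double *a, int n)
>>>>   {
>>>>     double sum;                      /* BUG: never initialized */
>>>>     int    i;
>>>>     for (i = 0; i < n; ++i) sum += a[i];
>>>>     return sum;                      /* junk + sum of a[0..n-1] */
>>>>   }
>>>>
>>>> A debug build often happens to start from a zeroed stack slot, so the
>>>> bug stays hidden; an optimized build starts from whatever was left on
>>>> the stack, and you only notice when that junk is bad enough to crash.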
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Garnet
>>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>
>
>
> --
> Regards,
> Garnet
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener