[petsc-users] Optimized run crashes on one machine but not another
Garnet Vaz
garnet.vaz at gmail.com
Wed Aug 28 15:32:07 CDT 2013
Hi Matt,
I just ran 'git clone https://bitbucket.org/petsc/petsc' and built
the debug build. The code still crashes, now with a slightly
different backtrace. It looks like a request for a large (wrong)
amount of memory, which could come from some uninitialized value
I have lying about. I will look into this some more.
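
To illustrate the kind of failure suspected above, here is a minimal sketch in plain C (no PETSc calls) of checking an element count before it reaches the allocator. Everything in it is hypothetical and for illustration only: the names count_is_plausible, checked_alloc and run_demo, and the MAX_REASONABLE_COUNT bound, are assumptions and not part of PETSc or of the code in this thread. The idea is that a count read from an uninitialized variable is often negative or absurdly large, so rejecting such values early turns a confusing crash inside the allocator into an obvious failure at the call site.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical upper bound on a sane element count for this sketch. */
#define MAX_REASONABLE_COUNT ((unsigned long)1 << 28)

/* Returns 1 if count elements of elem_size bytes is a plausible request. */
static int count_is_plausible(long count, size_t elem_size)
{
  if (count < 0) return 0;                                   /* junk often shows up as negative */
  if (elem_size == 0) return 0;
  if ((unsigned long)count > MAX_REASONABLE_COUNT) return 0; /* absurdly large count */
  if ((size_t)count > SIZE_MAX / elem_size) return 0;        /* byte size would overflow */
  return 1;
}

/* Allocates zeroed storage, or returns NULL when the request looks like garbage. */
static void *checked_alloc(long count, size_t elem_size)
{
  if (!count_is_plausible(count, elem_size)) return NULL;
  /* Allocate at least one element so a zero count still yields a valid pointer. */
  return calloc(count ? (size_t)count : 1, elem_size);
}

/* Small demonstration; LONG_MAX stands in for junk left in an uninitialized variable. */
static void run_demo(void)
{
  long    garbage = LONG_MAX;
  double *ok      = checked_alloc(100, sizeof(double));

  assert(ok != NULL);
  assert(checked_alloc(garbage, sizeof(double)) == NULL);
  assert(checked_alloc(-1, sizeof(double)) == NULL);
  free(ok);
}
```

Calling run_demo() from main exercises one valid request and two implausible ones. The check does not find the uninitialized variable itself; it only makes the symptom show up closer to its source.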
Attached is the configure.log file for my current build.
-
Garnet
On Wed, Aug 28, 2013 at 1:08 PM, Matthew Knepley <knepley at gmail.com> wrote:
> On Wed, Aug 28, 2013 at 3:04 PM, Garnet Vaz <garnet.vaz at gmail.com> wrote:
>
>> Hi Matt,
>>
>> I just built the 3.4.2 release in the hope that it will work. It was
>> working fine for the 'next' branch until a recent update last night.
>> I updated my laptop and desktop about half an hour apart, and after
>> that one build crashed while the other did not. Hence, I moved to the
>> 3.4.2 release.
>>
>> I will rebuild using the current 'next' and let you know if there are any
>> problems.
>>
>
> Can you send configure.log? I built against OpenMPI and it looks like I
> get a similar error, which is not there with MPICH. Trying to confirm now.
>
> Matt
>
>
>> Thanks.
>>
>> -
>> Garnet
>>
>>
>>
>> On Wed, Aug 28, 2013 at 12:51 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Wed, Aug 28, 2013 at 1:58 PM, Garnet Vaz <garnet.vaz at gmail.com> wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> Attached is a folder containing the code and a sample mesh.
>>>>
>>>
>>> I have built and run it here with the 'next' branch from today, and it
>>> does not crash.
>>> What branch are you using?
>>>
>>> Matt
>>>
>>>
>>>> Thanks for the help.
>>>>
>>>> -
>>>> Garnet
>>>>
>>>>
>>>> On Wed, Aug 28, 2013 at 11:43 AM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>
>>>>> On Wed, Aug 28, 2013 at 12:52 PM, Garnet Vaz <garnet.vaz at gmail.com> wrote:
>>>>>
>>>>>> Thanks Jed. I did as you told and the code finally crashes on both
>>>>>> builds. I installed the 3.4.2 release now.
>>>>>>
>>>>>> The problem now seems to come from DMPlexDistribute(). I have two
>>>>>> versions to load the mesh. One creates a mesh using Triangle
>>>>>> from PETSc and the other loads a mesh using
>>>>>> DMPlexCreateFromCellList().
>>>>>>
>>>>>> Is the following piece of code for creating a mesh using Triangle
>>>>>> right?
>>>>>>
>>>>>
>>>>> Okay, something is really very wrong here. It is calling
>>>>> EnlargePartition(), but for that path to be taken, you have to
>>>>> trip an earlier exception. It should not be possible to call it.
>>>>> So I think you have memory corruption somewhere.
>>>>>
>>>>> Can you send a sample code we can run?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>>> ierr = DMPlexCreateBoxMesh(comm,2,interpolate,&user->dm);CHKERRQ(ierr);
>>>>>> if (user->dm) {
>>>>>>   DM refinedMesh = NULL;
>>>>>>   DM distributedMesh = NULL;
>>>>>>   ierr = DMPlexSetRefinementLimit(user->dm,refinementLimit);CHKERRQ(ierr);
>>>>>>   ierr = DMRefine(user->dm,PETSC_COMM_WORLD,&refinedMesh);CHKERRQ(ierr);
>>>>>>   if (refinedMesh) {
>>>>>>     ierr = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>>>>     user->dm = refinedMesh;
>>>>>>   }
>>>>>>   ierr = DMPlexDistribute(user->dm,"chaco",1,&distributedMesh);CHKERRQ(ierr);
>>>>>>   if (distributedMesh) {
>>>>>>     ierr = DMDestroy(&user->dm);CHKERRQ(ierr);
>>>>>>     user->dm = distributedMesh;
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Using gdb, the code gives a SEGV during distribution. The backtrace
>>>>>> when the fault occurs points to an invalid pointer for
>>>>>> ISGetIndices(). Attached is a screenshot of the gdb backtrace.
>>>>>> Do I need to set up some index set here?
>>>>>>
>>>>>> The same error occurs when trying to distribute a mesh using
>>>>>> DMPlexCreateFromCellList().
>>>>>>
>>>>>> Thanks for the help.
>>>>>>
>>>>>>
>>>>>> -
>>>>>> Garnet
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 28, 2013 at 6:38 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>>>>>>
>>>>>>> Garnet Vaz <garnet.vaz at gmail.com> writes:
>>>>>>>
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > I just rebuilt PETSc on both my laptop and my desktop.
>>>>>>> > On both machines the output of 'grep GIT configure.log' is:
>>>>>>> > Defined "VERSION_GIT" to ""d8f7425765acda418e23a679c25fd616d9da8153""
>>>>>>> > Defined "VERSION_DATE_GIT" to ""2013-08-27 10:05:35 -0500""
>>>>>>>
>>>>>>> Thanks for the report. Matt just merged a bunch of DMPlex-related
>>>>>>> branches (about 60 commits in total). Can you 'git pull && make' to let
>>>>>>> us know if the problem is still there? (It may not fix the issue, but
>>>>>>> at least we'll be debugging current code.)
>>>>>>>
>>>>>>> When dealing with debug vs. optimized issues, it's useful to configure
>>>>>>> --with-debugging=0 COPTFLAGS='-O2 -g'. This allows valgrind to include
>>>>>>> line numbers, but it (usually!) does not affect whether the error
>>>>>>> occurs.
>>>>>>>
>>>>>>> > My code runs on both machines in the debug build without causing
>>>>>>> > any problems. When I try to run the optimized build, the code crashes
>>>>>>> > with a SEGV fault on my laptop but not on the desktop. I have built
>>>>>>> > PETSc using the same configure options.
>>>>>>> >
>>>>>>> > I have attached the outputs of valgrind for both my laptop/desktop for
>>>>>>> > both the debug/opt builds. How can I figure out what differences are
>>>>>>> > causing the errors in one case and not the other?
>>>>>>>
>>>>>>> It looks like an uninitialized variable. Debug mode often ends up
>>>>>>> initializing local variables, whereas optimized leaves junk in them.
>>>>>>> Stack allocation alignment/padding is also often different.
>>>>>>> Unfortunately, valgrind is less powerful for debugging stack corruption,
>>>>>>> so the uninitialized warning is usually the best you get.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Garnet
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Garnet
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Regards,
>> Garnet
>>
>
>
>
>
--
Regards,
Garnet
-------------- next part --------------
A non-text attachment was scrubbed...
Name: confi.tar.gz
Type: application/x-gzip
Size: 338725 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20130828/81949437/attachment-0001.bin>