[petsc-users] SuperLU_dist issue in 3.7.4

Kong, Fande fande.kong at inl.gov
Mon Oct 10 11:38:05 CDT 2016


I am working on reproducing the behavior. It is hard to reproduce because it
occurs randomly. Two types of message show up:

(1) Segmentation fault 11.

(2) On entry to DGEMM parameter number 10 had an illegal value


If we run under a debugger, the code always runs fine.
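
For reference on message (2): in the standard DGEMM argument list
(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC), parameter
number 10 is LDB, the leading dimension of B. Below is a minimal Python sketch
of the argument checks as done in the reference BLAS implementation, only to
show which inputs trigger that message. The helper name dgemm_arg_check is
made up for illustration, and optimized BLAS libraries may differ in details.
It does not locate the bug; a garbage LDB value would be consistent with the
random behavior, but that is an assumption.

```python
# Sketch of the input checks in reference BLAS DGEMM:
#   DGEMM(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC)
# Returns the 1-based position of the first illegal argument, or 0 if all
# checked arguments are legal. dgemm_arg_check is a hypothetical helper name,
# not part of BLAS, PETSc, or SuperLU_DIST.

def dgemm_arg_check(transa, transb, m, n, k, lda, ldb, ldc):
    ta = transa.upper()
    tb = transb.upper()
    nota = ta == "N"
    notb = tb == "N"
    # Number of rows of A and B as stored, depending on transposition.
    nrowa = m if nota else k
    nrowb = k if notb else n

    if ta not in ("N", "T", "C"):
        return 1
    if tb not in ("N", "T", "C"):
        return 2
    if m < 0:
        return 3
    if n < 0:
        return 4
    if k < 0:
        return 5
    if lda < max(1, nrowa):
        return 8
    if ldb < max(1, nrowb):
        return 10  # "parameter number 10 had an illegal value"
    if ldc < max(1, m):
        return 13
    return 0
```

For example, with TRANSB = 'N' and K = 3, any LDB smaller than 3 is reported
as parameter 10.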

PS: Anton, do you have a pure PETSc code that reproduces this?


Fande,

On Mon, Oct 10, 2016 at 10:13 AM, Xiaoye S. Li <xsli at lbl.gov> wrote:

> Which version of superlu_dist does this capture?   I looked at the
> original error  log, it pointed to pdgssvx: line 161.  But that line is in
> comment block, not the program.
>
> Sherry
>
>
> On Mon, Oct 10, 2016 at 7:27 AM, Anton Popov <popov at uni-mainz.de> wrote:
>
>>
>>
>> On 10/07/2016 05:23 PM, Satish Balay wrote:
>>
>>> On Fri, 7 Oct 2016, Kong, Fande wrote:
>>>
>>>> On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay <balay at mcs.anl.gov> wrote:
>>>>
>>>>> On Fri, 7 Oct 2016, Anton Popov wrote:
>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> are there any news about fixing buggy behavior of SuperLU_DIST, exactly
>>>>>> what is described here:
>>>>>>
>>>>>> http://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026802.html ?
>>>>>>
>>>>>> I'm using 3.7.4 and still get SEGV in pdgssvx routine. Everything works
>>>>>> fine with 3.5.4.
>>>>>>
>>>>>> Do I still have to stick to maint branch, and what are the chances for
>>>>>> these fixes to be included in 3.7.5?
>>>>>>
>>>>> 3.7.4. is off maint branch [as of a week ago]. So if you are seeing
>>>>> issues with it - its best to debug and figure out the cause.
>>>>>
>>>> This bug is indeed inside of superlu_dist, and we started having this
>>>> issue from PETSc-3.6.x. I think superlu_dist developers should have fixed
>>>> this bug. We forgot to update superlu_dist??  This is not a thing users
>>>> could debug and fix.
>>>>
>>>> I have many people in INL suffering from this issue, and they have to
>>>> stay with PETSc-3.5.4 to use superlu_dist.
>>>>
>>> To verify if the bug is fixed in latest superlu_dist - you can try
>>> [assuming you have git - either from petsc-3.7/maint/master]:
>>>
>>> --download-superlu_dist --download-superlu_dist-commit=origin/maint
>>>
>>>
>>> Satish
>>>
>> Hi Satish,
>> I did this:
>>
>> git clone -b maint https://bitbucket.org/petsc/petsc.git petsc
>>
>> --download-superlu_dist
>> --download-superlu_dist-commit=origin/maint (not sure this is needed,
>> since I'm already in maint)
>>
>> The problem is still there.
>>
>> Cheers,
>> Anton
>>
>
>