[petsc-users] SuperLU_dist issue in 3.7.4
Satish Balay
balay at mcs.anl.gov
Mon Oct 10 12:12:48 CDT 2016
Is this test code valgrind clean?
Satish
On Mon, 10 Oct 2016, Kong, Fande wrote:
> I am working on reproducing the behavior. I am having a hard time reproducing
> it because it behaves randomly. Two types of messages show up:
>
> (1) Segmentation fault 11.
>
> (2) On entry to DGEMM parameter number 10 had an illegal value
>
>
> If we use a debugger, this code always runs fine.
>
> PS, Anton, do you have a pure petsc code to reproduce this?
>
>
> Fande,
>
> On Mon, Oct 10, 2016 at 10:13 AM, Xiaoye S. Li <xsli at lbl.gov> wrote:
>
> > Which version of superlu_dist does this capture? I looked at the
> > original error log; it pointed to pdgssvx: line 161. But that line is in a
> > comment block, not in the program code.
> >
> > Sherry
> >
> >
> > On Mon, Oct 10, 2016 at 7:27 AM, Anton Popov <popov at uni-mainz.de> wrote:
> >
> >>
> >>
> >> On 10/07/2016 05:23 PM, Satish Balay wrote:
> >>
> >>> On Fri, 7 Oct 2016, Kong, Fande wrote:
> >>>
> >>> On Fri, Oct 7, 2016 at 9:04 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> >>>>
> >>>> On Fri, 7 Oct 2016, Anton Popov wrote:
> >>>>>
> >>>>>> Hi guys,
> >>>>>>
> >>>>>> is there any news about fixing the buggy behavior of SuperLU_DIST,
> >>>>>> exactly what is described here:
> >>>>>>
> >>>>>> http://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026802.html ?
> >>>>>>
> >>>>>> I'm using 3.7.4 and still get SEGV in pdgssvx routine. Everything
> >>>>>> works fine with 3.5.4.
> >>>>>>
> >>>>>> Do I still have to stick to maint branch, and what are the chances for
> >>>>>> these fixes to be included in 3.7.5?
> >>>>>>
> >>>>> 3.7.4 is off maint branch [as of a week ago]. So if you are seeing
> >>>>> issues with it - it's best to debug and figure out the cause.
> >>>>>
> >>>> This bug is indeed inside of superlu_dist, and we started having this
> >>>> issue from PETSc-3.6.x. I think superlu_dist developers should have fixed
> >>>> this bug. We forgot to update superlu_dist?? This is not a thing users
> >>>> could debug and fix.
> >>>>
> >>>> I have many people in INL suffering from this issue, and they have to
> >>>> stay with PETSc-3.5.4 to use superlu_dist.
> >>>>
> >>> To verify if the bug is fixed in latest superlu_dist - you can try
> >>> [assuming you have git - either from petsc-3.7/maint/master]:
> >>>
> >>> --download-superlu_dist --download-superlu_dist-commit=origin/maint
> >>>
> >>>
> >>> Satish
> >>>
> >> Hi Satish,
> >> I did this:
> >>
> >> git clone -b maint https://bitbucket.org/petsc/petsc.git petsc
> >>
> >> --download-superlu_dist
> >> --download-superlu_dist-commit=origin/maint (not sure this is needed,
> >> since I'm already in maint)
> >>
> >> The problem is still there.
> >>
> >> Cheers,
> >> Anton
> >>
> >
> >
>
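
A note on message (2) in the report quoted above. In the reference BLAS, the DGEMM argument order is TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC, so "parameter number 10" is LDB, the leading dimension of B. The Python sketch below mirrors the reference implementation's argument checks to show when that message would be triggered. It assumes the BLAS linked into the failing build follows the reference checks (a vendor BLAS may differ), and the function name is hypothetical, chosen only for illustration.

```python
def dgemm_first_illegal_param(transa, transb, m, n, k, lda, ldb, ldc):
    """Return the 1-based position of the first illegal DGEMM argument, or 0.

    Sketch of the argument validation in the reference BLAS DGEMM, whose
    argument order is:
    TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC.
    """
    ta = transa.upper()
    tb = transb.upper()

    # Row counts of A and B as stored in memory (before any transpose).
    nrowa = m if ta == "N" else k
    nrowb = k if tb == "N" else n

    if ta not in ("N", "T", "C"):
        return 1
    if tb not in ("N", "T", "C"):
        return 2
    if m < 0:
        return 3
    if n < 0:
        return 4
    if k < 0:
        return 5
    if lda < max(1, nrowa):
        return 8
    if ldb < max(1, nrowb):
        return 10  # "parameter number 10 had an illegal value"
    if ldc < max(1, m):
        return 13
    return 0


# A 4x5 times 5x3 product where LDB is smaller than the 5 rows of B.
print(dgemm_first_illegal_param("N", "N", 4, 3, 5, lda=4, ldb=0, ldc=4))
# The same call with a valid LDB passes every check.
print(dgemm_first_illegal_param("N", "N", 4, 3, 5, lda=4, ldb=5, ldc=4))
```

Under these assumptions, the message means DGEMM was called with an LDB smaller than the number of stored rows of B. A value that is wrong only on some runs would be consistent with uninitialized or corrupted memory, but that is a guess and not something this thread establishes.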