[petsc-users] SuperLU_dist issue in 3.7.4
Satish Balay
balay at mcs.anl.gov
Fri Oct 21 17:16:47 CDT 2016
The issue with this test code is that it calls MatLoad() twice [on the
same object, without destroying it]. Not sure if that's supposed to
work.
Satish
On Fri, 21 Oct 2016, Hong wrote:
> I can reproduce the error on a Linux machine with petsc-maint. It crashes
> at the 2nd solve, on both processors:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007f051dc835bd in pdgsequ (A=0x1563910, r=0x176dfe0, c=0x178f7f0,
> rowcnd=0x7fffcb8dab30, colcnd=0x7fffcb8dab38, amax=0x7fffcb8dab40,
> info=0x7fffcb8dab4c, grid=0x1563858)
> at /sandbox/hzhang/petsc/arch-linux-gcc-gfortran/externalpackages/git.superlu_dist/SRC/pdgsequ.c:182
> 182 c[jcol] = SUPERLU_MAX( c[jcol], fabs(Aval[j]) * r[irow] );
>
> The version of superlu_dist:
> commit 0b5369f304507f1c7904a913f4c0c86777a60639
> Author: Xiaoye Li <xsli at lbl.gov>
> Date: Thu May 26 11:33:19 2016 -0700
>
> rename 'struct pair' to 'struct superlu_pair'.
>
> Hong
>
> On Fri, Oct 21, 2016 at 5:36 AM, Anton Popov <popov at uni-mainz.de> wrote:
>
> >
> > On 10/19/2016 05:22 PM, Anton Popov wrote:
> >
> > I looked at each item valgrind complained about in your email dated Oct. 11.
> > Those reports are really superficial; I don't see anything wrong with
> > the lines singled out (mostly uninitialized variables). I did a few
> > tests with the latest version on GitHub, and all went fine.
> >
> > Perhaps you can print the matrix that caused the problem, so I can run
> > with your matrix.
> >
> > Sherry
> >
> > Hi Sherry,
> >
> > I finally figured out a minimalistic setup (attached) that reproduces the
> > problem.
> >
> > I use petsc-maint:
> >
> > git clone -b maint https://bitbucket.org/petsc/petsc.git
> >
> > and configure it in debug mode, without optimization, using the options:
> >
> > --download-superlu_dist=1 \
> > --download-superlu_dist-commit=origin/maint \
> >
> > Compile the test, assuming PETSC_DIR points to the described petsc
> > installation:
> >
> > make ex16
> >
> > Run with:
> >
> > mpirun -n 2 ./ex16 -f binaryoutput -pc_type lu \
> >     -pc_factor_mat_solver_package superlu_dist
> >
> > The matrix partitioning between the processors will be exactly the same as
> > in our code (hard-coded).
> >
> > I factorize the same matrix twice with the same PC object. Remarkably, it
> > runs fine the first time, but fails the second.
> >
> > Thank you very much for looking into this problem.
> >
> > Cheers,
> > Anton
> >
>