[petsc-users] SuperLU_dist issue in 3.7.4

Hong hzhang at mcs.anl.gov
Fri Oct 21 11:17:43 CDT 2016


I can reproduce the error on a Linux machine with petsc-maint. It crashes
at the 2nd solve, on both processors:

Program received signal SIGSEGV, Segmentation fault.
0x00007f051dc835bd in pdgsequ (A=0x1563910, r=0x176dfe0, c=0x178f7f0,
    rowcnd=0x7fffcb8dab30, colcnd=0x7fffcb8dab38, amax=0x7fffcb8dab40,
    info=0x7fffcb8dab4c, grid=0x1563858)
    at /sandbox/hzhang/petsc/arch-linux-gcc-gfortran/externalpackages/git.superlu_dist/SRC/pdgsequ.c:182
182                 c[jcol] = SUPERLU_MAX( c[jcol], fabs(Aval[j]) * r[irow] );
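
For readers unfamiliar with the faulting line, here is a minimal sketch of the kind of computation it performs: for each column, take the maximum of |a_ij| * r[i] over the locally stored entries. This is not the SuperLU_DIST source. The function name column_scale_maxima, the compressed-row arrays rowptr/colind, and the bounds check are all assumptions made for illustration, and the trace above does not by itself establish which of the pointers involved was invalid.

```c
#include <math.h>

/* Hypothetical helper (not part of SuperLU_DIST): for a block of m_loc rows
 * stored in compressed-row form, compute c[j] = max_i |a_ij| * r[i].
 * Unlike the line in the trace, every column index is checked before use.
 * Returns 0 on success, -1 if a column index lies outside [0, ncols). */
int column_scale_maxima(int m_loc, int ncols,
                        const int *rowptr, const int *colind,
                        const double *aval, const double *r, double *c)
{
    /* Start every column maximum at zero. */
    for (int j = 0; j < ncols; ++j)
        c[j] = 0.0;

    for (int i = 0; i < m_loc; ++i) {
        /* Entries of local row i occupy positions rowptr[i] .. rowptr[i+1]-1. */
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k) {
            int jcol = colind[k];
            if (jcol < 0 || jcol >= ncols)
                return -1; /* stale or corrupt index: refuse to write to c */

            double v = fabs(aval[k]) * r[i];
            if (v > c[jcol])
                c[jcol] = v;
        }
    }
    return 0;
}
```

A loop of this shape reads and writes through several arrays with indices taken from the matrix structure, so a fault on such a line is consistent with any of those arrays or indices being stale by the time of the second factorization. That is only a general observation, not a diagnosis.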

The version of superlu_dist:
commit 0b5369f304507f1c7904a913f4c0c86777a60639
Author: Xiaoye Li <xsli at lbl.gov>
Date:   Thu May 26 11:33:19 2016 -0700

    rename 'struct pair' to 'struct superlu_pair'.

Hong

On Fri, Oct 21, 2016 at 5:36 AM, Anton Popov <popov at uni-mainz.de> wrote:

>
> On 10/19/2016 05:22 PM, Anton Popov wrote:
>
> I looked at each item valgrind complained about in your email dated Oct. 11.
> Those reports are really superficial; I don't see anything wrong with
> the lines singled out (mostly uninitialized variables). I did a few
> tests with the latest version on GitHub; all went fine.
>
> Perhaps you can print the matrix that caused the problem, so that I can
> run with your matrix.
>
> Sherry
>
> Hi Sherry,
>
> I finally figured out a minimalistic setup (attached) that reproduces the
> problem.
>
> I use petsc-maint:
>
> git clone -b maint https://bitbucket.org/petsc/petsc.git
>
> and configure it in debug mode, without optimization, using the options:
>
> --download-superlu_dist=1 \
> --download-superlu_dist-commit=origin/maint \
>
> Compile the test, assuming PETSC_DIR points to the described petsc
> installation:
>
> make ex16
>
> Run with:
>
> mpirun -n 2 ./ex16 -f binaryoutput -pc_type lu -pc_factor_mat_solver_package superlu_dist
>
> The matrix partitioning between the processors is hard-coded to be exactly
> the same as in our code.
>
> I factorize the same matrix twice with the same PC object. Remarkably, it
> runs fine the first time but fails the second.
>
> Thank you very much for looking into this problem.
>
> Cheers,
> Anton
>