[petsc-users] Problem on LU factorization

Satish Balay balay at mcs.anl.gov
Tue Jan 25 23:19:27 CST 2011


I don't see anything obviously wrong with this build.

I guess the other thing to do is to build a debug version on the machine
and run it in a debugger to locate the problem. [I believe there is a
way to debug on BG/L.]
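
A minimal sketch of how the posted option list might be turned into a debug
configuration, assuming the configure_options list from the script quoted
below. The helper name make_debug_options and the "-debug" arch suffix are
hypothetical, not part of PETSc; only '--with-debugging=1' is an actual PETSc
configure option. Swapping the optimization level for '-g' is an assumption
about what the XL compilers on this machine need.

```python
# Hypothetical helper (not part of PETSc): derive a debug variant of a
# PETSc configure option list such as the one in bgl-ibm-goto_lapack.py.

def make_debug_options(options, arch_suffix="-debug"):
    """Return a new list with debugging enabled; the input is not modified."""
    debug = []
    for opt in options:
        key, sep, value = opt.partition("=")
        if key == "--with-debugging":
            # Turn on PETSc's debug build.
            debug.append("--with-debugging=1")
        elif key in ("-COPTFLAGS", "-CXXOPTFLAGS", "-FOPTFLAGS"):
            # Assumption: drop the -O<level> flag, keep the architecture
            # flags, and add -g so the debugger can see symbols.
            kept = [f for f in value.split() if not f.startswith("-O")]
            debug.append(key + "=" + " ".join(["-g"] + kept))
        elif key == "-PETSC_ARCH":
            # Use a separate arch name so the optimized build is not overwritten.
            debug.append(key + "=" + value + arch_suffix)
        else:
            debug.append(opt)
    return debug


if __name__ == "__main__":
    sample = [
        "-COPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1",
        "--with-debugging=0",
        "-PETSC_ARCH=bgl-ibm-goto-O3_440d",
    ]
    for line in make_debug_options(sample):
        print(line)
```

The resulting list would be passed to configure.petsc_configure() in place of
the original one, followed by the same conftest/reconfigure steps.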

Satish

On Tue, 25 Jan 2011, Rongliang Chen wrote:

> Hi Balay,
> 
> Thank you for your reply.
> I have checked my code with valgrind on my own computer, and it reports no
> problems.
> But when I run my code on the IBM Blue Gene/L with
> "-sub_pc_factor_mat_solver_package superlu", it fails with the segmentation
> violation quoted below.
> Since valgrind is not available on the IBM Blue Gene/L, I cannot test my
> code with valgrind there.
> 
> But if I use PETSc's default LU factorization, there is no such problem.
> So I suspect that there is some problem with my PETSc installation.
> Can you help me check whether my installation is correct?
> The installation details follow, and the configure.log and make.log are
> attached.
> 
> Installing PETSc on IBM Blue Gene/L:
> 
> 1. patch -p0 < /contrib/bgl/petsc/petsc-3.0.0-p4/petsc-3.0.0-p4.patch
> 2. ./config/bgl-ibm-goto_lapack.py, where "bgl-ibm-goto_lapack.py" is:
> ******************************************************************************************
> #!/usr/bin/env python
> #
> # BGL has broken 'libc' dependencies. The option 'LIBS' is used to
> # work around this problem.
> #
> # LIBS="-lc -lnss_files -lnss_dns -lresolv"
> #
> # Another workaround is to modify mpicc/mpif77 scripts and make them
> # link with the corresponding compilers, and these additional
> # libraries. The following tarball has the modified compiler scripts
> #
> # ftp://ftp.mcs.anl.gov/pub/petsc/tmp/petsc-bgl-tools.tar.gz
> #
> configure_options = [
>   '--with-cc=/contrib/bgl/bin/mpxlc',
>   '--with-cxx=/contrib/bgl/bin/mpxlC',
>   '--with-fc=/contrib/bgl/bin/mpxlf -qnosave',
>   '--with-mpi-dir=/bgl/BlueLight/ppcfloor/bglsys',  # required by BLACS to get mpif.h
>   '--with-lapack-lib=/contrib/bgl/lib/liblapack440.a',
>   '--with-blas-lib=/contrib/bgl/lib/libblas440.a',
> #  '--with-blas-lapack-lib=-L/contrib/bgl/lib -llapack440 -L/contrib/bgl/lib -lgoto',
> 
>   '--with-is-color-value-type=short',
>   '--with-shared=0',
> 
>   '-COPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
>   '-CXXOPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
>   '-FOPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
>   '--with-debugging=0',
> 
>   # the following option gets automatically enabled on BGL/with IBM compilers.
>   # '--with-fortran-kernels=bgl'
> 
>   '--with-x=0',
>   '--with-x11=0',
>   '--with-batch=1',
>   '--with-memcmp-ok',
>   '--sizeof-char=1',
>   '--sizeof-void-p=4',
>   '--sizeof-short=2',
>   '--sizeof-int=4',
>   '--sizeof-long=4',
>   '--sizeof-size-t=4',
>   '--sizeof-long-long=8',
>   '--sizeof-float=4',
>   '--sizeof-double=8',
>   '--bits-per-byte=8',
>   '--sizeof-MPI-Comm=4',
>   '--sizeof-MPI-Fint=4',
>   '--have-mpi-long-double=1',
> 
> 
>   '--download-superlu=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/superlu_4.0-March_7_2010.tar.gz',
>   '--download-superlu_dist=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/SuperLU_DIST_2.4-hg-v2.tar.gz',
>   '--download-parmetis=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/ParMetis-dev-p3.tar.gz',
>   '--download-scalapack=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/scalapack.tgz',
>   '--download-blacs=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/blacs-dev.tar.gz',
>   '--download-f-blas-lapack=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/fblaslapack-3.1.1.tar.gz',
>   '--download-mumps=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/MUMPS_4.9.2.tar.gz',
> 
> #  '--download-f-blas-lapack=1',
> #  '--download-hypre=1',
> #  '--download-spooles=1',
> #  '--download-superlu=1',
> #  '--download-parmetis=1',
> #  '--download-superlu_dist=1',
> #  '--download-blacs=1',
> 
>    '-PETSC_ARCH=bgl-ibm-goto-O3_440d'
>   ]
> 
> if __name__ == '__main__':
>   import sys,os
>   sys.path.insert(0,os.path.abspath('config'))
>   import configure
>   configure.petsc_configure(configure_options)
> 
> # Extra options used for testing locally
> test_options = []
> ************************************************************************
> 3. cqsub -n 1 -t 20 -O conftest -q debug ./conftest
> 4. ./reconfigure.py
> 5. make all
> 
> Thank you!
> 
> Best,
> 
> Rongliang
> 
> 
> ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Mon, 24 Jan 2011 16:06:22 -0600 (CST)
> > From: Satish Balay <balay at mcs.anl.gov>
> > Subject: Re: [petsc-users] Problem on LU factorization
> > To: PETSc users list <petsc-users at mcs.anl.gov>
> > Message-ID:
> >        <alpine.LFD.2.02.1101241605370.2510 at localhost6.localdomain6>
> > Content-Type: TEXT/PLAIN; charset=US-ASCII
> >
> > On Mon, 24 Jan 2011, Matthew Knepley wrote:
> >
> > > > When I use superlu with command line "-sub_pc_factor_mat_solver_package
> > > > superlu", it said
> > >
> > > > "[43]PETSC ERROR: ------------------------------------------------------------------------
> > > > [43]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> > > > [43]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > > > [43]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
> > > > [43]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> > > > [43]PETSC ERROR: likely location of problem given in stack below
> > > > [43]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> > > > [43]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> > > > [43]PETSC ERROR:       INSTEAD the line number of the start of the function
> > > > [43]PETSC ERROR:       is given.
> > > > [43]PETSC ERROR: [43] MatLUFactorNumeric_SuperLU line 121 src/mat/impls/aij/seq/superlu/superlu.c
> > > > [43]PETSC ERROR: [43] MatLUFactorNumeric line 2575 src/mat/interface/matrix.c
> > > > ............................
> > > >  "
> > > >
> > >
> > > Please confirm that you have the latest patch level. If so, send the
> > matrix
> > > in PETSc binary format to petsc-maint at mcs.anl.gov
> > > along with the precise solver options and output of -ksp_view.
> >
> > More likely there is memory corruption somewhere - should run this
> > code with valgrind to weed out such issues..
> >
> > Satish
> >
> >
> > ------------------------------
> >
> >
> 


