[petsc-users] run direct linear solver in parallel

Max Ng maxwindiff at gmail.com
Mon Dec 13 09:12:46 CST 2010


Hi,

The error seems to be trapped by MPICH2's assertions. Is there some way to
propagate them to debuggers (gdb, whatever)?
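
(I may try PETSc's -start_in_debugger option for this; as far as I understand, it
launches each rank under a debugger at startup, so the abort from the MPICH2
assertion should drop me into gdb on the offending rank. On a Unix-like setup
that would be roughly

    mpiexec -n 2 ./foo.out -start_in_debugger gdb -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package spooles

There is also -on_error_attach_debugger, though I'm not sure an abort raised
inside MPICH2 goes through PETSc's error handler at all.)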

Yep, I think I'll try SuperLU_dist again then.
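
If I read the docs right, switching my setup quoted below to SuperLU_dist should
only need the solver-package line changed; a minimal sketch, assuming the
PETSc 3.1 package name MAT_SOLVER_SUPERLU_DIST and the same ksp/pc variables as
in my code:

            KSPGetPC(ksp, &pc);
            PCSetType(pc, PCLU);
            // use SuperLU_dist instead of SPOOLES for the parallel LU factorization
            PCFactorSetMatSolverPackage(pc, MAT_SOLVER_SUPERLU_DIST);
            KSPSetUp(ksp);

(or, equivalently, -pc_factor_mat_solver_package superlu_dist at run time).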

Thanks for your advice!

Max

On Mon, Dec 13, 2010 at 11:00 PM, Hong Zhang <hzhang at mcs.anl.gov> wrote:

> Max,
> Does superlu_dist crash?
> Spooles has not been supported by its developers for more than 10 years,
> although it can be faster for small test problems.
>
> Mumps is a good and robust direct solver we usually recommend, but it
> requires f90.
>
> Hong
>
> On Mon, Dec 13, 2010 at 8:34 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >   The problem is not in PETSc. Run in the debugger and see exactly where
> > this memcpy() overlap happens and if it can be fixed.
> >
> >  Barry
> >
> >
> > On Dec 13, 2010, at 4:30 AM, Max Ng wrote:
> >
> >> Hi,
> >>
> >> I am having a similar problem, and I'm using PETSc 3.1-p6. I wish to use
> >> SPOOLES because I need to build on Windows with VC++ (and without a Fortran
> >> compiler), and in my tests SPOOLES somehow performs better than SuperLU.
> >>
> >> My program runs correctly with mpiexec -n 1. When I try mpiexec -n 2, I
> >> get this error:
> >>
> >> Assertion failed in file helper_fns.c at line 337: 0
> >> memcpy argument memory ranges overlap, dst_=0x972ef84 src_=0x972ef84
> len_=4
> >>
> >> internal ABORT - process 1
> >> Assertion failed in file helper_fns.c at line 337: 0
> >> memcpy argument memory ranges overlap, dst_=0x90c4018 src_=0x90c4018
> len_=4
> >>
> >> internal ABORT - process 0
> >> rank 1 in job 113  vm1_57881   caused collective abort of all ranks
> >>   exit status of rank 1: killed by signal 9
> >>
> >> Here is the source code:
> >>
> >>             // N = 40000, n = 20000, nnz = 9
> >>             //
> >>             MatCreate(comm, &mat);
> >>             MatSetType(mat, MATAIJ);
> >>             MatSetSizes(mat, n, n, N, N);
> >>             MatSeqAIJSetPreallocation(mat, nnz, PETSC_NULL);
> >>             MatMPIAIJSetPreallocation(mat, nnz, PETSC_NULL, nnz, PETSC_NULL);
> >>
> >>             // some code to fill the matrix values
> >>             // ...
> >>
> >>             KSPCreate(comm, &ksp);
> >>             KSPSetOperators(ksp, mat, mat, DIFFERENT_NONZERO_PATTERN);
> >>             KSPSetType(ksp, KSPPREONLY);
> >>
> >>             KSPGetPC(ksp, &pc);
> >>             PCSetType(pc, PCLU);
> >>             PCFactorSetMatSolverPackage(pc, MAT_SOLVER_SPOOLES);
> >>
> >>             KSPSetUp(ksp);
> >>
> >> It crashes at the KSPSetUp() statement.
> >>
> >> Do you have any ideas? Thanks in advance!
> >>
> >> Max Ng
> >>
> >> On Dec 3, 2010, at 4:19 PM, Xiangdong Liang wrote:
> >>
> >>> > Hi everyone,
> >>> >
> >>> > I am wondering how I can run the direct solver in parallel. I can run
> >>> > my program in a single processor with direct linear solver by
> >>> >
> >>> > ./foo.out -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package spooles
> >>> >
> >>> > However, when I try to run it with mpi:
> >>> >
> >>> > mpirun.openmpi -np 2 ./foo.out -ksp_type preonly -pc_type lu
> >>> > -pc_factor_mat_solver_package spooles
> >>> >
> >>> > I got error like this:
> >>> >
> >>> > [0]PETSC ERROR: --------------------- Error Message
> >>> > ------------------------------------
> >>> > [0]PETSC ERROR: No support for this operation for this object type!
> >>> > [0]PETSC ERROR: Matrix type mpiaij  symbolic LU!
> >>> >
> >>> > [0]PETSC ERROR: Libraries linked from
> >>> > /home/hazelsct/petsc-2.3.3/lib/linux-gnu-c-opt
> >>> > [0]PETSC ERROR: Configure run at Mon Jun 30 14:37:52 2008
> >>> > [0]PETSC ERROR: Configure options --with-shared --with-dynamic
> >>> > --with-debugging=0 --useThreads 0 --with-mpi-dir=/usr/lib/openmpi
> >>> > --with-mpi-shared=1 --with-blas-lib=-lblas --with-lapack-lib=-llapack
> >>> > --with-umfpack=1 --with-umfpack-include=/usr/include/suitesparse
> >>> > --with-umfpack-lib="[/usr/lib/libumfpack.so,/usr/lib/libamd.so]"
> >>> > --with-superlu=1 --with-superlu-include=/usr/include/superlu
> >>> > --with-superlu-lib=/usr/lib/libsuperlu.so --with-spooles=1
> >>> > --with-spooles-include=/usr/include/spooles
> >>> > --with-spooles-lib=/usr/lib/libspooles.so --with-hypre=1
> >>> > --with-hypre-dir=/usr --with-babel=1 --with-babel-dir=/usr
> >>> > [0]PETSC ERROR:
> >>> >
> ------------------------------------------------------------------------
> >>> > [0]PETSC ERROR: MatLUFactorSymbolic() line 2174 in
> src/mat/interface/matrix.c
> >>> > [0]PETSC ERROR: PCSetUp_LU() line 257 in
> src/ksp/pc/impls/factor/lu/lu.c
> >>> > -------------------------------------------------------
> >>> >
> >>> > Could you tell me where I am going wrong? I appreciate your help.
> >>> >
> >>> > Xiangdong
> >>
> >
> >
>

