[petsc-users] Error using MUMPS to solve large linear system

Samar Khatiwala spk at ldeo.columbia.edu
Mon Feb 24 09:42:07 CST 2014


I'm trying to solve a linear system with MUMPS and keep getting an error that ends with:
 On return from DMUMPS, INFOG(1)=        -100
 On return from DMUMPS, INFOG(2)=      -32766

I've looked at the MUMPS documentation but can't figure out what that means. This is a large (2346346 x 2346346) sparse
matrix (read from file), and the code works fine on a (much) smaller one, leading me to think this is memory-related and
the problem is simply too big to solve with a sparse direct solver. Throwing more CPUs at the problem doesn't make it go
away or change the above error.
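In case it's relevant, these are the memory-related MUMPS knobs I've been experimenting with via PETSc's runtime options (the specific values below are just guesses on my part; ICNTL(14) is the percentage margin added to MUMPS's estimated workspace, default 20, and ICNTL(23) is the per-process working memory cap in MB):

```sh
# Run with MUMPS as the LU solver, a larger workspace margin, and an
# explicit per-process memory cap (values are illustrative only):
mpiexec -n 256 ./linearsolve \
    -pc_factor_mat_solver_package mumps \
    -mat_mumps_icntl_14 50 \
    -mat_mumps_icntl_23 2000
```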

This is with PETSc 3.4.3 on Yellowstone. The standard error looks like this:

[161]PETSC ERROR: --------------------- Error Message ------------------------------------
[161]PETSC ERROR: Error in external library!
[161]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-1, INFO(2)=48
[161]PETSC ERROR: ------------------------------------------------------------------------
[161]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 
[161]PETSC ERROR: See docs/changes/index.html for recent updates.
[161]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[161]PETSC ERROR: See docs/index.html for manual pages.
[161]PETSC ERROR: ------------------------------------------------------------------------
[161]PETSC ERROR: ./linearsolve on a arch-linux2-c-opt named ys0805 by spk Mon Feb 24 08:20:15 2014
[161]PETSC ERROR: Libraries linked from /glade/u/home/spk/petsc-3.4.3/arch-linux2-c-opt/lib
[161]PETSC ERROR: Configure run at Sun Feb 16 05:17:20 2014
[161]PETSC ERROR: Configure options --with-mpi-dir=/ncar/opt/intel/ --with-debugging=0 --with-shared-libraries=0 --download-superlu=1 --download-superlu_dist=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-parmetis=1 --download-metis --with-blas-lapack-dir=/ncar/opt/intel/ FFLAGS="-convert big_endian -assume byterecl"
[161]PETSC ERROR: ------------------------------------------------------------------------
[161]PETSC ERROR: MatFactorNumeric_MUMPS() line 722 in /glade/u/home/spk/petsc-3.4.3/src/mat/impls/aij/mpi/mumps/mumps.c
[161]PETSC ERROR: MatLUFactorNumeric() line 2889 in /glade/u/home/spk/petsc-3.4.3/src/mat/interface/matrix.c
[168]PETSC ERROR: KSPSetUp() line 278 in /glade/u/home/spk/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[168]PETSC ERROR: KSPSolve() line 399 in /glade/u/home/spk/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[168]PETSC ERROR: main() line 123 in linearsolve.c
Abort(76) on node 168 (rank 168 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, 76) - process 168
[161]PETSC ERROR: PCSetUp_LU() line 152 in /glade/u/home/spk/petsc-3.4.3/src/ksp/pc/impls/factor/lu/lu.c
[161]PETSC ERROR: PCSetUp() line 890 in /glade/u/home/spk/petsc-3.4.3/src/ksp/pc/interface/precon.c
[161]PETSC ERROR: KSPSetUp() line 278 in /glade/u/home/spk/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[161]PETSC ERROR: KSPSolve() line 399 in /glade/u/home/spk/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[161]PETSC ERROR: main() line 123 in linearsolve.c

etc etc. I can tell the above error has something to do with processor 48 (INFO(1)=-1 means a local error was raised, with INFO(2) giving the rank), but I can't make sense of the earlier INFOG(1)/INFOG(2) values.

The full output, enabled with -mat_mumps_icntl_4 3, is in the attached file. Any hints as to what could be causing this
error would be very much appreciated.

Thanks very much!


-------------- next part --------------
A non-text attachment was scrubbed...
Name: log.gz
Type: application/x-gzip
Size: 2207 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140224/4618b436/attachment-0001.bin>
