[petsc-users] Setting up MUMPS in PETSc

Jinquan Zhong jzhong at scsolutions.com
Tue Oct 23 17:21:03 CDT 2012


That is new for me.  What would you suggest, Matt?

Thanks,

Jinquan

From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley
Sent: Tuesday, October 23, 2012 3:19 PM
To: PETSc users list
Subject: Re: [petsc-users] Setting up MUMPS in PETSc

On Tue, Oct 23, 2012 at 6:15 PM, Jinquan Zhong <jzhong at scsolutions.com> wrote:
Hong and Jed,

For the KSP solver, what kind of PC is most appropriate?  I tested

ierr = PCSetType(pc,PCREDUNDANT);CHKERRQ(ierr);

It worked for small complex double-precision matrices but not for the big ones.  Here is my setup:

Did you read the documentation on this PC? At all? That it solves the entire system on each process?

    Matt



            KSP            ksp;      /* linear solver context */
            PC       pc;

            ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
            ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);

            ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
            ierr = PCSetType(pc,PCREDUNDANT);CHKERRQ(ierr);
            ierr = KSPSetTolerances(ksp,1.e-12,1.e-12,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);

            ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
            ierr = KSPSolve(ksp,b,*x);CHKERRQ(ierr);

Do you see any problem?

Thanks,

Jinquan
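
For reference, a minimal sketch of the setup above switched from PCREDUNDANT to a parallel LU factorization delegated to MUMPS, assuming a PETSc build configured with MUMPS support (the calling sequence follows the petsc-3.3 API used elsewhere in this thread):

            KSP            ksp;      /* linear solver context */
            PC             pc;

            ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
            ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);

            /* a single application of the LU "preconditioner" is the direct solve */
            ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
            ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
            ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
            ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr);

            ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* -mat_mumps_icntl_* options still apply */
            ierr = KSPSolve(ksp,b,*x);CHKERRQ(ierr);

The same choice can be made purely at runtime with -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps, as in Hong's examples below.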




From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Hong Zhang
Sent: Tuesday, October 23, 2012 12:03 PM
To: PETSc users list
Subject: Re: [petsc-users] Setting up MUMPS in PETSc

Jinquan:

I have a question on how to use mumps properly in PETSc.  It appears that I didn't set up mumps right.  I followed the example in
http://www.mcs.anl.gov/petsc/petsc-dev/src/mat/examples/tests/ex125.c.html

This example is for our internal testing and is not intended for production runs.
I suggest using the high-level PETSc KSP solver, which provides more flexibility.

to set up my program.  Here is my situation using the default setting


        PetscInt icntl_7 = 5;
        MatMumpsSetIcntl(F,7,icntl_7);

in the example ex125.c:
Using the KSP solver, the mumps options can be chosen at runtime,
e.g.
~petsc/src/ksp/ksp/examples/tutorials/ex2.c:
mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_7 5
Norm of error 1.49777e-15 iterations 1
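
If ICNTL(7) needs to be set in code rather than at runtime, a hedged sketch of the KSP-level equivalent (this assumes pc has already been set to PCLU with MUMPS as its solver package, as in the sketch earlier in this thread):

        Mat F;
        ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr);   /* creates the MUMPS factor matrix */
        ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);
        ierr = MatMumpsSetIcntl(F,7,5);CHKERRQ(ierr);             /* ICNTL(7)=5 requests the METIS ordering */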

1.       The program works fine on all small models (sparse matrices on the order of m = 894, 1097, and 31k, with a dense matrix included in the sparse matrix). The residuals are at the magnitude of 10^-3.
With a direct solver, a residual of 10^-3 indicates your matrix might be very ill-conditioned or close to singular. What do you get for |R|/|rhs| = ?
Is this the reason you want to use a direct solver instead of an iterative one?
What do you mean by "31k with a dense matrix included in the sparse matrix"?
How sparse is your matrix, e.g., nnz(A)/(m*m) = ?
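
One way to report the fill ratio, a minimal sketch using MatGetInfo (running with -ksp_monitor_true_residual also prints |R|/|rhs| at each iteration):

        MatInfo  info;
        PetscInt m,n;
        ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
        ierr = MatGetInfo(A,MAT_GLOBAL_SUM,&info);CHKERRQ(ierr);
        ierr = PetscPrintf(PETSC_COMM_WORLD,"nnz(A)/(m*m) = %g\n",info.nz_used/((double)m*(double)n));CHKERRQ(ierr);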

2.       The program has some issues on the medium-size problem (m=460k, with a dense matrix on the order of n=30k included in the sparse matrix).  The full sparse matrix is sized at 17 GB.

a.       We used another software to generate the sparse matrix using 144 cores:

        i.      When I used the resources from 144 cores (12 nodes with 48 GB/node), it could not provide the solution.  There was a complaint about a memory violation.

        ii.     When I used the resources from 432 cores (36 nodes with 48 GB/node), it provided the solution.
Direct solvers are notoriously memory-consuming. It seems your matrix is quite dense, requiring more memory than 144 cores could provide.
What is this "another software"?

b.      We used another software to generate the same sparse matrix using 576 cores:

        i.      When I used the resources from 576 cores (48 nodes with 48 GB/node), it could not provide the solution.  There was a complaint about a memory violation.

        ii.     When I used the resources from 1152 cores (96 nodes with 48 GB/node), it provided the solution.
Both a and b seem to indicate that you can use a small number of cores to generate the original matrix A, but need more cores (resources) to solve A x = b.
This is because A = LU: the factored matrices L and U require far more memory than the original A. Run your code using KSP with your matrix data and the option -ksp_view,
e.g., petsc/src/ksp/ksp/examples/tutorials/ex10.c
mpiexec -n 2 ./ex10 -f <matrix binary data file> -pc_type lu -pc_factor_mat_solver_package mumps -ksp_view
...
then you'll see memory info provided by mumps.
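
To produce the binary file that ex10 reads with -f, a minimal sketch (the file name "A.dat" is only a placeholder):

        PetscViewer viewer;
        ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"A.dat",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
        ierr = MatView(A,viewer);CHKERRQ(ierr);
        ierr = VecView(b,viewer);CHKERRQ(ierr);   /* ex10 can also read a rhs stored after the matrix */
        ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);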

3.       The program could not solve the large-size problem (m=640k, with a dense matrix on the order of n=178k included in the sparse matrix).  The full sparse matrix is sized at 511 GB.

a.       We used another software to generate the sparse matrix using 900 cores:

        i.      When I used the resources from 900 cores (75 nodes with 48 GB/node), it could not provide the solution.  There was a complaint about a memory violation.

        ii.     When I used the resources from 2400 cores (200 nodes with 48 GB/node), it STILL COULD NOT provide the solution.
Your computer system and software have limits. Find the answers to your 'medium size' problems first.

My confusion starts with the medium-size problem:

*         It seems something was not right in the default settings of ex125.c for these problems.

*         I got the info that METIS was used instead of ParMETIS in solving these problems.
By default, the petsc-mumps interface uses sequential symbolic factorization (the analysis phase). Use '-mat_mumps_icntl_28 2' to switch to parallel analysis.
I tested it, but it seems parmetis is still not used. Check the mumps manual
or contact the mumps developers on how to use parmetis.
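
One combination worth trying (please verify it against the MUMPS manual for your version): ICNTL(28)=2 selects parallel analysis, and ICNTL(29)=2 requests ParMETIS as the parallel ordering tool, e.g.,

mpiexec -n 2 ./ex10 -f <matrix binary data file> -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 -ksp_view

If your PETSc build does not expose -mat_mumps_icntl_29, the same value can be set in code with MatMumpsSetIcntl(F,29,2).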

*         Furthermore, it appears that there was an unreasonable demand on the solver even for the medium-size problem.

*         I suspect one rank was trying to collect all data from other ranks.
Yes, the analysis is sequential, and the rhs vector must be on the host :-(
In general, direct solvers cannot be scaled to a very large number of cores.

What other additional settings are needed for mumps so that it can deal with medium- and large-size problems?

Run your code with the option '-help | grep mumps' and experiment with the
various options, e.g., matrix orderings, nonzero fill, etc.
You may also try superlu_dist. Good luck!
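
For comparison, superlu_dist is selected through the same KSP options; only the package name changes, e.g.,

mpiexec -n 2 ./ex10 -f <matrix binary data file> -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view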

Do you guys have similar experience with that?
I personally have never used mumps or superlu_dist for such large matrices.
Consult the mumps developers.

Hong

Thanks,

Jinquan





--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

