[petsc-users] Receiving DIVERGED_PCSETUP_FAILED

Faraz Hussain faraz_hussain at yahoo.com
Wed Jun 22 11:24:05 CDT 2016


My application already has the Intel Pardiso solver built in, but it can only run on a single compute node (no MPI). Using a 3 million x 3 million sparse symmetric matrix as a benchmark:
Pardiso, using all 24 cpus on 1 compute node, factors it in 3 minutes.
MUMPS takes:
24 cpus (1 node):  15 minutes
48 cpus (2 nodes):  9 minutes
72 cpus (3 nodes):  6 minutes

So MUMPS does scale well with more cpus, but it still cannot match the Pardiso solver on one compute node. I am not sure if this is the expected behavior of MUMPS or if I need to spend more time playing with all the parameters in MUMPS...

      From: Hong <hzhang at mcs.anl.gov>
 To: Faraz Hussain <faraz_hussain at yahoo.com> 
Cc: Barry Smith <bsmith at mcs.anl.gov>; "petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
 Sent: Wednesday, June 22, 2016 9:53 AM
 Subject: Re: [petsc-users] Receiving DIVERGED_PCSETUP_FAILED
   
Faraz:

Just an update: I got this to work by rebuilding PETSc with ParMETIS, METIS and PT-Scotch. Then I used these settings for MUMPS:
    icntl = 28; ival = 2;
    ierr = MatMumpsSetIcntl(F,icntl,ival);CHKERRQ(ierr);

    icntl = 29; ival = 1;
    ierr = MatMumpsSetIcntl(F,icntl,ival);CHKERRQ(ierr);

These options make MUMPS use parallel symbolic factorization with PT-Scotch matrix ordering.
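A minimal sketch of where these calls fit, assuming (as in ex52.c) that F is the factored matrix obtained from an LU preconditioner that has been told to use MUMPS; the names ksp and pc are taken from that example:

    PC  pc;
    Mat F;
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);                  /* Cholesky would also be natural for an SPD matrix */
    ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr);
    ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr);   /* create the MUMPS factor matrix F */
    ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);
    ierr = MatMumpsSetIcntl(F,28,2);CHKERRQ(ierr);            /* ICNTL(28)=2: parallel analysis */
    ierr = MatMumpsSetIcntl(F,29,1);CHKERRQ(ierr);            /* ICNTL(29)=1: PT-Scotch ordering */

The same two settings can also be given at runtime with -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 1.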

It still took 4X longer to solve than Intel Pardiso. But after re-configuring PETSc with --with-debugging=0, it ran faster: still slower than Pardiso, but only 2X slower.

I've seen reports that Intel Pardiso is much faster than MUMPS; for example, the SLEPc developer Jose sent me the following.

With mumps:

MatSolve              16 1.0 1.0962e+01
MatLUFactorSym         1 1.0 3.1131e+00
MatLUFactorNum         1 1.0 2.6120e+00

With mkl_pardiso:

MatSolve              16 1.0 6.4163e-01
MatLUFactorSym         1 1.0 2.4772e+00
MatLUFactorNum         1 1.0 8.6419e-01

However, PETSc only interfaces with the sequential mkl_pardiso. Did you get your results in parallel or sequential?
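If it helps for comparison, the sequential mkl_pardiso interface can be selected at runtime with something like the following (a sketch; it assumes PETSc was configured with MKL Pardiso support and that the run uses a single process with a SeqAIJ matrix):

    ./ex12 -pc_type lu -pc_factor_mat_solver_package mkl_pardiso -ksp_converged_reason -log_summary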
Hong




      From: Faraz Hussain <faraz_hussain at yahoo.com>
 To: Barry Smith <bsmith at mcs.anl.gov> 
Cc: "petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
 Sent: Friday, June 10, 2016 5:27 PM
 Subject: Re: [petsc-users] Receiving DIVERGED_PCSETUP_FAILED
   
I think the issue is I need to play more with the "parallel" settings here. 

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERMUMPS.html
The example ex52.c was based on sequential running, so doing mpiexec -np 48 was basically just using one processor. I also never installed ParMETIS, METIS or PT-Scotch. 

Will install and adjust the MUMPS settings and hopefully will get it to converge this weekend!
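A possible configure line for such a rebuild (a sketch; the exact options depend on the local compilers and MPI, and a MUMPS build also needs ScaLAPACK):

    ./configure --with-debugging=0 --download-mumps --download-scalapack \
        --download-metis --download-parmetis --download-ptscotch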

      From: Barry Smith <bsmith at mcs.anl.gov>
 To: Faraz Hussain <faraz_hussain at yahoo.com> 
Cc: "petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
 Sent: Friday, June 10, 2016 4:02 PM
 Subject: Re: [petsc-users] Receiving DIVERGED_PCSETUP_FAILED
  

> On Jun 10, 2016, at 3:56 PM, Faraz Hussain <faraz_hussain at yahoo.com> wrote:
> 
> Thanks for the suggestions. I checked but was not able to find out how to change the mumps row pivot threshold of 1e-06. Maybe I will ask on the mumps user forum.

This might help. 

  ierr = PetscOptionsReal("-mat_mumps_cntl_1","CNTL(1): relative pivoting threshold","None",mumps->id.CNTL(1),&mumps->id.CNTL(1),NULL);CHKERRQ(ierr);
  ierr = PetscOptionsReal("-mat_mumps_cntl_2","CNTL(2): stopping criterion of refinement","None",mumps->id.CNTL(2),&mumps->id.CNTL(2),NULL);CHKERRQ(ierr);
  ierr = PetscOptionsReal("-mat_mumps_cntl_3","CNTL(3): absolute pivoting threshold","None",mumps->id.CNTL(3),&mumps->id.CNTL(3),NULL);CHKERRQ(ierr);
  ierr = PetscOptionsReal("-mat_mumps_cntl_4","CNTL(4): value for static pivoting","None",mumps->id.CNTL(4),&mumps->id.CNTL(4),NULL);CHKERRQ(ierr);

Note I don't know what they mean, so you need to read the MUMPS docs.
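For example, the relative pivoting threshold CNTL(1) can be changed from the command line; the value below is only illustrative, and which CNTL actually corresponds to the 1e-06 threshold printed by ex52.c has to be checked against the MUMPS manual:

    mpiexec -hostfile ./hostfile -np 48 ./ex12 -ksp_converged_reason \
        -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_cntl_1 0.01

(This sketch assumes the MUMPS LU factorization is selected through PCLU, either in the code or with the options shown.)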

> Regarding:
> 
>  > You need to look at the condition number just before GMRES reaches the restart. It has to start all over again at the restart. So what was the estimated condition number at 999 iterations?
> 
> I ran again and the condition number at 999 iterations is: 
> 
> 999 KSP preconditioned resid norm 5.921717188418e-02 true resid norm 5.921717188531e-02 ||r(i)||/||b|| 4.187286380279e-03 
> 999 KSP Residual norm 5.921717188418e-02 % max 1.070338898624e+05 min 1.002755075294e-01 max/min 1.067398136390e+06

Ok, so relatively ill-conditioned matrix. But seemingly not terrible.

  Barry

> 
> 
> From: Barry Smith <bsmith at mcs.anl.gov>
> To: Faraz Hussain <faraz_hussain at yahoo.com> 
> Cc: "petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> Sent: Thursday, June 9, 2016 5:56 PM
> Subject: Re: [petsc-users] Receiving DIVERGED_PCSETUP_FAILED
> 
> 
> > On Jun 9, 2016, at 3:32 PM, Faraz Hussain <faraz_hussain at yahoo.com> wrote:
> > 
> > I have been following ex52.c in ksp/ksp/examples/tutorials to use MUMPS to directly solve Ax=b. My matrix is symmetric and positive definite. I built a small cantilever beam model with a 5000 x 5000 matrix. It solves in 2 seconds and gives the correct answer. But when I use a finer mesh of the cantilever beam with a 3.3 million x 3.3 million matrix, I get the following error:
> > 
> >  Mumps row pivot threshhold = 1e-06
> 
>  Maybe you can change this to get MUMPS to pivot less aggressively. Doing lots of pivoting will require a lot more memory. In theory since it is SPD it should not need to pivot at all.
> 
> >  Mumps determinant = (0., 0.) * 2^0
> > Linear solve did not converge due to DIVERGED_PCSETUP_FAILED iterations 0
> >              PCSETUP_FAILED due to FACTOR_OUTMEMORY
> > Norm of error inf. iterations 0
> > 
> > It runs for more than an hour before aborting with this message. I am running it with this command:
> > 
> > mpiexec -hostfile ./hostfile -np 48 ./ex12 -ksp_converged_reason
> > 
> > My machines have 24 cpus and 125 GB of RAM. When I do "top" I see it correctly spans 48 processes on 2 nodes. The memory usage of each process is no more than 1-2 GB. So I do not understand why it gives FACTOR_OUTMEMORY?
> > 
> > The same matrix solves in under 5 minutes in Intel Pardiso using 24 cpus on one host. 
> 
>  Mumps may be (likely is) using a different matrix ordering than Intel Pardiso. Unfortunately each of these packages has a different way of asking for orderings and different orderings to choose from, so you will need to look at the details for each package.
> 
> > I thought maybe mumps thinks it is ill-conditioned? The model does converge in the iterative solver in 4000 iterations. I also tried running with these options per the FAQ on 
> > 
> > " How can I determine the condition number of a matrix? ".
> > 
> > mpiexec -hostfile ./hostfile -np 48 ./ex12 -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 -ksp_converged_reason -ksp_monitor_true_residual
> > 
> > After 1337 iterations I cancelled it, and the output was:
> 
>  You need to look at the condition number just before GMRES reaches the restart. It has to start all over again at the restart. So what was the estimated condition number at 999 iterations?
> 
>  It could be that Intel Pardiso produced a low-quality solution if the matrix is ill conditioned. You can run with -ksp_type gmres -ksp_max_it 5 -ksp_monitor_true_residual and -pc_type lu to see how small the residuals are after the "direct" solver.
> 
>  Barry
> 
> 
> > 
> > 1337 KSP preconditioned resid norm 5.647402411074e-02 true resid norm 5.647402411074e-02 ||r(i)||/||b|| 3.993316540960e-03
> > 1337 KSP Residual norm 5.647402411074e-02 % max 1.070324243277e+05 min 1.220336631740e-01 max/min 8.770729448238e+05
> 
> 


   

   



  