[petsc-users] pcfieldsplit for a composite dm with multiple subfields

Gideon Simpson gideon.simpson at gmail.com
Mon Sep 7 19:49:59 CDT 2015


No problem Matt, I don’t think we had previously discussed that output.  Here is a case where things fail.

      0 SNES Function norm 4.027481756921e-09 
      1 SNES Function norm 1.760477878365e-12 
    Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1
    0 SNES Function norm 5.066222213176e+03 
    1 SNES Function norm 8.484697184230e+02 
    2 SNES Function norm 6.549559723294e+02 
    3 SNES Function norm 5.770723278153e+02 
    4 SNES Function norm 5.237702240594e+02 
    5 SNES Function norm 4.753909019848e+02 
    6 SNES Function norm 4.221784590755e+02 
    7 SNES Function norm 3.806525080483e+02 
    8 SNES Function norm 3.762054656019e+02 
    9 SNES Function norm 3.758975226873e+02 
   10 SNES Function norm 3.757032042706e+02 
   11 SNES Function norm 3.728798164234e+02 
   12 SNES Function norm 3.723078741075e+02 
   13 SNES Function norm 3.721848059825e+02 
   14 SNES Function norm 3.720227575629e+02 
   15 SNES Function norm 3.720051998555e+02 
   16 SNES Function norm 3.718945430587e+02 
   17 SNES Function norm 3.700412694044e+02 
   18 SNES Function norm 3.351964889461e+02 
   19 SNES Function norm 3.096016086233e+02 
   20 SNES Function norm 3.008410789787e+02 
   21 SNES Function norm 2.752316716557e+02 
   22 SNES Function norm 2.707658474165e+02 
   23 SNES Function norm 2.698436736049e+02 
   24 SNES Function norm 2.618233857172e+02 
   25 SNES Function norm 2.600121920634e+02 
   26 SNES Function norm 2.585046423168e+02 
   27 SNES Function norm 2.568551090220e+02 
   28 SNES Function norm 2.556404537064e+02 
   29 SNES Function norm 2.536353523683e+02 
   30 SNES Function norm 2.533596070171e+02 
   31 SNES Function norm 2.532324379596e+02 
   32 SNES Function norm 2.531842335211e+02 
   33 SNES Function norm 2.531684527520e+02 
   34 SNES Function norm 2.531637604618e+02 
   35 SNES Function norm 2.531624767821e+02 
   36 SNES Function norm 2.531621359093e+02 
   37 SNES Function norm 2.531620504925e+02 
   38 SNES Function norm 2.531620350055e+02 
   39 SNES Function norm 2.531620310522e+02 
   40 SNES Function norm 2.531620300471e+02 
   41 SNES Function norm 2.531620298084e+02 
   42 SNES Function norm 2.531620297478e+02 
   43 SNES Function norm 2.531620297324e+02 
   44 SNES Function norm 2.531620297303e+02 
   45 SNES Function norm 2.531620297302e+02 
  Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 45
  0 SNES Function norm 9.636339304380e+03 
  1 SNES Function norm 8.997731184634e+03 
  2 SNES Function norm 8.120498349232e+03 
  3 SNES Function norm 7.322379894820e+03 
  4 SNES Function norm 6.599581599149e+03 
  5 SNES Function norm 6.374872854688e+03 
  6 SNES Function norm 6.372518007653e+03 
  7 SNES Function norm 6.073996314301e+03 
  8 SNES Function norm 5.635965277054e+03 
  9 SNES Function norm 5.155389064046e+03 
 10 SNES Function norm 5.080567902638e+03 
 11 SNES Function norm 5.058878643969e+03 
 12 SNES Function norm 5.058835649793e+03 
 13 SNES Function norm 5.058491285707e+03 
 14 SNES Function norm 5.057452865337e+03 
 15 SNES Function norm 5.057226140688e+03 
 16 SNES Function norm 5.056651272898e+03 
 17 SNES Function norm 5.056575190057e+03 
 18 SNES Function norm 5.056574632598e+03 
 19 SNES Function norm 5.056574520229e+03 
 20 SNES Function norm 5.056574492569e+03 
 21 SNES Function norm 5.056574485124e+03 
 22 SNES Function norm 5.056574483029e+03 
 23 SNES Function norm 5.056574482427e+03 
 24 SNES Function norm 5.056574482302e+03 
 25 SNES Function norm 5.056574482287e+03 
 26 SNES Function norm 5.056574482282e+03 
 27 SNES Function norm 5.056574482281e+03 
Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 27
SNES Object: 1 MPI processes
  type: newtonls
  maximum iterations=50, maximum function evaluations=10000
  tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
  total number of linear solver iterations=28
  total number of function evaluations=323
  total number of grid sequence refinements=2
  SNESLineSearch Object:   1 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.000000e-04
    maxstep=1.000000e+08, minlambda=1.000000e-12
    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
    maximum iterations=40
  KSP Object:   1 MPI processes
    type: gmres
      GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
      GMRES: happy breakdown tolerance 1e-30
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
    left preconditioning
    using PRECONDITIONED norm type for convergence test
  PC Object:   1 MPI processes
    type: lu
      LU: out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: nd
      factor fill ratio given 0, needed 0
        Factored matrix follows:
          Mat Object:           1 MPI processes
            type: seqaij
            rows=15991, cols=15991
            package used to perform factorization: mumps
            total: nonzeros=255801, allocated nonzeros=255801
            total number of mallocs used during MatSetValues calls =0
              MUMPS run parameters:
                SYM (matrix type):                   0 
                PAR (host participation):            1 
                ICNTL(1) (output for error):         6 
                ICNTL(2) (output of diagnostic msg): 0 
                ICNTL(3) (output for global info):   0 
                ICNTL(4) (level of printing):        0 
                ICNTL(5) (input mat struct):         0 
                ICNTL(6) (matrix prescaling):        7 
                ICNTL(7) (sequential matrix ordering): 6
                ICNTL(8) (scaling strategy):        77
                ICNTL(10) (max num of refinements):  0 
                ICNTL(11) (error analysis):          0 
                ICNTL(12) (efficiency control):                         1 
                ICNTL(13) (efficiency control):                         0 
                ICNTL(14) (percentage of estimated workspace increase): 20 
                ICNTL(18) (input mat struct):                           0 
                ICNTL(19) (Schur complement info):                      0
                ICNTL(20) (rhs sparse pattern):                         0 
                ICNTL(21) (solution struct):                            0
                ICNTL(22) (in-core/out-of-core facility):               0 
                ICNTL(23) (max size of memory that can be allocated locally): 0
                ICNTL(24) (detection of null pivot rows):               0 
                ICNTL(25) (computation of a null space basis):          0 
                ICNTL(26) (Schur options for rhs or solution):          0 
                ICNTL(27) (experimental parameter):                     -8 
                ICNTL(28) (use parallel or sequential ordering):        1 
                ICNTL(29) (parallel ordering):                          0 
                ICNTL(30) (user-specified set of entries in inv(A)):    0 
                ICNTL(31) (factors are discarded in the solve phase):   0
                ICNTL(33) (compute determinant):                        0 
                CNTL(1) (relative pivoting threshold):      0.01 
                CNTL(2) (stopping criterion of refinement): 1.49012e-08 
                CNTL(3) (absolute pivoting threshold):      0
                CNTL(4) (value of static pivoting):         -1
                CNTL(5) (fixation for null pivots):         0 
                RINFO(1) (local estimated flops for the elimination after analysis): 
                  [0] 1.95838e+06 
                RINFO(2) (local estimated flops for the assembly after factorization): 
                  [0]  143924 
                RINFO(3) (local estimated flops for the elimination after factorization): 
                  [0]  1.95943e+06 
                INFO(15) (estimated size (in MB) of MUMPS internal data for running numerical factorization):
                  [0] 7
                INFO(16) (size (in MB) of MUMPS internal data used during numerical factorization):
                  [0] 7 
                INFO(23) (num of pivots eliminated on this processor after factorization): 
                  [0] 15991 
                RINFOG(1) (global estimated flops for the elimination after analysis): 1.95838e+06 
                RINFOG(2) (global estimated flops for the assembly after factorization): 143924 
                RINFOG(3) (global estimated flops for the elimination after factorization): 1.95943e+06 
                (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)
                INFOG(3) (estimated real workspace for factors on all processors after analysis): 255801 
                INFOG(4) (estimated integer workspace for factors on all processors after analysis): 127874 
                INFOG(5) (estimated maximum front size in the complete tree): 11 
                INFOG(6) (number of nodes in the complete tree): 3996 
                INFOG(7) (ordering option effectively used after analysis): 6
                INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 86 
                INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 255865 
                INFOG(10) (total integer space to store the matrix factors after factorization): 127890
                INFOG(11) (order of largest frontal matrix after factorization): 11 
                INFOG(12) (number of off-diagonal pivots): 19 
                INFOG(13) (number of delayed pivots after factorization): 8 
                INFOG(14) (number of memory compresses after factorization): 0
                INFOG(15) (number of steps of iterative refinement after solution): 0 
                INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 7 
                INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 7 
                INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 7 
                INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 7 
                INFOG(20) (estimated number of entries in the factors): 255801 
                INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 7 
                INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 7 
                INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 
                INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 
                INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 
                INFOG(28) (after factorization: number of null pivots encountered): 0
                INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 255865
                INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 5, 5
                INFOG(32) (after analysis: type of analysis done): 1
                INFOG(33) (value used for ICNTL(8)): 7
                INFOG(34) (exponent of the determinant if determinant is requested): 0
    linear system matrix = precond matrix:
    Mat Object:     1 MPI processes
      type: seqaij
      rows=15991, cols=15991
      total: nonzeros=223820, allocated nonzeros=431698
      total number of mallocs used during MatSetValues calls =15991
        using I-node routines: found 4000 nodes, limit used is 5




-gideon

> On Sep 7, 2015, at 8:40 PM, Matthew Knepley <knepley at gmail.com> wrote:
> 
> On Mon, Sep 7, 2015 at 7:32 PM, Gideon Simpson <gideon.simpson at gmail.com <mailto:gideon.simpson at gmail.com>> wrote:
> Barry,
> 
> I finally got a chance to really try using the grid sequencing within my code.  I find that, in some cases, even when it solves successfully on the coarsest mesh, the SNES fails, usually due to a line search failure, once it moves along the grid sequence.  Would you have any suggestions?
> 
> I apologize if I have asked before, but can you give me -snes_view for the solver? I could not find it in the email thread.
> 
> I would suggest trying to fiddle with the line search, or preconditioning it with Richardson. It would be nice to see -snes_monitor
> for the runs that fail, and then we can break down the residual into fields and look at it again (if my custom residual monitor
> does not work we can write one easily). Seeing which part of the residual does not converge is key to designing the NASM
> for the problem. I have just seen the virtuoso of this, Xiao-Chuan Cai, present it. We need better monitoring in PETSc.
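
A sketch of options along those lines (assuming PETSc's standard option names; nonlinear Richardson is the SNES type nrichardson, attached as a nonlinear preconditioner through the npc_ prefix):

    # watch the nonlinear residual and the line search itself
    -snes_monitor -snes_linesearch_monitor

    # try a different line search (bt is the default shown in the -snes_view above)
    -snes_linesearch_type l2        # or: cp, basic

    # precondition the Newton solve with nonlinear Richardson
    -npc_snes_type nrichardson -npc_snes_max_it 1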
> 
>   Thanks,
> 
>     Matt
>  
> -gideon
> 
>> On Aug 28, 2015, at 4:21 PM, Barry Smith <bsmith at mcs.anl.gov <mailto:bsmith at mcs.anl.gov>> wrote:
>> 
>> 
>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson <gideon.simpson at gmail.com <mailto:gideon.simpson at gmail.com>> wrote:
>>> 
>>> Yes, if I continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint.  The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement.
>>> 
>>> One subtlety is that I actually want the intermediate continuation solutions too.  Currently, without doing any grid sequencing, I compute each, write it to disk, and then go on to the next one.  So I now need to go back and refine them.  I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the SNES back to the coarse grid.
>>> 
>>> The alternative would be to script the mesh refinement in a post-processing stage, where each value of the continuation parameter is loaded on the coarse mesh and refined, as in the sketch below.  Perhaps that’s the most practical thing to do.
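
For that post-processing route, a rough sketch of refining one saved coarse solution (the file name is illustrative, and this assumes the DM supports DMRefine() and DMCreateInterpolation(), e.g. a DMDA):

    PetscViewer viewer;
    Vec         Ucoarse, Ufine, scale;
    Mat         Interp;
    DM          dm;       /* coarse DM, set up as in the main code */
    DM          dmfine;

    /* load the coarse-mesh solution for one continuation value */
    PetscViewerBinaryOpen(PETSC_COMM_WORLD,"solution_p.bin",FILE_MODE_READ,&viewer);
    DMCreateGlobalVector(dm,&Ucoarse);
    VecLoad(Ucoarse,viewer);
    PetscViewerDestroy(&viewer);

    /* refine the mesh and interpolate the coarse solution onto it */
    DMRefine(dm,PETSC_COMM_WORLD,&dmfine);
    DMCreateInterpolation(dm,dmfine,&Interp,&scale);
    DMCreateGlobalVector(dmfine,&Ufine);
    MatInterpolate(Interp,Ucoarse,Ufine);
    /* Ufine can now be written to disk, or used as the initial
       guess for a SNESSolve() on dmfine */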
>> 
>>   I would do the following. Create your DM and create a SNES that will do the continuation
>> 
>>   loop over continuation parameter
>> 
>>        SNESSolve(snes,NULL,Ucoarse);
>> 
>>        if (you decide you want to see the refined solution at this continuation point) {
>>             SNESCreate(comm,&snesrefine);
>>             SNESSetDM(snesrefine,dm);
>>             etc
>>             SNESSetGridSequence(snesrefine,nlevels);
>>             SNESSolve(snesrefine,NULL,Ucoarse);
>>             SNESGetSolution(snesrefine,&Ufine);
>>             VecView(Ufine,PETSC_VIEWER_STDOUT_WORLD);  /* or whatever you want to do with Ufine at that continuation point */
>>             SNESDestroy(&snesrefine);
>>       end if
>> 
>>   end loop over continuation parameter.
>> 
>>   Barry
>> 
>>> 
>>> -gideon
>>> 
>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith <bsmith at mcs.anl.gov <mailto:bsmith at mcs.anl.gov>> wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> 3.  This problem is actually part of a continuation problem that roughly looks like this 
>>>>> 
>>>>> for( continuation parameter p = 0 to 1){
>>>>> 
>>>>> 	solve with parameter p_i using solution from p_{i-1},
>>>>> }
>>>>> 
>>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that.  But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh.
>>>> 
>>>>  So you are using continuation to get a good enough initial guess for convergence on the coarse level? First I would check whether you even need the continuation (i.e., whether you can solve the coarse problem at all without it).
>>>> 
>>>>  If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: 
>>>> 
>>>> Do not use -snes_grid_sequence.
>>>> 
>>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh.
>>>> 
>>>> Call SNESSetGridSequence()
>>>> 
>>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc.
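
In code, that recipe might look like the following sketch (variable names are illustrative; SNESSetGridSequence() takes the number of refinement levels):

    PetscReal p, dp = 0.1;            /* continuation step, problem dependent */
    PetscInt  nlevels = 2;            /* grid-sequence refinement levels */

    /* continuation entirely on the coarse mesh: grid sequencing off */
    for (p = 0.0; p <= 1.0; p += dp) {
      /* ... update the continuation parameter in the user context ... */
      SNESSolve(snes,NULL,Ucoarse);   /* previous solution is the initial guess */
    }

    /* now enable grid sequencing and solve once more: one coarse solve,
       then interpolate and re-solve on each finer level */
    SNESSetGridSequence(snes,nlevels);
    SNESSolve(snes,NULL,Ucoarse);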
>>> 
>> 
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
