<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">No problem Matt, I don’t think we had previously discussed that output.  Here is a case where things fail.<div class=""><br class=""></div><div class=""><div class="">      0 SNES Function norm 4.027481756921e-09 </div><div class="">      1 SNES Function norm 1.760477878365e-12 </div><div class="">    Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1</div><div class="">    0 SNES Function norm 5.066222213176e+03 </div><div class="">    1 SNES Function norm 8.484697184230e+02 </div><div class="">    2 SNES Function norm 6.549559723294e+02 </div><div class="">    3 SNES Function norm 5.770723278153e+02 </div><div class="">    4 SNES Function norm 5.237702240594e+02 </div><div class="">    5 SNES Function norm 4.753909019848e+02 </div><div class="">    6 SNES Function norm 4.221784590755e+02 </div><div class="">    7 SNES Function norm 3.806525080483e+02 </div><div class="">    8 SNES Function norm 3.762054656019e+02 </div><div class="">    9 SNES Function norm 3.758975226873e+02 </div><div class="">   10 SNES Function norm 3.757032042706e+02 </div><div class="">   11 SNES Function norm 3.728798164234e+02 </div><div class="">   12 SNES Function norm 3.723078741075e+02 </div><div class="">   13 SNES Function norm 3.721848059825e+02 </div><div class="">   14 SNES Function norm 3.720227575629e+02 </div><div class="">   15 SNES Function norm 3.720051998555e+02 </div><div class="">   16 SNES Function norm 3.718945430587e+02 </div><div class="">   17 SNES Function norm 3.700412694044e+02 </div><div class="">   18 SNES Function norm 3.351964889461e+02 </div><div class="">   19 SNES Function norm 3.096016086233e+02 </div><div class="">   20 SNES Function norm 3.008410789787e+02 </div><div class="">   21 SNES Function norm 2.752316716557e+02 </div><div class="">   
22 SNES Function norm 2.707658474165e+02 </div><div class="">   23 SNES Function norm 2.698436736049e+02 </div><div class="">   24 SNES Function norm 2.618233857172e+02 </div><div class="">   25 SNES Function norm 2.600121920634e+02 </div><div class="">   26 SNES Function norm 2.585046423168e+02 </div><div class="">   27 SNES Function norm 2.568551090220e+02 </div><div class="">   28 SNES Function norm 2.556404537064e+02 </div><div class="">   29 SNES Function norm 2.536353523683e+02 </div><div class="">   30 SNES Function norm 2.533596070171e+02 </div><div class="">   31 SNES Function norm 2.532324379596e+02 </div><div class="">   32 SNES Function norm 2.531842335211e+02 </div><div class="">   33 SNES Function norm 2.531684527520e+02 </div><div class="">   34 SNES Function norm 2.531637604618e+02 </div><div class="">   35 SNES Function norm 2.531624767821e+02 </div><div class="">   36 SNES Function norm 2.531621359093e+02 </div><div class="">   37 SNES Function norm 2.531620504925e+02 </div><div class="">   38 SNES Function norm 2.531620350055e+02 </div><div class="">   39 SNES Function norm 2.531620310522e+02 </div><div class="">   40 SNES Function norm 2.531620300471e+02 </div><div class="">   41 SNES Function norm 2.531620298084e+02 </div><div class="">   42 SNES Function norm 2.531620297478e+02 </div><div class="">   43 SNES Function norm 2.531620297324e+02 </div><div class="">   44 SNES Function norm 2.531620297303e+02 </div><div class="">   45 SNES Function norm 2.531620297302e+02 </div><div class="">  Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 45</div><div class="">  0 SNES Function norm 9.636339304380e+03 </div><div class="">  1 SNES Function norm 8.997731184634e+03 </div><div class="">  2 SNES Function norm 8.120498349232e+03 </div><div class="">  3 SNES Function norm 7.322379894820e+03 </div><div class="">  4 SNES Function norm 6.599581599149e+03 </div><div class="">  5 SNES Function norm 6.374872854688e+03 </div><div 
class="">  6 SNES Function norm 6.372518007653e+03 </div><div class="">  7 SNES Function norm 6.073996314301e+03 </div><div class="">  8 SNES Function norm 5.635965277054e+03 </div><div class="">  9 SNES Function norm 5.155389064046e+03 </div><div class=""> 10 SNES Function norm 5.080567902638e+03 </div><div class=""> 11 SNES Function norm 5.058878643969e+03 </div><div class=""> 12 SNES Function norm 5.058835649793e+03 </div><div class=""> 13 SNES Function norm 5.058491285707e+03 </div><div class=""> 14 SNES Function norm 5.057452865337e+03 </div><div class=""> 15 SNES Function norm 5.057226140688e+03 </div><div class=""> 16 SNES Function norm 5.056651272898e+03 </div><div class=""> 17 SNES Function norm 5.056575190057e+03 </div><div class=""> 18 SNES Function norm 5.056574632598e+03 </div><div class=""> 19 SNES Function norm 5.056574520229e+03 </div><div class=""> 20 SNES Function norm 5.056574492569e+03 </div><div class=""> 21 SNES Function norm 5.056574485124e+03 </div><div class=""> 22 SNES Function norm 5.056574483029e+03 </div><div class=""> 23 SNES Function norm 5.056574482427e+03 </div><div class=""> 24 SNES Function norm 5.056574482302e+03 </div><div class=""> 25 SNES Function norm 5.056574482287e+03 </div><div class=""> 26 SNES Function norm 5.056574482282e+03 </div><div class=""> 27 SNES Function norm 5.056574482281e+03 </div><div class="">Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 27</div><div class="">SNES Object: 1 MPI processes</div><div class="">  type: newtonls</div><div class="">  maximum iterations=50, maximum function evaluations=10000</div><div class="">  tolerances: relative=1e-08, absolute=1e-50, solution=1e-08</div><div class="">  total number of linear solver iterations=28</div><div class="">  total number of function evaluations=323</div><div class="">  total number of grid sequence refinements=2</div><div class="">  SNESLineSearch Object:   1 MPI processes</div><div class="">    type: bt</div><div class="">    
  interpolation: cubic</div><div class="">      alpha=1.000000e-04</div><div class="">    maxstep=1.000000e+08, minlambda=1.000000e-12</div><div class="">    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08</div><div class="">    maximum iterations=40</div><div class="">  KSP Object:   1 MPI processes</div><div class="">    type: gmres</div><div class="">      GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div class="">      GMRES: happy breakdown tolerance 1e-30</div><div class="">    maximum iterations=10000, initial guess is zero</div><div class="">    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div class="">    left preconditioning</div><div class="">    using PRECONDITIONED norm type for convergence test</div><div class="">  PC Object:   1 MPI processes</div><div class="">    type: lu</div><div class="">      LU: out-of-place factorization</div><div class="">      tolerance for zero pivot 2.22045e-14</div><div class="">      matrix ordering: nd</div><div class="">      factor fill ratio given 0, needed 0</div><div class="">        Factored matrix follows:</div><div class="">          Mat Object:           1 MPI processes</div><div class="">            type: seqaij</div><div class="">            rows=15991, cols=15991</div><div class="">            package used to perform factorization: mumps</div><div class="">            total: nonzeros=255801, allocated nonzeros=255801</div><div class="">            total number of mallocs used during MatSetValues calls =0</div><div class="">              MUMPS run parameters:</div><div class="">                SYM (matrix type):                   0 </div><div class="">                PAR (host participation):            1 </div><div class="">                ICNTL(1) (output for error):         6 </div><div class="">                ICNTL(2) (output of diagnostic msg): 0 </div><div class="">                
ICNTL(3) (output for global info):   0 </div><div class="">                ICNTL(4) (level of printing):        0 </div><div class="">                ICNTL(5) (input mat struct):         0 </div><div class="">                ICNTL(6) (matrix prescaling):        7 </div><div class="">                ICNTL(7) (sequentia matrix ordering):6 </div><div class="">                ICNTL(8) (scalling strategy):        77 </div><div class="">                ICNTL(10) (max num of refinements):  0 </div><div class="">                ICNTL(11) (error analysis):          0 </div><div class="">                ICNTL(12) (efficiency control):                         1 </div><div class="">                ICNTL(13) (efficiency control):                         0 </div><div class="">                ICNTL(14) (percentage of estimated workspace increase): 20 </div><div class="">                ICNTL(18) (input mat struct):                           0 </div><div class="">                ICNTL(19) (Shur complement info):                       0 </div><div class="">                ICNTL(20) (rhs sparse pattern):                         0 </div><div class="">                ICNTL(21) (somumpstion struct):                            0 </div><div class="">                ICNTL(22) (in-core/out-of-core facility):               0 </div><div class="">                ICNTL(23) (max size of memory can be allocated locally):0 </div><div class="">                ICNTL(24) (detection of null pivot rows):               0 </div><div class="">                ICNTL(25) (computation of a null space basis):          0 </div><div class="">                ICNTL(26) (Schur options for rhs or solution):          0 </div><div class="">                ICNTL(27) (experimental parameter):                     -8 </div><div class="">                ICNTL(28) (use parallel or sequential ordering):        1 </div><div class="">                ICNTL(29) (parallel ordering):                          0 </div><div 
class="">                ICNTL(30) (user-specified set of entries in inv(A)):    0 </div><div class="">                ICNTL(31) (factors is discarded in the solve phase):    0 </div><div class="">                ICNTL(33) (compute determinant):                        0 </div><div class="">                CNTL(1) (relative pivoting threshold):      0.01 </div><div class="">                CNTL(2) (stopping criterion of refinement): 1.49012e-08 </div><div class="">                CNTL(3) (absomumpste pivoting threshold):      0 </div><div class="">                CNTL(4) (vamumpse of static pivoting):         -1 </div><div class="">                CNTL(5) (fixation for null pivots):         0 </div><div class="">                RINFO(1) (local estimated flops for the elimination after analysis): </div><div class="">                  [0] 1.95838e+06 </div><div class="">                RINFO(2) (local estimated flops for the assembly after factorization): </div><div class="">                  [0]  143924 </div><div class="">                RINFO(3) (local estimated flops for the elimination after factorization): </div><div class="">                  [0]  1.95943e+06 </div><div class="">                INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): </div><div class="">                [0] 7 </div><div class="">                INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): </div><div class="">                  [0] 7 </div><div class="">                INFO(23) (num of pivots eliminated on this processor after factorization): </div><div class="">                  [0] 15991 </div><div class="">                RINFOG(1) (global estimated flops for the elimination after analysis): 1.95838e+06 </div><div class="">                RINFOG(2) (global estimated flops for the assembly after factorization): 143924 </div><div class="">                RINFOG(3) (global estimated flops for the 
elimination after factorization): 1.95943e+06 </div><div class="">                (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)</div><div class="">                INFOG(3) (estimated real workspace for factors on all processors after analysis): 255801 </div><div class="">                INFOG(4) (estimated integer workspace for factors on all processors after analysis): 127874 </div><div class="">                INFOG(5) (estimated maximum front size in the complete tree): 11 </div><div class="">                INFOG(6) (number of nodes in the complete tree): 3996 </div><div class="">                INFOG(7) (ordering option effectively use after analysis): 6 </div><div class="">                INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 86 </div><div class="">                INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 255865 </div><div class="">                INFOG(10) (total integer space store the matrix factors after factorization): 127890 </div><div class="">                INFOG(11) (order of largest frontal matrix after factorization): 11 </div><div class="">                INFOG(12) (number of off-diagonal pivots): 19 </div><div class="">                INFOG(13) (number of delayed pivots after factorization): 8 </div><div class="">                INFOG(14) (number of memory compress after factorization): 0 </div><div class="">                INFOG(15) (number of steps of iterative refinement after solution): 0 </div><div class="">                INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 7 </div><div class="">                INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 7 </div><div class="">                INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the 
most memory consuming processor): 7 </div><div class="">                INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 7 </div><div class="">                INFOG(20) (estimated number of entries in the factors): 255801 </div><div class="">                INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 7 </div><div class="">                INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 7 </div><div class="">                INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 </div><div class="">                INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 </div><div class="">                INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 </div><div class="">                INFOG(28) (after factorization: number of null pivots encountered): 0</div><div class="">                INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 255865</div><div class="">                INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 5, 5</div><div class="">                INFOG(32) (after analysis: type of analysis done): 1</div><div class="">                INFOG(33) (value used for ICNTL(8)): 7</div><div class="">                INFOG(34) (exponent of the determinant if determinant is requested): 0</div><div class="">    linear system matrix = precond matrix:</div><div class="">    Mat Object:     1 MPI processes</div><div class="">      type: seqaij</div><div class="">      rows=15991, cols=15991</div><div class="">      total: nonzeros=223820, allocated nonzeros=431698</div><div class="">      total number of mallocs used during MatSetValues calls =15991</div><div class="">        using I-node routines: found 4000 nodes, limit used is 5</div><div 
class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div class="">

<span class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px;">-gideon</span>


</div>

<br class=""><div><blockquote type="cite" class=""><div class="">On Sep 7, 2015, at 8:40 PM, Matthew Knepley <<a href="mailto:knepley@gmail.com" class="">knepley@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" style="font-family: Helvetica; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="gmail_extra"><div class="gmail_quote">On Mon, Sep 7, 2015 at 7:32 PM, Gideon Simpson<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:gideon.simpson@gmail.com" target="_blank" class="">gideon.simpson@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div style="word-wrap: break-word;" class="">Barry,<div class=""><br class=""></div><div class="">I finally got a chance to really try using the grid sequencing within my code.  I find that, in some cases, even if it can solve successfully on the coarsest mesh, the SNES fails, usually due to a line search failure, when it tries to compute along the grid sequence.  Would you have any suggestions?</div></div></blockquote><div class=""><br class=""></div><div class="">I apologize if I have asked before, but can you give me -snes_view for the solver? I could not find it in the email thread.</div><div class=""><br class=""></div><div class="">I would suggest trying to fiddle with the line search, or precondition it with Richardson. 
It would be nice to see -snes_monitor</div><div class="">for the runs that fail, and then we can break down the residual into fields and look at it again (if my custom residual monitor</div><div class="">does not work we can write one easily). Seeing which part of the residual does not converge is key to designing the NASM</div><div class="">for the problem. I have just seen the virtuoso of this, Xiao-Chuan Cai, present it. We need better monitoring in PETSc.</div><div class=""><br class=""></div><div class="">  Thanks,</div><div class=""><br class=""></div><div class="">    Matt</div><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div class=""><span class="HOEnZb"><font color="#888888" class="">-gideon</font></span><div class=""><div class="h5"><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 28, 2015, at 4:21 PM, Barry Smith <<a href="mailto:bsmith@mcs.anl.gov" target="_blank" class="">bsmith@mcs.anl.gov</a>> wrote:</div><br class=""><div class=""><br class=""><blockquote type="cite" class="">On Aug 28, 2015, at 3:04 PM, Gideon Simpson <<a href="mailto:gideon.simpson@gmail.com" target="_blank" class="">gideon.simpson@gmail.com</a>> wrote:<br class=""><br class="">Yes, if I continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint.  The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement.  <br class=""><br class="">One subtlety is that I actually want the intermediate continuation solutions  too.  Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one.  
So I now need to go back and refine them.  I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid.<br class=""><br class="">The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation parameter is loaded on the coarse mesh, and refined.  Perhaps that’s the most practical thing to do.<br class=""></blockquote><br class="">  I would do the following. Create your DM and create a SNES that will do the continuation<br class=""><br class="">  loop over continuation parameter<br class=""><br class="">       SNESSolve(snes,NULL,Ucoarse);<br class=""><br class="">       if (you decide you want to see the refined solution at this continuation point) {<br class="">            SNESCreate(comm,&snesrefine);<br class="">            SNESSetDM()<br class="">            etc<br class="">            SNESSetGridSequence(snesrefine,)<br class="">            SNESSolve(snesrefine,0,Ucoarse);<br class="">            SNESGetSolution(snesrefine,&Ufine);<br class="">            VecView(Ufine) or do whatever you want to do with Ufine at that continuation point<br class="">            SNESDestroy(&snesrefine);<br class="">       }<br class=""><br class="">  end loop over continuation parameter.<br class=""><br class="">  Barry<br class=""><br class=""><blockquote type="cite" class=""><br class="">-gideon<br class=""><br class=""><blockquote type="cite" class="">On Aug 28, 2015, at 3:55 PM, Barry Smith <<a href="mailto:bsmith@mcs.anl.gov" target="_blank" class="">bsmith@mcs.anl.gov</a>> wrote:<br class=""><br class=""><blockquote type="cite" class=""><br class=""><br class="">3.  
This problem is actually part of a continuation problem that roughly looks like this<span class="Apple-converted-space"> </span><br class=""><br class="">for( continuation parameter p = 0 to 1){<br class=""><br class=""><span style="white-space: pre-wrap;" class="">    </span>solve with parameter p_i using solution from p_{i-1},<br class="">}<br class=""><br class="">What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that.  But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh.<br class=""></blockquote><br class=""> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it).<br class=""><br class=""> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work:<span class="Apple-converted-space"> </span><br class=""><br class="">Do not use -snes_grid_sequence  <br class=""><br class="">Run SNESSolve() as many times as you want with your continuation parameter. 
This will all happen on the coarse mesh.<br class=""><br class="">Call SNESSetGridSequence()<br class=""><br class="">Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc.<br class=""></blockquote><br class=""></blockquote><br class=""></div></blockquote></div><br class=""></div></div></div></div></blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div>--<span class="Apple-converted-space"> </span><br class=""><div class="gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br class="">-- Norbert Wiener</div></div></div></div></blockquote></div><br class=""></div></div></body></html>