[petsc-users] pcfieldsplit for a composite dm with multiple subfields
Barry Smith
bsmith at mcs.anl.gov
Mon Sep 7 20:22:06 CDT 2015
Hmm,
Ok, you can try running it directly in the debugger since it is one process. Type
gdb ./blowup_batch_refine
then
when the debugger comes up (if it does not, cut and paste all output and send it)
run -on_error_abort -snes_mf_operator and any other options you normally use
Barry
> On Sep 7, 2015, at 8:18 PM, Gideon Simpson <gideon.simpson at gmail.com> wrote:
>
> Running with that flag gives me this:
>
> [0]PETSC ERROR: PETSC: Attaching gdb to ./blowup_batch_refine of pid 16111 on gs_air
> Unable to start debugger: No such file or directory
>
>
>
> -gideon
>
>> On Sep 7, 2015, at 9:11 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>>
>> This should not happen. Run with a debug version of PETSc installed and the option -start_in_debugger noxterm. Once the debugger starts up, type cont, and when it crashes type where or bt. Send all output.
>>
>>
>>
>> Barry
>>
>>
>>> On Sep 7, 2015, at 8:09 PM, Gideon Simpson <gideon.simpson at gmail.com> wrote:
>>>
>>> I’m getting an error with -snes_mf_operator,
>>>
>>> 0 SNES Function norm 1.421454390131e-02
>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>> [0]PETSC ERROR: to get more information on the crash.
>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>> [0]PETSC ERROR: Signal received
>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown
>>> [0]PETSC ERROR: ./blowup_batch_refine on a arch-macports named gs_air by gideon Mon Sep 7 21:08:19 2015
>>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp
>>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file
>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>>
>>> -gideon
>>>
>>>> On Sep 7, 2015, at 9:01 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>
>>>> My guess is the Jacobian is not correct (or correct "enough"), hence PETSc SNES is generating a poor descent direction. You can try
>>>> -snes_mf_operator -ksp_monitor_true_residual as additional arguments. What happens?
>>>>
>>>> Barry
>>>>
>>>>
>>>>
>>>>> On Sep 7, 2015, at 7:49 PM, Gideon Simpson <gideon.simpson at gmail.com> wrote:
>>>>>
>>>>> No problem Matt, I don’t think we had previously discussed that output. Here is a case where things fail.
>>>>>
>>>>> 0 SNES Function norm 4.027481756921e-09
>>>>> 1 SNES Function norm 1.760477878365e-12
>>>>> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1
>>>>> 0 SNES Function norm 5.066222213176e+03
>>>>> 1 SNES Function norm 8.484697184230e+02
>>>>> 2 SNES Function norm 6.549559723294e+02
>>>>> 3 SNES Function norm 5.770723278153e+02
>>>>> 4 SNES Function norm 5.237702240594e+02
>>>>> 5 SNES Function norm 4.753909019848e+02
>>>>> 6 SNES Function norm 4.221784590755e+02
>>>>> 7 SNES Function norm 3.806525080483e+02
>>>>> 8 SNES Function norm 3.762054656019e+02
>>>>> 9 SNES Function norm 3.758975226873e+02
>>>>> 10 SNES Function norm 3.757032042706e+02
>>>>> 11 SNES Function norm 3.728798164234e+02
>>>>> 12 SNES Function norm 3.723078741075e+02
>>>>> 13 SNES Function norm 3.721848059825e+02
>>>>> 14 SNES Function norm 3.720227575629e+02
>>>>> 15 SNES Function norm 3.720051998555e+02
>>>>> 16 SNES Function norm 3.718945430587e+02
>>>>> 17 SNES Function norm 3.700412694044e+02
>>>>> 18 SNES Function norm 3.351964889461e+02
>>>>> 19 SNES Function norm 3.096016086233e+02
>>>>> 20 SNES Function norm 3.008410789787e+02
>>>>> 21 SNES Function norm 2.752316716557e+02
>>>>> 22 SNES Function norm 2.707658474165e+02
>>>>> 23 SNES Function norm 2.698436736049e+02
>>>>> 24 SNES Function norm 2.618233857172e+02
>>>>> 25 SNES Function norm 2.600121920634e+02
>>>>> 26 SNES Function norm 2.585046423168e+02
>>>>> 27 SNES Function norm 2.568551090220e+02
>>>>> 28 SNES Function norm 2.556404537064e+02
>>>>> 29 SNES Function norm 2.536353523683e+02
>>>>> 30 SNES Function norm 2.533596070171e+02
>>>>> 31 SNES Function norm 2.532324379596e+02
>>>>> 32 SNES Function norm 2.531842335211e+02
>>>>> 33 SNES Function norm 2.531684527520e+02
>>>>> 34 SNES Function norm 2.531637604618e+02
>>>>> 35 SNES Function norm 2.531624767821e+02
>>>>> 36 SNES Function norm 2.531621359093e+02
>>>>> 37 SNES Function norm 2.531620504925e+02
>>>>> 38 SNES Function norm 2.531620350055e+02
>>>>> 39 SNES Function norm 2.531620310522e+02
>>>>> 40 SNES Function norm 2.531620300471e+02
>>>>> 41 SNES Function norm 2.531620298084e+02
>>>>> 42 SNES Function norm 2.531620297478e+02
>>>>> 43 SNES Function norm 2.531620297324e+02
>>>>> 44 SNES Function norm 2.531620297303e+02
>>>>> 45 SNES Function norm 2.531620297302e+02
>>>>> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 45
>>>>> 0 SNES Function norm 9.636339304380e+03
>>>>> 1 SNES Function norm 8.997731184634e+03
>>>>> 2 SNES Function norm 8.120498349232e+03
>>>>> 3 SNES Function norm 7.322379894820e+03
>>>>> 4 SNES Function norm 6.599581599149e+03
>>>>> 5 SNES Function norm 6.374872854688e+03
>>>>> 6 SNES Function norm 6.372518007653e+03
>>>>> 7 SNES Function norm 6.073996314301e+03
>>>>> 8 SNES Function norm 5.635965277054e+03
>>>>> 9 SNES Function norm 5.155389064046e+03
>>>>> 10 SNES Function norm 5.080567902638e+03
>>>>> 11 SNES Function norm 5.058878643969e+03
>>>>> 12 SNES Function norm 5.058835649793e+03
>>>>> 13 SNES Function norm 5.058491285707e+03
>>>>> 14 SNES Function norm 5.057452865337e+03
>>>>> 15 SNES Function norm 5.057226140688e+03
>>>>> 16 SNES Function norm 5.056651272898e+03
>>>>> 17 SNES Function norm 5.056575190057e+03
>>>>> 18 SNES Function norm 5.056574632598e+03
>>>>> 19 SNES Function norm 5.056574520229e+03
>>>>> 20 SNES Function norm 5.056574492569e+03
>>>>> 21 SNES Function norm 5.056574485124e+03
>>>>> 22 SNES Function norm 5.056574483029e+03
>>>>> 23 SNES Function norm 5.056574482427e+03
>>>>> 24 SNES Function norm 5.056574482302e+03
>>>>> 25 SNES Function norm 5.056574482287e+03
>>>>> 26 SNES Function norm 5.056574482282e+03
>>>>> 27 SNES Function norm 5.056574482281e+03
>>>>> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 27
>>>>> SNES Object: 1 MPI processes
>>>>> type: newtonls
>>>>> maximum iterations=50, maximum function evaluations=10000
>>>>> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>>>> total number of linear solver iterations=28
>>>>> total number of function evaluations=323
>>>>> total number of grid sequence refinements=2
>>>>> SNESLineSearch Object: 1 MPI processes
>>>>> type: bt
>>>>> interpolation: cubic
>>>>> alpha=1.000000e-04
>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>>> maximum iterations=40
>>>>> KSP Object: 1 MPI processes
>>>>> type: gmres
>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>> GMRES: happy breakdown tolerance 1e-30
>>>>> maximum iterations=10000, initial guess is zero
>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using PRECONDITIONED norm type for convergence test
>>>>> PC Object: 1 MPI processes
>>>>> type: lu
>>>>> LU: out-of-place factorization
>>>>> tolerance for zero pivot 2.22045e-14
>>>>> matrix ordering: nd
>>>>> factor fill ratio given 0, needed 0
>>>>> Factored matrix follows:
>>>>> Mat Object: 1 MPI processes
>>>>> type: seqaij
>>>>> rows=15991, cols=15991
>>>>> package used to perform factorization: mumps
>>>>> total: nonzeros=255801, allocated nonzeros=255801
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> MUMPS run parameters:
>>>>> SYM (matrix type): 0
>>>>> PAR (host participation): 1
>>>>> ICNTL(1) (output for error): 6
>>>>> ICNTL(2) (output of diagnostic msg): 0
>>>>> ICNTL(3) (output for global info): 0
>>>>> ICNTL(4) (level of printing): 0
>>>>> ICNTL(5) (input mat struct): 0
>>>>> ICNTL(6) (matrix prescaling): 7
>>>>> ICNTL(7) (sequential matrix ordering): 6
>>>>> ICNTL(8) (scaling strategy): 77
>>>>> ICNTL(10) (max num of refinements): 0
>>>>> ICNTL(11) (error analysis): 0
>>>>> ICNTL(12) (efficiency control): 1
>>>>> ICNTL(13) (efficiency control): 0
>>>>> ICNTL(14) (percentage of estimated workspace increase): 20
>>>>> ICNTL(18) (input mat struct): 0
>>>>> ICNTL(19) (Schur complement info): 0
>>>>> ICNTL(20) (rhs sparse pattern): 0
>>>>> ICNTL(21) (solution struct): 0
>>>>> ICNTL(22) (in-core/out-of-core facility): 0
>>>>> ICNTL(23) (max size of memory can be allocated locally):0
>>>>> ICNTL(24) (detection of null pivot rows): 0
>>>>> ICNTL(25) (computation of a null space basis): 0
>>>>> ICNTL(26) (Schur options for rhs or solution): 0
>>>>> ICNTL(27) (experimental parameter): -8
>>>>> ICNTL(28) (use parallel or sequential ordering): 1
>>>>> ICNTL(29) (parallel ordering): 0
>>>>> ICNTL(30) (user-specified set of entries in inv(A)): 0
>>>>> ICNTL(31) (factors is discarded in the solve phase): 0
>>>>> ICNTL(33) (compute determinant): 0
>>>>> CNTL(1) (relative pivoting threshold): 0.01
>>>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08
>>>>> CNTL(3) (absolute pivoting threshold): 0
>>>>> CNTL(4) (value of static pivoting): -1
>>>>> CNTL(5) (fixation for null pivots): 0
>>>>> RINFO(1) (local estimated flops for the elimination after analysis):
>>>>> [0] 1.95838e+06
>>>>> RINFO(2) (local estimated flops for the assembly after factorization):
>>>>> [0] 143924
>>>>> RINFO(3) (local estimated flops for the elimination after factorization):
>>>>> [0] 1.95943e+06
>>>>> INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
>>>>> [0] 7
>>>>> INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
>>>>> [0] 7
>>>>> INFO(23) (num of pivots eliminated on this processor after factorization):
>>>>> [0] 15991
>>>>> RINFOG(1) (global estimated flops for the elimination after analysis): 1.95838e+06
>>>>> RINFOG(2) (global estimated flops for the assembly after factorization): 143924
>>>>> RINFOG(3) (global estimated flops for the elimination after factorization): 1.95943e+06
>>>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)
>>>>> INFOG(3) (estimated real workspace for factors on all processors after analysis): 255801
>>>>> INFOG(4) (estimated integer workspace for factors on all processors after analysis): 127874
>>>>> INFOG(5) (estimated maximum front size in the complete tree): 11
>>>>> INFOG(6) (number of nodes in the complete tree): 3996
>>>>> INFOG(7) (ordering option effectively use after analysis): 6
>>>>> INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 86
>>>>> INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 255865
>>>>> INFOG(10) (total integer space store the matrix factors after factorization): 127890
>>>>> INFOG(11) (order of largest frontal matrix after factorization): 11
>>>>> INFOG(12) (number of off-diagonal pivots): 19
>>>>> INFOG(13) (number of delayed pivots after factorization): 8
>>>>> INFOG(14) (number of memory compress after factorization): 0
>>>>> INFOG(15) (number of steps of iterative refinement after solution): 0
>>>>> INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 7
>>>>> INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 7
>>>>> INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 7
>>>>> INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 7
>>>>> INFOG(20) (estimated number of entries in the factors): 255801
>>>>> INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 7
>>>>> INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 7
>>>>> INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
>>>>> INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
>>>>> INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
>>>>> INFOG(28) (after factorization: number of null pivots encountered): 0
>>>>> INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 255865
>>>>> INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 5, 5
>>>>> INFOG(32) (after analysis: type of analysis done): 1
>>>>> INFOG(33) (value used for ICNTL(8)): 7
>>>>> INFOG(34) (exponent of the determinant if determinant is requested): 0
>>>>> linear system matrix = precond matrix:
>>>>> Mat Object: 1 MPI processes
>>>>> type: seqaij
>>>>> rows=15991, cols=15991
>>>>> total: nonzeros=223820, allocated nonzeros=431698
>>>>> total number of mallocs used during MatSetValues calls =15991
>>>>> using I-node routines: found 4000 nodes, limit used is 5
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -gideon
>>>>>
>>>>>> On Sep 7, 2015, at 8:40 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>>>
>>>>>> On Mon, Sep 7, 2015 at 7:32 PM, Gideon Simpson <gideon.simpson at gmail.com> wrote:
>>>>>> Barry,
>>>>>>
>>>>>> I finally got a chance to really try using the grid sequencing within my code. I find that, in some cases, even if it can solve successfully on the coarsest mesh, the SNES fails, usually due to a line search failure, when it tries to compute along the grid sequence. Would you have any suggestions?
>>>>>>
>>>>>> I apologize if I have asked before, but can you give me -snes_view for the solver? I could not find it in the email thread.
>>>>>>
>>>>>> I would suggest trying to fiddle with the line search, or precondition it with Richardson. It would be nice to see -snes_monitor
>>>>>> for the runs that fail, and then we can break down the residual into fields and look at it again (if my custom residual monitor
>>>>>> does not work we can write one easily). Seeing which part of the residual does not converge is key to designing the NASM
>>>>>> for the problem. I have just seen the virtuoso of this, Xiao-Chuan Cai, present it. We need better monitoring in PETSc.
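A minimal sketch (not from this thread) of the kind of per-field residual monitor being described, assuming the outer DM is a DMComposite; the function name MonitorFieldNorms is a placeholder:

#include <petscsnes.h>
#include <petscdmcomposite.h>

/* Print the 2-norm of each DMComposite sub-field of the current SNES residual. */
PetscErrorCode MonitorFieldNorms(SNES snes, PetscInt it, PetscReal fnorm, void *ctx)
{
  DM             dm;
  Vec            F, *Fsub;
  PetscInt       nDM, i;
  PetscReal      nrm;
  PetscErrorCode ierr;

  ierr = SNESGetFunction(snes, &F, NULL, NULL);CHKERRQ(ierr);        /* current residual vector */
  ierr = SNESGetDM(snes, &dm);CHKERRQ(ierr);
  ierr = DMCompositeGetNumberDM(dm, &nDM);CHKERRQ(ierr);
  ierr = PetscMalloc1(nDM, &Fsub);CHKERRQ(ierr);
  ierr = DMCompositeGetAccessArray(dm, F, nDM, NULL, Fsub);CHKERRQ(ierr);
  for (i = 0; i < nDM; i++) {
    ierr = VecNorm(Fsub[i], NORM_2, &nrm);CHKERRQ(ierr);
    ierr = PetscPrintf(PetscObjectComm((PetscObject)snes),
                       "  %D SNES field %D residual norm %g\n", it, i, (double)nrm);CHKERRQ(ierr);
  }
  ierr = DMCompositeRestoreAccessArray(dm, F, nDM, NULL, Fsub);CHKERRQ(ierr);
  ierr = PetscFree(Fsub);CHKERRQ(ierr);
  return 0;
}

It would be attached with SNESMonitorSet(snes, MonitorFieldNorms, NULL, NULL) before calling SNESSolve().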
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>> -gideon
>>>>>>
>>>>>>> On Aug 28, 2015, at 4:21 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson <gideon.simpson at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement.
>>>>>>>>
>>>>>>>> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequencing, I compute each, write it to disk, and then go on to the next one. So I now need to go back and refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid.
>>>>>>>>
>>>>>>>> The alternative would be to just script the mesh refinement in a post-processing stage, where each value of the continuation parameter is loaded on the coarse mesh and refined. Perhaps that’s the most practical thing to do.
>>>>>>>
>>>>>>> I would do the following. Create your DM and create a SNES that will do the continuation
>>>>>>>
>>>>>>> loop over continuation parameter
>>>>>>>
>>>>>>> SNESSolve(snes,NULL,Ucoarse);
>>>>>>>
>>>>>>> if (you decide you want to see the refined solution at this continuation point) {
>>>>>>> SNESCreate(comm,&snesrefine);
>>>>>>> SNESSetDM()
>>>>>>> etc
>>>>>>> SNESSetGridSequence(snesrefine,)
>>>>>>> SNESSolve(snesrefine,0,Ucoarse);
>>>>>>> SNESGetSolution(snesrefine,&Ufine);
>>>>>>> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point
>>>>>>> SNESDestroy(snesrefine);
>>>>>>> end if
>>>>>>>
>>>>>>> end loop over continuation parameter.
>>>>>>>
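Fleshed out, that pattern might look roughly like the following sketch; the function name ContinuationWithRefinement, the continuation-parameter update, the residual/Jacobian setup on snesrefine, the number of refinements, and the VecView output are placeholders for whatever the application already does:

#include <petscsnes.h>

PetscErrorCode ContinuationWithRefinement(DM dm, SNES snes, Vec Ucoarse, PetscInt nsteps)
{
  MPI_Comm       comm;
  PetscInt       i;
  PetscErrorCode ierr;

  ierr = PetscObjectGetComm((PetscObject)snes, &comm);CHKERRQ(ierr);
  for (i = 0; i < nsteps; i++) {
    /* update the continuation parameter p_i here (application specific) */
    ierr = SNESSolve(snes, NULL, Ucoarse);CHKERRQ(ierr);              /* coarse-mesh solve */

    /* if the refined solution is wanted at this continuation point: */
    {
      SNES snesrefine;
      Vec  Ufine;

      ierr = SNESCreate(comm, &snesrefine);CHKERRQ(ierr);
      ierr = SNESSetDM(snesrefine, dm);CHKERRQ(ierr);
      /* set the same residual/Jacobian routines on snesrefine as on snes */
      ierr = SNESSetGridSequence(snesrefine, 2);CHKERRQ(ierr);        /* e.g. two refinements */
      ierr = SNESSetFromOptions(snesrefine);CHKERRQ(ierr);
      ierr = SNESSolve(snesrefine, NULL, Ucoarse);CHKERRQ(ierr);      /* starts from the coarse solution */
      ierr = SNESGetSolution(snesrefine, &Ufine);CHKERRQ(ierr);
      ierr = VecView(Ufine, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); /* or write Ufine to disk */
      ierr = SNESDestroy(&snesrefine);CHKERRQ(ierr);
    }
  }
  return 0;
}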
>>>>>>> Barry
>>>>>>>
>>>>>>>>
>>>>>>>> -gideon
>>>>>>>>
>>>>>>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 3. This problem is actually part of a continuation problem that roughly looks like this
>>>>>>>>>>
>>>>>>>>>> for( continuation parameter p = 0 to 1){
>>>>>>>>>>
>>>>>>>>>> solve with parameter p_i using solution from p_{i-1},
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh.
>>>>>>>>>
>>>>>>>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check whether you even need the continuation (that is, whether you can solve the coarse problem at all without it).
>>>>>>>>>
>>>>>>>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work:
>>>>>>>>>
>>>>>>>>> Do not use -snes_grid_sequencing
>>>>>>>>>
>>>>>>>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh.
>>>>>>>>>
>>>>>>>>> Call SNESSetGridSequence()
>>>>>>>>>
>>>>>>>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc.
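In code, that sequence of calls would look roughly like the sketch below; snes, U, nsteps, and nlevels stand for whatever the application already has, the function name ContinuationThenRefine is hypothetical, and -snes_grid_sequencing is assumed not to be on the command line:

#include <petscsnes.h>

PetscErrorCode ContinuationThenRefine(SNES snes, Vec U, PetscInt nsteps, PetscInt nlevels)
{
  PetscInt       i;
  PetscErrorCode ierr;

  for (i = 0; i < nsteps; i++) {
    /* update the continuation parameter p_i here */
    ierr = SNESSolve(snes, NULL, U);CHKERRQ(ierr);          /* all coarse-mesh solves */
  }
  ierr = SNESSetGridSequence(snes, nlevels);CHKERRQ(ierr);
  ierr = SNESSolve(snes, NULL, U);CHKERRQ(ierr);            /* one coarse solve, then interpolate and solve on finer levels */
  return 0;
}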
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>> -- Norbert Wiener
>>>>>
>>>>
>>>
>>
>