<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 17, 2019 at 8:07 AM Mark Lohry <<a href="mailto:mlohry@gmail.com">mlohry@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>So with many fewer levels, are you saying</div><div><br></div><div>  a) It takes more iterates?</div><div><br></div><div>  b) It takes the same wall clock time?</div></blockquote><div><br></div><div>Slightly more iterates but at roughly the same wall clock time. Only did a short test but the runtime difference looks like it was in the noise.</div></div></blockquote><div><br></div><div>That is not surprising. Your coarse grids were a lot smaller than your fine grid. These stagnated grids were really sparse so they are cheap. It is just not a good idea to have a crazy grid hierarchy if you can help it.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div> <div>I think you might want to switch to beefier smoothers on those lower levels if you see</div><div>more iterates.</div></div></blockquote><div><br></div><div>I was thinking the same. I just did a quick run with 2 smoother iterates per level instead of 1 and got maybe 20% performance benefit, so I'll play with that a bit more. I figure ILU(0) is already a pretty beefy smoother here especially because of the very large blocks; ILU(1) is out because of memory consumption, unless I only do it on the coarsened levels. 
On much stiffer problems I saw considerable benefit from doing gmres+ILU(0) for 5 iterations per level, so I'll give that a shot. </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 17, 2019 at 6:48 AM Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Thu, Oct 17, 2019 at 6:22 AM Mark Lohry via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>Hi Mark, <br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>I assume these are advection problems and smoothed aggregation does not work well.</div></blockquote><div><br></div><div>Correct, it stagnates immediately with smoothed aggregation.</div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>I think '-pc_gamg_square_graph 20' should reduce the number of levels and work better for you.</div></blockquote><div> </div><div>On the big problem it's producing 20 levels without -pc_gamg_square_graph 20; with that on it produces 6 levels. It certainly has less of the near-identical-size coarse levels, but overall convergence time is roughly the same. 
Any suggestion of where to go from here?</div></div></blockquote><div><br></div><div>So with many fewer levels, are you saying</div><div><br></div><div>  a) It takes more iterates?</div><div><br></div><div>  b) It takes the same wall clock time?</div><div><br></div><div>I think you might want to switch to beefier smoothers on those lower levels if you see</div><div>more iterates. Mark?</div><div><br></div><div>  Thanks,</div><div><br></div><div>    Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Original setup without pc_gamg_square_graph 20:</div><div><br></div><div>[0] PCSetUp_GAMG(): level 0) N=347149550, n data rows=5, n data cols=5, nnz/row (ave)=250, np=1920<br>[0] PCGAMGFilterGraph():    100.% nnz after filtering, with threshold 0., 50. nnz ave. (N=69429910)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square<br>[0] PCGAMGProlongator_AGG(): New grid 894786 nodes<br>[0] PCSetUp_GAMG(): 1) N=4473930, n data cols=5, nnz/row (ave)=51, 1920 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 10.1761 nnz ave. (N=894786)<br>[0] PCGAMGProlongator_AGG(): New grid 184262 nodes<br>[0] PCSetUp_GAMG(): 2) N=921310, n data cols=5, nnz/row (ave)=68, 1920 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 13.0556 nnz ave. (N=184262)<br>[0] PCGAMGProlongator_AGG(): New grid 41002 nodes<br>[0] PCSetUp_GAMG(): 3) N=205010, n data cols=5, nnz/row (ave)=72, 1920 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 10.0909 nnz ave. 
(N=41002)<br>[0] PCGAMGProlongator_AGG(): New grid 12587 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 20 with simple aggregation<br>[0] PCSetUp_GAMG(): 4) N=62935, n data cols=5, nnz/row (ave)=62, 960 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 5.33333 nnz ave. (N=12587)<br>[0] PCGAMGProlongator_AGG(): New grid 5811 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 40 with simple aggregation<br>[0] PCSetUp_GAMG(): 5) N=29055, n data cols=5, nnz/row (ave)=50, 640 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 3.8 nnz ave. (N=5811)<br>[0] PCGAMGProlongator_AGG(): New grid 3442 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 110 with simple aggregation<br>[0] PCSetUp_GAMG(): 6) N=17210, n data cols=5, nnz/row (ave)=40, 320 active pes<br>[0] PCGAMGFilterGraph():   100.% nnz after filtering, with threshold 0., 4.66176 nnz ave. (N=3442)<br>[0] PCGAMGProlongator_AGG(): New grid 2365 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 275 with simple aggregation<br>[0] PCSetUp_GAMG(): 7) N=11825, n data cols=5, nnz/row (ave)=34, 240 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 4.961 nnz ave. (N=2365)<br>[0] PCGAMGProlongator_AGG(): New grid 1792 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 1125 with simple aggregation<br>[0] PCSetUp_GAMG(): 8) N=8960, n data cols=5, nnz/row (ave)=28, 192 active pes<br>[0] PCGAMGFilterGraph():         100.% nnz after filtering, with threshold 0., 5.79911 nnz ave. (N=1792)<br>[0] PCGAMGProlongator_AGG(): New grid 1479 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 7395 with simple aggregation<br>[0] PCSetUp_GAMG(): 9) N=7395, n data cols=5, nnz/row (ave)=24, 160 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 4.86883 nnz ave. 
(N=1479)<br>[0] PCGAMGProlongator_AGG(): New grid 1378 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 6890 with simple aggregation<br>[0] PCSetUp_GAMG(): 10) N=6890, n data cols=5, nnz/row (ave)=22, 128 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 4.44702 nnz ave. (N=1378)<br>[0] PCGAMGProlongator_AGG(): New grid 1210 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 6050 with simple aggregation<br>[0] PCSetUp_GAMG(): 11) N=6050, n data cols=5, nnz/row (ave)=18, 120 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 3.64298 nnz ave. (N=1210)<br>[0] PCGAMGProlongator_AGG(): New grid 1185 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=120, neq(loc)=5925<br>[0] PCSetUp_GAMG(): 12) N=5925, n data cols=5, nnz/row (ave)=17, 120 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 3.54177 nnz ave. (N=1185)<br>[0] PCGAMGProlongator_AGG(): New grid 1165 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=120, neq(loc)=5825<br>[0] PCSetUp_GAMG(): 13) N=5825, n data cols=5, nnz/row (ave)=17, 120 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 3.5133 nnz ave. (N=1165)<br>[0] PCGAMGProlongator_AGG(): New grid 1137 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=120, neq(loc)=5685<br>[0] PCSetUp_GAMG(): 14) N=5685, n data cols=5, nnz/row (ave)=17, 120 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 3.48021 nnz ave. (N=1137)<br>[0] PCGAMGProlongator_AGG(): New grid 1097 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=120, neq(loc)=5485<br>[0] PCSetUp_GAMG(): 15) N=5485, n data cols=5, nnz/row (ave)=16, 120 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 3.3938 nnz ave. 
(N=1097)<br>[0] PCGAMGProlongator_AGG(): New grid 1088 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=120, neq(loc)=5440<br>[0] PCSetUp_GAMG(): 16) N=5440, n data cols=5, nnz/row (ave)=16, 120 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 3.34375 nnz ave. (N=1088)<br>[0] PCGAMGProlongator_AGG(): New grid 852 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 4260 with simple aggregation<br>[0] PCSetUp_GAMG(): 17) N=4260, n data cols=5, nnz/row (ave)=15, 80 active pes<br>[0] PCGAMGFilterGraph():        100.% nnz after filtering, with threshold 0., 3.06103 nnz ave. (N=852)<br>[0] PCGAMGProlongator_AGG(): New grid 848 nodes<br>[0] PCSetUp_GAMG(): 18) N=4240, n data cols=5, nnz/row (ave)=15, 80 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 3.0566 nnz ave. (N=848)<br>[0] PCGAMGProlongator_AGG(): New grid 3 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 15 with simple aggregation<br>[0] PCSetUp_GAMG(): 19) N=15, n data cols=5, nnz/row (ave)=11, 1 active pes<br>[0] PCSetUp_GAMG(): 20 levels, grid complexity = 1.00367</div><div><br></div><div>With pc_gamg_square_graph 20:<br></div><div><br></div><div><br></div><div>[0] PCSetUp_GAMG(): level 0) N=347149550, n data rows=5, n data cols=5, nnz/row (ave)=250, np=1920</div>[0] PCGAMGFilterGraph():   100.% nnz after filtering, with threshold 0., 50. nnz ave. (N=69429910)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 20 to square<br>[0] PCGAMGProlongator_AGG(): New grid 894786 nodes<br>[0] PCSetUp_GAMG(): 1) N=4473930, n data cols=5, nnz/row (ave)=51, 1920 active pes<br>[0] PCGAMGFilterGraph():     100.% nnz after filtering, with threshold 0., 10.1761 nnz ave. 
(N=894786)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 2 of 20 to square<br>[0] PCGAMGProlongator_AGG(): New grid 49106 nodes<br>[0] PCSetUp_GAMG(): 2) N=245530, n data cols=5, nnz/row (ave)=80, 1920 active pes<br>[0] PCGAMGFilterGraph():     100.% nnz after filtering, with threshold 0., 14.8 nnz ave. (N=49106)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 3 of 20 to square<br>[0] PCGAMGProlongator_AGG(): New grid 1646 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 0 with simple aggregation<br>[0] PCSetUp_GAMG(): 3) N=8230, n data cols=5, nnz/row (ave)=86, 160 active pes<br>[0] PCGAMGFilterGraph():          100.% nnz after filtering, with threshold 0., 11.5 nnz ave. (N=1646)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 4 of 20 to square<br>[0] PCGAMGProlongator_AGG(): New grid 56 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 0 with simple aggregation<br>[0] PCSetUp_GAMG(): 4) N=280, n data cols=5, nnz/row (ave)=62, 6 active pes<br>[0] PCGAMGFilterGraph():        100.% nnz after filtering, with threshold 0., 12.5714 nnz ave. (N=56)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 5 of 20 to square<br>[0] PCGAMGProlongator_AGG(): New grid 4 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 20 with simple aggregation<br>[0] PCSetUp_GAMG(): 5) N=20, n data cols=5, nnz/row (ave)=17, 1 active pes<br>[0] PCSetUp_GAMG(): 6 levels, grid complexity = 1.00291<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 16, 2019 at 9:46 PM Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">The block size refers to the number of dofs/vertex, so you want 5. (I have no idea what is going on with block size set to 20).<div><br></div><div>This is better but also smaller. 
10 levels is a lot of levels.</div><div><br></div><div>This is unsmoothed aggregation. I assume these are advection problems and smoothed aggregation does not work well. This is not in my wheelhouse. I think '-pc_gamg_square_graph 20' should reduce the number of levels and work better for you.</div><div><br></div><div>Thanks,</div><div>Mark</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 16, 2019 at 8:59 PM Mark Lohry <<a href="mailto:mlohry@gmail.com" target="_blank">mlohry@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi Mark, are you referring to how on the coarser levels the coarsening rate seems to nearly flatline? i.e. level 2 has 4,260 rows while level 1 has 4,240 rows? I was curious about that too...</div><div><br></div><div>Not sure if this is the cause, but I have gone back and forth on what blocksize I set; I'm doing high order elements with 5 coupled equations, so the true block size in that case is 50x50. For that I had played with setting block size to either 5 (number of equations) or 50 (actual block size) and seemed to have seen a meager 20% improvement with the block size at 5, so I kind of left it there.</div><div><br></div><div>Running a much smaller variant of the same problem at lower order (block size 20 instead of 50), the -info grep you asked for is below. 
I'll get -info for the much larger case but it'll take a couple days.</div><div><br></div><div>For options I'm running <br>-snes_lag_jacobian 10000 -ksp_gmres_restart 100 -pc_gamg_agg_nsmooths 0 -mg_levels_ksp_type richardson -mg_levels_pc_type asm -mg_levels_ksp_max_it 1 <br>-pc_mg_cycle_type v -snes_linesearch_type bt -snes_linesearch_order 3 <br>-snes_linesearch_monitor -mg_levels_sub_pc_factor_in_place true -info</div><div><br></div><div><br></div><div>block size 5 :</div><div><br></div><div>[0] PCSetUp_GAMG(): level 0) N=2006480, n data rows=5, n data cols=5, nnz/row (ave)=100, np=16<br>[0] PCGAMGFilterGraph():          100.% nnz after filtering, with threshold 0., 20. nnz ave. (N=401296)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square<br>[0] PCGAMGProlongator_AGG(): New grid 12947 nodes<br>[0] PCSetUp_GAMG(): 1) N=64735, n data cols=5, nnz/row (ave)=51, 16 active pes<br>[0] PCGAMGFilterGraph():     100.% nnz after filtering, with threshold 0., 10.3351 nnz ave. (N=12947)<br>[0] PCGAMGProlongator_AGG(): New grid 2671 nodes<br>[0] PCSetUp_GAMG(): 2) N=13355, n data cols=5, nnz/row (ave)=66, 16 active pes<br>[0] PCGAMGFilterGraph():    100.% nnz after filtering, with threshold 0., 12.5524 nnz ave. (N=2671)<br>[0] PCGAMGProlongator_AGG(): New grid 598 nodes<br>[0] PCSetUp_GAMG(): 3) N=2990, n data cols=5, nnz/row (ave)=65, 16 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 12.7727 nnz ave. (N=598)<br>[0] PCGAMGProlongator_AGG(): New grid 178 nodes<br>[0] PCSetUp_GAMG(): 4) N=890, n data cols=5, nnz/row (ave)=52, 16 active pes<br>[0] PCGAMGFilterGraph():         100.% nnz after filtering, with threshold 0., 8.28571 nnz ave. 
(N=178)<br>[0] PCGAMGProlongator_AGG(): New grid 80 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 30 with simple aggregation<br>[0] PCSetUp_GAMG(): 5) N=400, n data cols=5, nnz/row (ave)=34, 8 active pes<br>[0] PCGAMGFilterGraph():       100.% nnz after filtering, with threshold 0., 5.77778 nnz ave. (N=80)<br>[0] PCGAMGProlongator_AGG(): New grid 50 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 30 with simple aggregation<br>[0] PCSetUp_GAMG(): 6) N=250, n data cols=5, nnz/row (ave)=25, 4 active pes<br>[0] PCGAMGFilterGraph():        100.% nnz after filtering, with threshold 0., 4.76923 nnz ave. (N=50)<br>[0] PCGAMGProlongator_AGG(): New grid 36 nodes<br>[0] PCSetUp_GAMG(): 7) N=180, n data cols=5, nnz/row (ave)=18, 4 active pes<br>[0] PCGAMGFilterGraph():    100.% nnz after filtering, with threshold 0., 3.75 nnz ave. (N=36)<br>[0] PCGAMGProlongator_AGG(): New grid 33 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=4, neq(loc)=90<br>[0] PCSetUp_GAMG(): 8) N=165, n data cols=5, nnz/row (ave)=18, 4 active pes<br>[0] PCGAMGFilterGraph():     100.% nnz after filtering, with threshold 0., 3.72222 nnz ave. (N=33)<br>[0] PCGAMGProlongator_AGG(): New grid 8 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 35 with simple aggregation<br>[0] PCSetUp_GAMG(): 9) N=40, n data cols=5, nnz/row (ave)=15, 1 active pes<br>[0] PCSetUp_GAMG(): 10 levels, grid complexity = 1.02237</div><div><br></div><div><br></div><div><br></div><div>block size 20:</div><div><br></div><div>[0] PCSetUp_GAMG(): level 0) N=2006480, n data rows=20, n data cols=20, nnz/row (ave)=100, np=16<br>[0] PCGAMGFilterGraph():        100.% nnz after filtering, with threshold 0., 5. nnz ave. 
(N=100324)<br>[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square<br>[0] PCGAMGProlongator_AGG(): New grid 12948 nodes<br>[0] PCSetUp_GAMG(): 1) N=258960, n data cols=20, nnz/row (ave)=205, 16 active pes<br>[0] PCGAMGFilterGraph():   100.% nnz after filtering, with threshold 0., 10.2857 nnz ave. (N=12948)<br>[0] PCGAMGProlongator_AGG(): New grid 2671 nodes<br>[0] PCSetUp_GAMG(): 2) N=53420, n data cols=20, nnz/row (ave)=266, 16 active pes<br>[0] PCGAMGFilterGraph():          100.% nnz after filtering, with threshold 0., 12.5548 nnz ave. (N=2671)<br>[0] PCGAMGProlongator_AGG(): New grid 593 nodes<br>[0] PCSetUp_GAMG(): 3) N=11860, n data cols=20, nnz/row (ave)=264, 16 active pes<br>[0] PCGAMGFilterGraph():    100.% nnz after filtering, with threshold 0., 10.8519 nnz ave. (N=593)<br>[0] PCGAMGProlongator_AGG(): New grid 181 nodes<br>[0] PCSetUp_GAMG(): 4) N=3620, n data cols=20, nnz/row (ave)=214, 16 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 8.375 nnz ave. (N=181)<br>[0] PCGAMGProlongator_AGG(): New grid 79 nodes<br>[0] PCSetUp_GAMG(): 5) N=1580, n data cols=20, nnz/row (ave)=164, 16 active pes<br>[0] PCGAMGFilterGraph():         100.% nnz after filtering, with threshold 0., 8. nnz ave. (N=79)<br>[0] PCGAMGProlongator_AGG(): New grid 43 nodes<br>[0] PCSetUp_GAMG(): 6) N=860, n data cols=20, nnz/row (ave)=100, 16 active pes<br>[0] PCGAMGFilterGraph():      100.% nnz after filtering, with threshold 0., 5. nnz ave. (N=43)<br>[0] PCGAMGProlongator_AGG(): New grid 15 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 20 with simple aggregation<br>[0] PCSetUp_GAMG(): 7) N=300, n data cols=20, nnz/row (ave)=81, 8 active pes<br>[0] PCGAMGFilterGraph():    100.% nnz after filtering, with threshold 0., 2.66667 nnz ave. 
(N=15)<br>[0] PCGAMGProlongator_AGG(): New grid 1 nodes<br>[0] PCGAMGCreateLevel_GAMG(): Number of equations (loc) 0 with simple aggregation<br>[0] PCSetUp_GAMG(): 8) N=20, n data cols=20, nnz/row (ave)=20, 1 active pes<br>[0] PCSetUp_GAMG(): HARD stop of coarsening on level 7.  Grid too small: 1 block nodes<br>[0] PCSetUp_GAMG(): 9 levels, grid complexity = 1.35745<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 16, 2019 at 5:12 PM Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thanks Barry,<br><div>Sorry I missed this.</div><div>Mark: this problem is going crazy. The (default) coarsening parameters are terrible for you. Can you run with -info, grep for GAMG, and send that? And please send me the gamg parameters that you are using.</div><div>Thanks,</div><div>Mark</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 16, 2019 at 9:01 AM Smith, Barry F. via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

barry/2019-10-15/bug-gamg-complexity/maint  <a href="https://gitlab.com/petsc/petsc/merge_requests/2179" rel="noreferrer" target="_blank">https://gitlab.com/petsc/petsc/merge_requests/2179</a><br>

<br>

<br>

<br>

> On Oct 16, 2019, at 5:29 AM, Mark Lohry <<a href="mailto:mlohry@gmail.com" target="_blank">mlohry@gmail.com</a>> wrote:<br>

> <br>

> Well that was a quick late night bug fix. Thanks Barry, I'll try it out.<br>

> <br>

> Just to confirm: You are running with default double precision numbers and have used the configure option --with-64-bit-indices ?<br>

> <br>

> Double precision floats, but 32 bit indices. I realize I'm playing with fire here, but I'm bumping very close to available memory limits at this scale and 64 bit indices tips me over. I figure integer index overflows would probably show a catastrophic failure, but all output looks sane.<br>

> <br>

> I see you are using MATMFFD as the operator and MPIAIJ as the matrix from which to build the preconditioner? This is not supposed to cause any difficulties since the complexity computation code uses the second matrix, that is the MPIAIJ matrix, to get the complexity information.<br>

> <br>

> Right, I'm using MATMFFD for the operator, and using a snes_lag_jacobian with SNESComputeJacobianDefaultColor for the matrix used to build the preconditioner. The actual behavior is exactly what I'd expect from smaller runs and the results look good, so it sounds like what you describe.<br>

> <br>

> On Wed, Oct 16, 2019 at 12:17 AM Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>> wrote:<br>

> <br>

>    I think I now see the bug: the code declares PetscInt lev, nnz0 = -1; so nnz0 will overflow. It should be using PetscLogDouble for nnz0.<br>

> <br>

>   You can try changing that one place in the code and see that it now prints a reasonable value for complexity. <br>

> <br>

>   I will prepare an MR for maint to fix the bug permanently.<br>

> <br>

>   Barry<br>

> <br>

> <br>

> static PetscErrorCode PCMGGetGridComplexity(PC pc, PetscReal *gc)<br>

> {<br>

>   PetscErrorCode ierr;<br>

>   PC_MG          *mg      = (PC_MG*)pc->data;<br>

>   PC_MG_Levels   **mglevels = mg->levels;<br>

>   PetscInt       lev, nnz0 = -1;<br>

>   MatInfo        info;<br>

>   PetscFunctionBegin;<br>

>   if (!mg->nlevels) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_PLIB,"MG has no levels");<br>

>   for (lev=0, *gc=0; lev<mg->nlevels; lev++) {<br>

>     Mat dB;<br>

>     ierr = KSPGetOperators(mglevels[lev]->smoothd,NULL,&dB);CHKERRQ(ierr);<br>

>     ierr = MatGetInfo(dB,MAT_GLOBAL_SUM,&info);CHKERRQ(ierr); /* global reduction */<br>

>     *gc += (PetscReal)info.nz_used;<br>

>     if (lev==mg->nlevels-1) nnz0 = info.nz_used;<br>

>   }<br>

>   if (nnz0) *gc /= (PetscReal)nnz0;<br>

>   else *gc = 0;<br>

>   PetscFunctionReturn(0);<br>

> }<br>
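For readers following the fix, here is a minimal standalone sketch of the same computation with the counts kept in double throughout. It is not PETSc code: the helper names fits_in_int32 and grid_complexity are hypothetical, and plain double stands in for PetscLogDouble. Converting an out-of-range double to a 32-bit integer is undefined behavior in C, which may be how a negative complexity such as -40.5483 arises when PetscInt is 32 bits wide.<br>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when a nonzero count held in a double
 * can be stored in a signed 32-bit integer, 0 otherwise. */
static int fits_in_int32(double count)
{
  return count >= (double)INT32_MIN && count <= (double)INT32_MAX;
}

/* Hypothetical helper: sum of nonzeros over all levels divided by the
 * nonzeros on the finest level. nnz[nlevels-1] is taken as the finest
 * level, mirroring the loop in the quoted function. Every quantity stays
 * in double, so a large count is never narrowed to an integer type. */
static double grid_complexity(const double *nnz, int nlevels)
{
  double total = 0.0;
  int    lev;

  assert(nlevels > 0);
  for (lev = 0; lev < nlevels; lev++) total += nnz[lev];
  return nnz[nlevels - 1] != 0.0 ? total / nnz[nlevels - 1] : 0.0;
}
```

With the figures from the -info output in this thread (N=347149550 rows at roughly 250 nonzeros per row on the finest level), the row count passes fits_in_int32 but the nonzero count, about 8.7e10, does not.<br>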

> <br>

> <br>

> <br>

> > On Oct 15, 2019, at 11:11 PM, Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>> wrote:<br>

> > <br>

> > <br>

> >   Mark,<br>

> > <br>

> >   It may be caused by some overflow in the calculations somewhere due to your very large sizes and nonzeros but I could not see anything based on a quick inspection of the code. We seem to use double to store the counts which normally would be more than sufficient to hold the results without overflow. Unless somewhere there is a mistaken use of int that causes a problem.<br>

> > <br>

> >   Just to confirm: You are running with default double precision numbers and have used the configure option --with-64-bit-indices ? <br>

> > <br>

> >   I see you are using MATMFFD as the operator and MPIAIJ as the matrix from which to build the preconditioner? This is not supposed to cause any difficulties since the complexity computation code uses the second matrix, that is the MPIAIJ matrix, to get the complexity information. <br>

> > <br>

> >   There is definitely a bug but I am hard pressed to suggest how to find it since it seems only to be expressed in your giant runs. <br>

> > <br>

> >  Barry<br>

> > <br>

> > <br>

> > <br>

> > <br>

> > <br>

> >> On Oct 15, 2019, at 9:16 PM, Mark Lohry via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>

> >> <br>

> >> I'm running some larger unsteady problems and trying to eke out some better GAMG performance. As is, at very small time steps, ASM preconditioner with ILU(0) is maybe 20% more efficient than my naive GAMG setup, which gives me hope that some tuning of GAMG can give some advantage. Convergence overall seems quite good, and light years better than ASM/ILU at larger time steps.<br>

> >> <br>

> >> Looking through the manual, I see a note that "grid complexity should be well under 2.0 and preferably around 1.3 or lower". I check ksp_view and see:<br>

> >> Complexity:    grid = -40.5483<br>

> >> <br>

> >> Is something funny happening here?<br>

> >> <br>

> >> Pasting whole -ksp_view below:<br>

> >> <br>

> >> KSP Object: 1920 MPI processes<br>

> >>  type: fgmres<br>

> >>    restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>

> >>    happy breakdown tolerance 1e-30<br>

> >>  maximum iterations=30, initial guess is zero<br>

> >>  tolerances:  relative=0.0001, absolute=1e-06, divergence=10.<br>

> >>  right preconditioning<br>

> >>  using UNPRECONDITIONED norm type for convergence test<br>

> >> PC Object: 1920 MPI processes<br>

> >>  type: gamg<br>

> >>    type is MULTIPLICATIVE, levels=20 cycles=v<br>

> >>      Cycles per PCApply=1<br>

> >>      Using externally compute Galerkin coarse grid matrices<br>

> >>      GAMG specific options<br>

> >>        Threshold for dropping small values in graph on each level =   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  <br>

> >>        Threshold scaling factor for each level not specified = 1.<br>

> >>        AGG specific options<br>

> >>          Symmetric graph false<br>

> >>          Number of levels to square graph 1<br>

> >>          Number smoothing steps 0<br>

> >>        Complexity:    grid = -40.5483<br>

> >>  Coarse grid solver -- level -------------------------------<br>

> >>    KSP Object: (mg_coarse_) 1920 MPI processes<br>

> >>      type: preonly<br>

> >>      maximum iterations=10000, initial guess is zero<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_coarse_) 1920 MPI processes<br>

> >>      type: bjacobi<br>

> >>        number of blocks = 1920<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_coarse_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=1, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_coarse_sub_) 1 MPI processes<br>

> >>        type: lu<br>

> >>          out-of-place factorization<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]<br>

> >>          matrix ordering: nd<br>

> >>          factor fill ratio given 5., needed 1.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=15, cols=15, bs=5<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=175, allocated nonzeros=175<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 3 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=15, cols=15, bs=5<br>

> >>          total: nonzeros=175, allocated nonzeros=175<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 3 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=15, cols=15, bs=5<br>

> >>        total: nonzeros=175, allocated nonzeros=175<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 3 nodes, limit used is 5<br>

> >>  Down solver (pre-smoother) on level 1 -------------------------------<br>

> >>    KSP Object: (mg_levels_1_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_1_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_1_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_1_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=4240, cols=4240<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=64800, allocated nonzeros=64800<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 848 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=4240, cols=4240<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=64800, allocated nonzeros=64800<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 848 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=4240, cols=4240, bs=5<br>

> >>        total: nonzeros=64800, allocated nonzeros=64800<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 848 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 2 -------------------------------<br>

> >>    KSP Object: (mg_levels_2_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_2_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_2_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_2_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=4260, cols=4260<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=65200, allocated nonzeros=65200<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 852 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=4260, cols=4260<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=65200, allocated nonzeros=65200<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 852 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=4260, cols=4260, bs=5<br>

> >>        total: nonzeros=65200, allocated nonzeros=65200<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 852 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 3 -------------------------------<br>

> >>    KSP Object: (mg_levels_3_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_3_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_3_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_3_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=5440, cols=5440<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=90950, allocated nonzeros=90950<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1088 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=5440, cols=5440<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=90950, allocated nonzeros=90950<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1088 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=5440, cols=5440, bs=5<br>

> >>        total: nonzeros=90950, allocated nonzeros=90950<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 1088 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 4 -------------------------------<br>

> >>    KSP Object: (mg_levels_4_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_4_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_4_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_4_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=5485, cols=5485<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=93075, allocated nonzeros=93075<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1097 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=5485, cols=5485<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=93075, allocated nonzeros=93075<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1097 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=5485, cols=5485, bs=5<br>

> >>        total: nonzeros=93075, allocated nonzeros=93075<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 1097 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 5 -------------------------------<br>

> >>    KSP Object: (mg_levels_5_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_5_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_5_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_5_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=5685, cols=5685<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=98925, allocated nonzeros=98925<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1137 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=5685, cols=5685<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=98925, allocated nonzeros=98925<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1137 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=5685, cols=5685, bs=5<br>

> >>        total: nonzeros=98925, allocated nonzeros=98925<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 1137 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 6 -------------------------------<br>

> >>    KSP Object: (mg_levels_6_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_6_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_6_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_6_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=5825, cols=5825<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=102325, allocated nonzeros=102325<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1165 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=5825, cols=5825<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=102325, allocated nonzeros=102325<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1165 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=5825, cols=5825, bs=5<br>

> >>        total: nonzeros=102325, allocated nonzeros=102325<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 1165 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 7 -------------------------------<br>

> >>    KSP Object: (mg_levels_7_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_7_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_7_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_7_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=5925, cols=5925<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=104925, allocated nonzeros=104925<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1185 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=5925, cols=5925<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=104925, allocated nonzeros=104925<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1185 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=5925, cols=5925, bs=5<br>

> >>        total: nonzeros=104925, allocated nonzeros=104925<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 1185 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 8 -------------------------------<br>

> >>    KSP Object: (mg_levels_8_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_8_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_8_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_8_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=6050, cols=6050<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=110200, allocated nonzeros=110200<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1210 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=6050, cols=6050<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=110200, allocated nonzeros=110200<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1210 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=6050, cols=6050, bs=5<br>

> >>        total: nonzeros=110200, allocated nonzeros=110200<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 1210 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 9 -------------------------------<br>

> >>    KSP Object: (mg_levels_9_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_9_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_9_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_9_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=6890, cols=6890<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=153200, allocated nonzeros=153200<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1378 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=6890, cols=6890<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=153200, allocated nonzeros=153200<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1378 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=6890, cols=6890, bs=5<br>

> >>        total: nonzeros=153200, allocated nonzeros=153200<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 1378 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 10 -------------------------------<br>

> >>    KSP Object: (mg_levels_10_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_10_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_10_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_10_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=7395, cols=7395<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=180025, allocated nonzeros=180025<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1479 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=7395, cols=7395<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=180025, allocated nonzeros=180025<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1479 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=7395, cols=7395, bs=5<br>

> >>        total: nonzeros=180025, allocated nonzeros=180025<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 1479 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 11 -------------------------------<br>

> >>    KSP Object: (mg_levels_11_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_11_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_11_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_11_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=8960, cols=8960<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=259800, allocated nonzeros=259800<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 1792 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=8960, cols=8960<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=259800, allocated nonzeros=259800<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 1792 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=8960, cols=8960, bs=5<br>

> >>        total: nonzeros=259800, allocated nonzeros=259800<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 1792 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 12 -------------------------------<br>

> >>    KSP Object: (mg_levels_12_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_12_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_12_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_12_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=1795, cols=1795<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=33275, allocated nonzeros=33275<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 359 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=1795, cols=1795<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=33275, allocated nonzeros=33275<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 359 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=11825, cols=11825, bs=5<br>

> >>        total: nonzeros=403125, allocated nonzeros=403125<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 359 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 13 -------------------------------<br>

> >>    KSP Object: (mg_levels_13_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_13_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_13_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_13_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=340, cols=340<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=3500, allocated nonzeros=3500<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 68 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=340, cols=340<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=3500, allocated nonzeros=3500<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 68 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=17210, cols=17210, bs=5<br>

> >>        total: nonzeros=696850, allocated nonzeros=696850<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 68 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 14 -------------------------------<br>

> >>    KSP Object: (mg_levels_14_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_14_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_14_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_14_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=125, cols=125<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=625, allocated nonzeros=625<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 25 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=125, cols=125<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=625, allocated nonzeros=625<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 25 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=29055, cols=29055, bs=5<br>

> >>        total: nonzeros=1475675, allocated nonzeros=1475675<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 25 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 15 -------------------------------<br>

> >>    KSP Object: (mg_levels_15_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_15_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_15_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_15_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=45, cols=45<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=225, allocated nonzeros=225<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 9 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=45, cols=45<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=225, allocated nonzeros=225<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 9 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=62935, cols=62935, bs=5<br>

> >>        total: nonzeros=3939025, allocated nonzeros=3939025<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 9 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 16 -------------------------------<br>

> >>    KSP Object: (mg_levels_16_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_16_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_16_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_16_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=55, cols=55<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=725, allocated nonzeros=725<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 11 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=55, cols=55<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=725, allocated nonzeros=725<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 11 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=205010, cols=205010, bs=5<br>

> >>        total: nonzeros=14780300, allocated nonzeros=14780300<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using scalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 11 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 17 -------------------------------<br>

> >>    KSP Object: (mg_levels_17_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_17_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_17_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_17_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=360, cols=360<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=14350, allocated nonzeros=14350<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 72 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=360, cols=360<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=14350, allocated nonzeros=14350<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 72 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=921310, cols=921310, bs=5<br>

> >>        total: nonzeros=63203300, allocated nonzeros=63203300<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using scalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 72 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 18 -------------------------------<br>

> >>    KSP Object: (mg_levels_18_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_18_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_18_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_18_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=2130, cols=2130<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=87950, allocated nonzeros=87950<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 426 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=2130, cols=2130<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=87950, allocated nonzeros=87950<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 426 nodes, limit used is 5<br>

> >>      linear system matrix = precond matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=4473930, cols=4473930, bs=5<br>

> >>        total: nonzeros=232427300, allocated nonzeros=232427300<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using nonscalable MatPtAP() implementation<br>

> >>          using I-node (on process 0) routines: found 426 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  Down solver (pre-smoother) on level 19 -------------------------------<br>

> >>    KSP Object: (mg_levels_19_) 1920 MPI processes<br>

> >>      type: richardson<br>

> >>        damping factor=1.<br>

> >>      maximum iterations=1, nonzero initial guess<br>

> >>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>      left preconditioning<br>

> >>      using NONE norm type for convergence test<br>

> >>    PC Object: (mg_levels_19_) 1920 MPI processes<br>

> >>      type: asm<br>

> >>        total subdomain blocks = 1920, amount of overlap = 0<br>

> >>        restriction/interpolation type - RESTRICT<br>

> >>        Local solve is same for all blocks, in the following KSP and PC objects:<br>

> >>      KSP Object: (mg_levels_19_sub_) 1 MPI processes<br>

> >>        type: preonly<br>

> >>        maximum iterations=10000, initial guess is zero<br>

> >>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.<br>

> >>        left preconditioning<br>

> >>        using NONE norm type for convergence test<br>

> >>      PC Object: (mg_levels_19_sub_) 1 MPI processes<br>

> >>        type: ilu<br>

> >>          in-place factorization<br>

> >>          0 levels of fill<br>

> >>          tolerance for zero pivot 2.22045e-14<br>

> >>          matrix ordering: natural<br>

> >>          factor fill ratio given 0., needed 0.<br>

> >>            Factored matrix follows:<br>

> >>              Mat Object: 1 MPI processes<br>

> >>                type: seqaij<br>

> >>                rows=179050, cols=179050<br>

> >>                package used to perform factorization: petsc<br>

> >>                total: nonzeros=42562500, allocated nonzeros=42562500<br>

> >>                total number of mallocs used during MatSetValues calls =0<br>

> >>                  using I-node routines: found 35810 nodes, limit used is 5<br>

> >>        linear system matrix = precond matrix:<br>

> >>        Mat Object: 1 MPI processes<br>

> >>          type: seqaij<br>

> >>          rows=179050, cols=179050<br>

> >>          package used to perform factorization: petsc<br>

> >>          total: nonzeros=42562500, allocated nonzeros=42562500<br>

> >>          total number of mallocs used during MatSetValues calls =0<br>

> >>            using I-node routines: found 35810 nodes, limit used is 5<br>

> >>      linear system matrix followed by preconditioner matrix:<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mffd<br>

> >>        rows=347149550, cols=347149550<br>

> >>          Matrix-free approximation:<br>

> >>            err=1.49012e-08 (relative error in function evaluation)<br>

> >>            Using wp compute h routine<br>

> >>                Does not compute normU<br>

> >>      Mat Object: 1920 MPI processes<br>

> >>        type: mpiaij<br>

> >>        rows=347149550, cols=347149550, bs=5<br>

> >>        total: nonzeros=86758607500, allocated nonzeros=86758607500<br>

> >>        total number of mallocs used during MatSetValues calls =0<br>

> >>          using I-node (on process 0) routines: found 35810 nodes, limit used is 5<br>

> >>  Up solver (post-smoother) same as down solver (pre-smoother)<br>

> >>  linear system matrix followed by preconditioner matrix:<br>

> >>  Mat Object: 1920 MPI processes<br>

> >>    type: mffd<br>

> >>    rows=347149550, cols=347149550<br>

> >>      Matrix-free approximation:<br>

> >>        err=1.49012e-08 (relative error in function evaluation)<br>

> >>        Using wp compute h routine<br>

> >>            Does not compute normU<br>

> >>  Mat Object: 1920 MPI processes<br>

> >>    type: mpiaij<br>

> >>    rows=347149550, cols=347149550, bs=5<br>

> >>    total: nonzeros=86758607500, allocated nonzeros=86758607500<br>

> >>    total number of mallocs used during MatSetValues calls =0<br>

> >>      using I-node (on process 0) routines: found 35810 nodes, limit used is 5<br>

> >>        Line search: Using full step: fnorm 2.025875581923e+03 gnorm 2.801672254495e+00<br>

> >>    1 SNES Function norm 2.801672254495e+00<br>


</blockquote></div>

</blockquote></div>

</blockquote></div>

</blockquote></div>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>

</blockquote></div>

</blockquote></div></div>