<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div>  I mean something very simple: track the number of iterations for each linear solve and see how much they jump when new matrix entries are introduced but the interpolation is kept the same. If they jump a lot, trigger the generation of a new interpolation for the next new matrix entries. Of course, the parameter "a lot" is tricky. But say one does ten solves with the same matrix and the iterations vary by a few percent; if one then introduces new nonzero entries and the iterations triple, that indicates a new interpolation is likely desirable. <div class=""><br class=""></div><div class="">  I agree the design space is large: a) when do you use the exact same preconditioner, b) when do you rebuild just the fine grid Chebyshev eigenvalue estimates, c) when do you rebuild the coarse matrices, d) when do you rebuild the interpolation, e) when do you rebuild the coarse aggregations? And it depends on the work needed for each of these things, but since the code is running it has this work information; it knows how long each of these actions takes.</div><div class=""><br class=""></div><div class=""> Expecting all users to have the knowledge you have of algebraic multigrid in order to tune it is not practical; having no decision making in the code about when to change these things leaves an enormous amount of performance on the table. A lot of the nonlinear PDE world does not "need" to use linear solvers; they will only use them when the expected benefits outweigh the increase in solution time, and if the linear solves take too damn long due to lack of good tuning they will do something else. 
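<div class=""><br class=""></div><div class="">  The iteration-tracking idea described at the top of this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PETSc code: the class name InterpolationReuseMonitor, the jump_factor threshold (standing in for "a lot"), and the choice of a median baseline are all hypothetical.</div>
<pre class="">
```python
from statistics import median


class InterpolationReuseMonitor:
    """Track iteration counts of solves that reuse an old interpolation."""

    def __init__(self, jump_factor=2.0):
        # jump_factor is a hypothetical tuning knob standing in for "a lot".
        self.jump_factor = jump_factor
        # Iteration counts recorded since the last suggested rebuild.
        self.baseline = []

    def record(self, iterations):
        """Record one solve; return True if a new interpolation looks worthwhile."""
        if self.baseline and iterations >= self.jump_factor * median(self.baseline):
            # The count jumped well past the typical value: suggest a rebuild
            # and start a fresh baseline for the solves that follow it.
            self.baseline = []
            return True
        self.baseline.append(iterations)
        return False
```
</pre>
<div class="">  Feeding it ten solves of about 20 iterations followed by one of 60 makes record() return True only for the last solve. A fuller version would also weigh the measured cost of each rebuild option against the iterations it saves.</div><div class=""><br class=""></div>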
In other words each user should not need an algebraic multigrid expert consulting with them on every step of their project.</div><div class=""><br class=""></div><div class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Sep 18, 2022, at 12:25 PM, Mark Adams <<a href="mailto:mfadams@lbl.gov" class="">mfadams@lbl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">You really can't monitor from the solver with just algebraic information. Problems often get harder, some much harder, as they evolve and the solver has no access to that information.</div><div class="">In fact the application does not even have access to that information in the sense of what is the effect of their evolution on even the condition number. And the condition number is not useful because AMG can deal with some things better than others.</div><div class="">For example, plasticity is hard on AMG but I don't think our (very crude) strength of connection methods can "see" plasticity.</div><div class="">(a stronger smoother would be more effective)</div><div class=""><br class=""></div><div class="">Now if you do want to monitor then you have to do ML because if re-coarsening can ever make sense then you would learn from your problem and, not that it would ever really be useful, it is at least a plausible use of ML.</div><div class="">People have worked on using ML for solver parameters (eg, Keyes had a PD doing this in 2004).</div><div class="">And, this would be a way to get someone rewarded/motivated as it is easy to publish anything with ML.</div><div class=""><br class=""></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Sep 18, 2022 at 11:12 AM Barry Smith <<a href="mailto:bsmith@petsc.dev" class="">bsmith@petsc.dev</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div 
style="overflow-wrap: break-word;" class=""><div class=""><br class=""></div>  I think it should default to true but it definitely should monitor itself, that is, measure the convergence for new solves with the old interpolation but new matrix values. It can report this number and (eventually) be self-adaptive.<div class=""><br class=""></div><div class="">   The problem for self-monitoring is that the PCGAMG is inside the KSP (and thus doesn't know about the behavior of the KSP that is calling it), so we need to introduce a new mechanism to allow a PC to have information about how it is affecting the KSP that calls it. This requires some thought to get a good API. Separation of concerns in software is generally a good thing, but this is an example where a clear separation of concerns between the PC and KSP makes doing a good thing more difficult, because in the original KSP/PC design I didn't think about this type of concern.</div><div class=""><br class=""></div><div class=""><br class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Sep 17, 2022, at 10:12 PM, Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank" class="">mfadams@lbl.gov</a>> wrote:</div><br class=""><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><br class=""></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Sep 17, 2022 at 5:52 PM Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank" class="">bsmith@petsc.dev</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class=""><div class=""><br class=""></div>  It is essentially doing AMR each time-step, and for the given application, I don't think that is pathological. It is an app built on Randy LeVeque's Clawpack stuff. 
The linear solver totally dominates the time which makes users of Clawpack very hesitant to consider methods with implicit steps, even when they need them. One linear solution takes far longer than the entire explicit step including its sub-grid cycling and the adaptive changes to the grid at each time-step.<div class=""><br class=""></div><div class="">  There is sub-grid cycling where similar solves are done on the same mesh (hence same nonzero pattern) two or three times, so thanks, we will try -pc_gamg_reuse_interpolation true and it could potentially help a good amount.</div><div class=""><br class=""></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class=""><div class=""></div><div class="">  There could be a mode where -pc_gamg_reuse_interpolation true is on by default, and the KSP/PC monitors the performance (convergence rate) of the solve following the reuse to decide if sticking with the old interpolation is ok or if a new interpolation should be done for the next change in the matrix values. 
This would not require user knowledge and tuning of this option, which most users who just want to get on with their work would not want to mess with.</div></div></blockquote><div class=""><br class=""></div><div class="">I don't want to get complicated, just look at the default.</div><div class=""><div class="">One could give the user finer-grained control over the scheduling, but I just think it would confuse users and would be hard to understand.</div><div class=""><div class="">I could see, in theory, making this a lag integer instead of a bool, but this has never come up (see below) so it is not worth the API churn now.<br class=""></div><div class=""><br class=""></div></div></div><div class="">My original thinking was that since AMG adapts to the matrix it makes sense mathematically to redo the space each time, but after thinking about it and living with it, the default should be TRUE.</div><div class="">I've never seen a problem that changes so much with time that the coarse grid space is improved dramatically from reconstructing the spaces/aggregates.</div><div class=""><div class="">If we get a user who says AMG convergence is slowing down during a run for no discernible reason, then tell them to use FALSE and see if convergence improves.</div></div><div class="">AMG coarsening is really not very precise anyway, based on heuristics that are easily defeated, so it is just overkill to redo it.<br class=""></div><div class="">If a user wants to add some heuristic then they can just delete/reset the solver. 
I don't think there is any significant overhead from doing that.</div><div class="">(i.e., we could just remove this parameter, always reuse, and let the user reset manually; maybe that is a good idea now that I think about it)</div><div class=""><br class=""></div><div class="">I use TRUE in all examples, in the hope that users will copy my input decks, but that did not happen here.</div><div class="">I have looked over my other recommended parameters and none of them should obviously have a different default.</div><div class=""><br class=""></div><div class="">I have added this to my current MR but I am now thinking that it might be better to remove this flag and just tell users to reset the KSP/PC if they want to redo it.</div><div class="">This has bugged me for a while, this user just reminded me, and I have a little cleanup MR going....</div><div class=""><br class=""></div><div class="">What do you think about just removing this and always reusing?</div><div class=""><br class=""></div><div class="">Mark</div><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class=""><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Sep 17, 2022, at 1:43 PM, Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank" class="">mfadams@lbl.gov</a>> wrote:</div><br class=""><div class=""><div dir="ltr" class="">I don't see a problem here other than the network looks bad relative to the problem size.<div class=""><br class=""></div><div class="">All the graph methods (PCGAMGCreateG and MIS) are 2x slower.</div><div class="">  - The symmetrization must be in PCGAMGCreateG.</div><div class="">  - MIS is pretty old code (the algorithm and original code are 25 years old)</div><div 
class="">RAPs are about the same.</div><div class="">KSPGMRESOrthog and MatMult are nowhere near perfect.<br class=""></div><div class=""><br class=""></div><div class="">The graph setup work gets amortized by (most) applications and benchmarkers that know how to benchmark, so it is not highly engineered like the RAP and MatMult.</div><div class="">Note that this application is redoing the graph work for every linear solve.</div><div class="">I am guessing they want '-pc_gamg_reuse_interpolation true' or are doing a single step/stage TS with a linear problem and AMR every time step, which would be pretty pathological.</div><div class="">I'm doing an MR right now; maybe I should change the default for -pc_gamg_reuse_interpolation?</div><div class=""><br class=""></div><div class="">Mark</div><div class=""><br class=""></div><div class=""><br class=""></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Sep 17, 2022 at 10:12 AM Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank" class="">bsmith@petsc.dev</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class=""><div class=""><br class=""></div>  Sure, but have you ever seen such a large jump in time in going from one to two MPI ranks, and are there any algorithms to do the aggregation that would not require this very expensive parallel symmetrization?<br class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Sep 17, 2022, at 9:07 AM, Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank" class="">mfadams@lbl.gov</a>> wrote:</div><br class=""><div class=""><div dir="ltr" class="">Symmetrize graph makes a transpose and then adds the two matrices.<div class="">I imagine adding two different matrices is expensive.</div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 16, 2022 at 8:30 PM Barry Smith <<a 
href="mailto:bsmith@petsc.dev" target="_blank" class="">bsmith@petsc.dev</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br class="">

Mark,<br class="">

<br class="">

   I have runs of GAMG on one and two ranks with -pc_gamg_symmetrize_graph because the matrix is far from symmetric, and some parts of GAMG take far more time with two ranks than with one (while other operations, such as VecNorm, show improvement with two ranks). I've attached the two files.<br class="">

<br class="">

  Have you seen this before? Is there anything that can be done about it? If going to two ranks almost doubles the GAMG setup time, that makes using parallelism not useful.<br class="">

<br class="">

  Barry<br class="">
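<br class="">
<div class="">  For reference, graph symmetrization is described elsewhere in this thread as forming a transpose and adding it to the original matrix. A minimal Python sketch of that operation follows, for a sparse matrix stored as a dictionary of (row, column) entries. The storage format and the function name symmetrize are assumptions for illustration only; this is not how PETSc implements -pc_gamg_symmetrize_graph.</div>
<pre class="">
```python
def symmetrize(entries):
    """Return the entries of A + A^T for a sparse matrix {(row, col): value}."""
    result = dict(entries)
    for (row, col), value in entries.items():
        # Add the transposed entry; this can create a nonzero at a position
        # where the original matrix had none.
        result[(col, row)] = result.get((col, row), 0.0) + value
    return result


# Small nonsymmetric example: entry (2, 0) has no counterpart at (0, 2).
A = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (2, 0): 3.0}
S = symmetrize(A)
```
</pre>
<div class="">  Each entry (i, j) contributes to position (j, i), which in a row-distributed matrix may belong to a different rank. That is one plausible source of the parallel cost, although the thread does not confirm it.</div>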

<br class="">

</blockquote></div>

</div></blockquote></div><br class=""></div></blockquote></div>

</div></blockquote></div><br class=""></div></div></blockquote></div></div>

</div></blockquote></div><br class=""></div></div></blockquote></div></div>

</div></blockquote></div><br class=""></div></body></html>