<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thank you Jed and Barry,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
First, attached are the logs from the benchmark runs I did without (log_std.txt) and with the MF method (log_mf.txt). I had some trouble getting -log_view to work: I'm using push and pop for the options, which means that PETSc is initialized with no arguments, so the command-line argument was not taken into account. But I guess this is for a separate discussion.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
To answer the questions about the current preconditioners:</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<ul>
<li>I used the same preconditioner options as listed in my previous email when I added the -snes_mf option. At one point I did try removing all the PC-related options with the MF method but didn't see a change in runtime, so I put them back in.</li><li>This benchmark is for a 1D DMDA using 20 grid points. When running in 2D or 3D I switch the PC options to: -pc_type fieldsplit -fieldsplit_0_pc_type sor -fieldsplit_1_pc_type gamg -fieldsplit_1_ksp_type gmres -ksp_type fgmres -fieldsplit_1_pc_gamg_threshold -1</li></ul>
<div>I haven't tried a Jacobi PC instead of SOR yet. I will run a set of more realistic cases (1D and 2D) without MF but with Jacobi and report back next week. When you say "iterations", do you mean what is reported by -ksp_monitor?</div>
<div><br>
</div>
<div>Cheers,</div>
<div><br>
</div>
<div>Sophie<br>
</div>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Barry Smith <bsmith@petsc.dev><br>
<b>Sent:</b> Friday, August 28, 2020 12:12<br>
<b>To:</b> Blondel, Sophie <sblondel@utk.edu><br>
<b>Cc:</b> petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>; xolotl-psi-development@lists.sourceforge.net <xolotl-psi-development@lists.sourceforge.net><br>
<b>Subject:</b> Re: [petsc-users] Matrix Free Method questions</font>
<div> </div>
</div>
<div class="" style="word-wrap:break-word; line-break:after-white-space">
<div>
<div class=""><br class="">
</div>
Sophie,
<div class=""><br class="">
</div>
<div class=""> This is exactly what I would expect. If you run with -ksp_monitor you will see that the -snes_mf run takes many more iterations.</div>
<div class=""><br class="">
</div>
<div class=""> I am puzzled that the argument <span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">-pc_type fieldsplit did not stop the run since this is under normal circumstances not a viable preconditioner with -snes_mf.
Did you also remove the </span><span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">-pc_type fieldsplit</span><span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt"> argument?</span></div>
<div class=""><span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt"><br class="">
</span></div>
<div class=""><span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt"> In order to see how one can avoid forming the entire matrix and use matrix-free to do the matrix-vector products but still have an effective preconditioner, let's look at what the current preconditioner options do.</span></div>
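<div class="">As a rough illustration of what "matrix-free" means for the matrix-vector product, here is a minimal sketch that approximates J(u) v with a forward difference of the residual, so no Jacobian is ever stored. The toy residual, the function names and the step-size heuristic are all assumptions made for this example; they are not the application's residual and not necessarily the differencing that -snes_mf uses.</div>

```python
import math

def residual(u):
    """Hypothetical nonlinear residual F(u) for a 3-unknown toy problem."""
    return [
        u[0] ** 2 + u[1] - 1.0,
        u[0] * u[1] + u[2],
        math.sin(u[2]) + u[0],
    ]

def exact_jacobian_times(u, v):
    """Analytic J(u) v for the toy residual, only used to check the approximation."""
    return [
        2.0 * u[0] * v[0] + v[1],
        u[1] * v[0] + u[0] * v[1] + v[2],
        v[0] + math.cos(u[2]) * v[2],
    ]

def matrix_free_jacobian_times(F, u, v, eps=1.0e-7):
    """Approximate J(u) v by (F(u + h v) - F(u)) / h without forming J."""
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0 for _ in v]
    norm_u = math.sqrt(sum(x * x for x in u))
    # Simple step-size heuristic (an assumption for this sketch).
    h = eps * (1.0 + norm_u) / norm_v
    F0 = F(u)
    F1 = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(a - b) / h for a, b in zip(F1, F0)]

u = [0.5, -1.0, 2.0]
v = [1.0, 2.0, -1.0]
approx = matrix_free_jacobian_times(residual, u, v)
exact = exact_jacobian_times(u, v)
max_error = max(abs(a - e) for a, e in zip(approx, exact))
print(max_error)
```

<div class="">Each product costs one extra residual evaluation, which is why memory drops while the cost per Krylov iteration stays tied to the cost of the residual.</div>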
<div class=""><span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt"><br class="">
</span></div>
<div class="">
<blockquote type="cite" class="">
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
-pc_fieldsplit_detect_coupling </div>
</blockquote>
<div class=""><br class="">
</div>
creates two sub-preconditioners, the first for all the variables and the second for those that are coupled by the matrix to variables in neighboring cells. Since only the smallest cluster sizes have diffusion/advection, this second set contains only the cluster size one variables.</div>
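<div class="">The idea of picking out the variables that couple to neighboring cells can be sketched on a toy sparsity pattern. This is only an illustration of the description above, not PETSc's implementation; the interleaved numbering (global index = cell * dof_per_cell + field), the function name and the 4-cell, 3-field layout are assumptions.</div>

```python
def fields_coupled_to_neighbors(nonzeros, dof_per_cell):
    """Return the field indices whose rows have entries in a different cell."""
    coupled = set()
    for row, col in nonzeros:
        row_cell, row_field = divmod(row, dof_per_cell)
        col_cell = col // dof_per_cell
        if row_cell != col_cell:
            coupled.add(row_field)
    return sorted(coupled)

# Hypothetical 1D layout: 4 cells, 3 cluster sizes (fields) per cell.
n_cells, dof = 4, 3
nonzeros = []
for cell in range(n_cells):
    base = cell * dof
    # Reactions: every cluster size couples to every other one within the cell.
    for i in range(dof):
        for j in range(dof):
            nonzeros.append((base + i, base + j))
    # Diffusion: only field 0 (cluster size one) couples to adjacent cells.
    for neighbor in (cell - 1, cell + 1):
        if neighbor in range(n_cells):
            nonzeros.append((base, neighbor * dof))

coupled_fields = fields_coupled_to_neighbors(nonzeros, dof)
print(coupled_fields)
```

<div class="">In this toy pattern only field 0 is reported, which mirrors the statement that the second split contains only the cluster size one variables.</div>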
<div class=""><br class="">
</div>
<div class="">
<blockquote type="cite" class="">
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
-fieldsplit_0_pc_type sor </div>
</blockquote>
<br class="">
</div>
<div class="">Runs SOR on all the variables; you can think of this as running SOR on the reactions. It is a pretty good preconditioner for the reactions since the reactions are local, per cell.</div>
<div class=""><br class="">
</div>
<div class="">
<blockquote type="cite" class="">
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
-fieldsplit_1_pc_type redundant</div>
</blockquote>
</div>
<div class=""><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""> <br class="">
</font>
<div>This runs the default preconditioner (ILU) on just the variables that diffuse, i.e. the elliptic part. For smallish problems this is fine; for larger problems and in 2D and 3D presumably you also have -redundant_pc_type gamg to use algebraic multigrid for
the diffusion. This part of the matrix will always need to be formed and used in the preconditioner. It is very important since the diffusion is what brings most of the ill-conditioning into the linear system for larger problems. Note that it only needs
the matrix entries for the cluster size of 1, so it is very small compared to the entire sparse matrix.</div>
<div><br class="">
</div>
<div>----</div>
<div> The first preconditioner, SOR, requires ALL the matrix entries, almost all of which (except for the diffusion terms) are the coupling between different size clusters within a cell. In particular, each cell has its own sparse matrix whose size is the total number of
clusters; it is sparse but not super sparse.</div>
<div><br class="">
</div>
<div> So to significantly lower memory usage we need to remove the SOR and the storing of all the matrix entries but still have an efficient preconditioner for the "reaction" terms. </div>
<div><br class="">
</div>
<div> The simplest thing would be to use Jacobi instead of SOR for the first subpreconditioner since it only requires the diagonal entries in the matrix. But Jacobi is a worse preconditioner than SOR (since it totally ignores the matrix coupling) and sometimes
can be much worse.</div>
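<div>To make the Jacobi versus SOR trade-off concrete, here is a small sketch that counts iterations of a preconditioned Richardson iteration on a toy tridiagonal matrix, once with M = diag(A) (Jacobi) and once with M = lower triangle of A (SOR with omega = 1). The matrix, the tolerance and the use of plain Richardson instead of GMRES are assumptions chosen to keep the example short; real iteration counts for the application have to come from actual runs.</div>

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def apply_matrix(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def jacobi_solve(A, r):
    """Apply M^{-1} r with M = diag(A); only diagonal entries are needed."""
    return [r[i] / A[i][i] for i in range(len(r))]

def gauss_seidel_solve(A, r):
    """Apply M^{-1} r with M = lower triangle of A (SOR with omega = 1)."""
    z = [0.0] * len(r)
    for i in range(len(r)):
        s = sum(A[i][j] * z[j] for j in range(i))
        z[i] = (r[i] - s) / A[i][i]
    return z

def richardson_iterations(A, b, apply_pc, rtol=1.0e-8, max_it=10000):
    """Iterate x += M^{-1} (b - A x); return the iteration count at convergence."""
    x = [0.0] * len(b)
    r = list(b)
    norm0 = norm(r)
    for it in range(1, max_it + 1):
        z = apply_pc(A, r)
        x = [xi + zi for xi, zi in zip(x, z)]
        r = [bi - ai for bi, ai in zip(b, apply_matrix(A, x))]
        if rtol * norm0 >= norm(r):
            return it
    return max_it

# Hypothetical toy matrix: diagonal 3, off-diagonals -1, 20 unknowns.
n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 3.0
    if i > 0:
        A[i][i - 1] = -1.0
    if i + 1 != n:
        A[i][i + 1] = -1.0
b = [1.0] * n

jacobi_its = richardson_iterations(A, b, jacobi_solve)
sor_its = richardson_iterations(A, b, gauss_seidel_solve)
print("Jacobi iterations:", jacobi_its)
print("SOR (omega = 1) iterations:", sor_its)
```

<div>On this toy problem the SOR variant needs fewer iterations because it uses the off-diagonal coupling that Jacobi ignores; how large the gap is for the reaction terms is exactly what the requested runs would show.</div>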
<div><br class="">
</div>
<div> Before anyone writes additional code we need to know whether doing something along these lines ruins the convergence.</div>
<div><br class="">
</div>
<div> Have you used the same options as before but with <span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">-fieldsplit_0_pc_type jacobi ? (Not using any matrix free). We need to get an idea of how many more linear iterations
it requires (not time, comparing time won't be helpful for this exercise.) We also need this information for realistic size problems in 2 or 3 dimensions that you really want to run; for small problems this approach will work ok and give misleading information
about what happens for large problems.</span></div>
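<div>One way to compare linear iteration counts between runs is to post-process the -ksp_monitor output. The sketch below assumes the usual monitor line shape ("&nbsp;&nbsp;N KSP Residual norm ...") and treats each line with index 0 as the start of a new linear solve; the sample log is made up for illustration and is not output from the runs discussed here.</div>

```python
import re

# Matches lines such as "  3 KSP Residual norm 1.234000000000e-05".
KSP_LINE = re.compile(r"^\s*(\d+) KSP Residual norm")

def linear_iterations_per_solve(log_text):
    """Return a list with the final KSP iteration index of each linear solve."""
    counts = []
    for line in log_text.splitlines():
        match = KSP_LINE.match(line)
        if match is None:
            continue
        index = int(match.group(1))
        if index == 0:
            counts.append(0)      # a new linear solve starts at iteration 0
        elif counts:
            counts[-1] = index    # keep the latest iteration index of this solve
    return counts

# Hypothetical sample, not taken from log_std.txt or log_mf.txt.
sample_log = """\
  0 KSP Residual norm 1.000000000000e+00
  1 KSP Residual norm 2.500000000000e-02
  2 KSP Residual norm 3.100000000000e-05
  0 KSP Residual norm 4.200000000000e-03
  1 KSP Residual norm 1.700000000000e-06
"""
counts = linear_iterations_per_solve(sample_log)
total_iterations = sum(counts)
print(counts, total_iterations)
```

<div>Comparing the total (or the average per solve) between the SOR and Jacobi runs gives the iteration-count comparison asked for above, independent of wall-clock time.</div>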
<div><span class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt"><br class="">
</span></div>
<div><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""> I suspect the iteration counts will shoot up. Can you run some cases and see how the iteration counts change?</font></div>
<div><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""><br class="">
</font></div>
<div><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""> Based on that we can decide if we still retain "good convergence" by changing the SOR to Jacobi and then change the code to make this change efficient (basically by skipping the explicit
computation of the reaction Jacobian terms and using matrix-free on the outside of the PCFIELDSPLIT.)</font></div>
<div><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""><br class="">
</font></div>
<div><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""> Barry</font></div>
<div><font face="Calibri, Arial, Helvetica, sans-serif" size="3" class=""><br class="">
</font></div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Aug 28, 2020, at 9:49 AM, Blondel, Sophie via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" class="">petsc-users@mcs.anl.gov</a>> wrote:</div>
<br class="x_Apple-interchange-newline">
<div class="">
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
Hi everyone,</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
I have been using PETSc for a few years with a fully implicit TS ARKIMEX method and am now exploring the matrix free method option. Here is the list of PETSc options I typically use: -ts_dt 1.0e-12 -ts_adapt_time_step_increase_delay 5 -snes_force_iteration
-ts_max_time 1000.0 -ts_adapt_dt_max 2.0e-3 -ts_adapt_wnormtype INFINITY -ts_exact_final_time stepover -fieldsplit_0_pc_type sor -ts_max_snes_failures -1 -pc_fieldsplit_detect_coupling -ts_monitor -pc_type fieldsplit -fieldsplit_1_pc_type redundant -ts_max_steps
100</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
I started to compare the performance of the code without changing anything in the executable, simply adding "-snes_mf". I see a reduction of memory usage as expected, and a benchmark that would usually take ~5 min to run now takes ~50 min. Reading the documentation,
I saw that there are a few options to play with for the matrix-free method, like -snes_mf_err, -snes_mf_umin, or switching to -snes_mf_type wp. I used and modified the values of each of these options separately but never saw a sizable change in runtime. Is this expected?</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
And are there other ways to make the matrix-free method faster? I saw in the documentation that you can define your own preconditioner, for instance. Let me know if you need additional information about the PETSc setup in the application I use.<br class="">
</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
Best,</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-style:normal; font-variant-caps:normal; font-weight:normal; letter-spacing:normal; text-align:start; text-indent:0px; text-transform:none; white-space:normal; word-spacing:0px; text-decoration:none; font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
Sophie</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</div>
</body>
</html>