From mfadams at lbl.gov Wed Oct 1 06:25:14 2025 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 1 Oct 2025 07:25:14 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> Message-ID: Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. On Tue, Sep 30, 2025 at 9:27?AM Barry Smith wrote: > > Would you be able to share your code? I'm at a loss as to why we are > seeing this behavior and can much more quickly figure it out by running the > code in a debugger. > > Barry > > You can send the code petsc-maint at mcs.anl.gov if you don't want to share > the code with everyone, > > On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena < > Elena.Moral.Sanchez at ipp.mpg.de> wrote: > > This is what I get: > > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 2.249726733143e+00 > Residual norms for mg_levels_1_ solve. > 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm > 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP Residual norm 1.433120400946e+00 > 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm > 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 > 2 KSP Residual norm 1.169262560123e+00 > 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm > 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 > 3 KSP Residual norm 1.323528716607e+00 > 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm > 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 > 4 KSP Residual norm 5.006323254234e-01 > 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm > 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 > 5 KSP Residual norm 3.569836784785e-01 > 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm > 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 > 6 KSP Residual norm 2.493182937513e-01 > 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm > 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 > 7 KSP Residual norm 3.038202502298e-01 > 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm > 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 > 8 KSP Residual norm 2.780214194402e-01 > 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm > 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 > 9 KSP Residual norm 1.676826341491e-01 > 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm > 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 > 10 KSP Residual norm 1.209985378713e-01 > 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm > 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 > 11 KSP Residual norm 9.445076689969e-02 > 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm > 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 > 12 KSP Residual norm 8.308555284580e-02 > 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm > 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 > 13 KSP Residual norm 5.472865592585e-02 > 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm > 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 > 14 KSP Residual norm 4.357870564398e-02 > 14 KSP 
unpreconditioned resid norm 4.357870564398e-02 true resid norm > 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 > 15 KSP Residual norm 5.079681292439e-02 > 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm > 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 5.079681292439e-02 > Residual norms for mg_levels_1_ solve. > 0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm > 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 > 1 KSP Residual norm 2.934938644003e-02 > 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm > 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 > 2 KSP Residual norm 3.257065831294e-02 > 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm > 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 > 3 KSP Residual norm 4.143063876867e-02 > 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm > 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 > 4 KSP Residual norm 4.822471409489e-02 > 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm > 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 > 5 KSP Residual norm 3.197538246153e-02 > 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm > 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 > 6 KSP Residual norm 3.461217019835e-02 > 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm > 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 > 7 KSP Residual norm 3.410193775327e-02 > 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm > 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 > 8 KSP Residual norm 4.690424294464e-02 > 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm > 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 > 9 KSP Residual norm 3.366148892800e-02 > 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm > 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 > 10 KSP Residual norm 4.068015727689e-02 > 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm > 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 > 11 KSP Residual norm 2.658836123104e-02 > 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm > 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 > 12 KSP Residual norm 2.826244186003e-02 > 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm > 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 > 13 KSP Residual norm 2.981793619508e-02 > 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm > 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 > 14 KSP Residual norm 3.525455091450e-02 > 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm > 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 > 15 KSP Residual norm 2.331539121838e-02 > 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm > 2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 2.421498365806e-02 > Residual norms for mg_levels_1_ solve. 
> 0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm > 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP Residual norm 1.761072112362e-02 > 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm > 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 > 2 KSP Residual norm 1.400842489042e-02 > 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm > 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 > 3 KSP Residual norm 1.419665483348e-02 > 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm > 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 > 4 KSP Residual norm 1.617590701667e-02 > 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm > 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 > 5 KSP Residual norm 1.354824081005e-02 > 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm > 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 > 6 KSP Residual norm 1.387252917475e-02 > 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm > 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 > 7 KSP Residual norm 1.514043102087e-02 > 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm > 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 > 8 KSP Residual norm 1.275811124745e-02 > 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm > 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 > 9 KSP Residual norm 1.241039155981e-02 > 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm > 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 > 10 KSP Residual norm 9.585207801652e-03 > 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm > 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 > 11 KSP Residual norm 9.022641230732e-03 > 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm > 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 > 12 KSP Residual norm 1.187709152046e-02 > 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm > 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 > 13 KSP Residual norm 1.084880112494e-02 > 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm > 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 > 14 KSP Residual norm 8.194750346781e-03 > 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm > 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 > 15 KSP Residual norm 7.614246199165e-03 > 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm > 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 7.614246199165e-03 > Residual norms for mg_levels_1_ solve. 
> 0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm > 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 > 1 KSP Residual norm 5.620014684145e-03 > 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm > 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 > 2 KSP Residual norm 6.643368363907e-03 > 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm > 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 > 3 KSP Residual norm 8.708642393659e-03 > 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm > 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 > 4 KSP Residual norm 6.401852907459e-03 > 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm > 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 > 5 KSP Residual norm 7.230576215262e-03 > 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm > 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 > 6 KSP Residual norm 6.204081601285e-03 > 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm > 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 > 7 KSP Residual norm 7.038656665944e-03 > 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm > 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 > 8 KSP Residual norm 7.194079694050e-03 > 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm > 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 > 9 KSP Residual norm 6.353576889135e-03 > 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm > 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 > 10 KSP Residual norm 7.313589502731e-03 > 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm > 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 > 11 KSP Residual norm 6.643320423193e-03 > 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm > 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 > 12 KSP Residual norm 7.235443182108e-03 > 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm > 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 > 13 KSP Residual norm 4.971292307201e-03 > 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm > 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 > 14 KSP Residual norm 5.357933842147e-03 > 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm > 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 > 15 KSP Residual norm 5.841682994497e-03 > 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm > 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > > Cheers, > Elena > > ------------------------------ > *From:* Barry Smith > *Sent:* 29 September 2025 20:31:26 > *To:* Moral Sanchez, Elena > *Cc:* Mark Adams; petsc-users > *Subject:* Re: [petsc-users] setting correct tolerances for MG smoother > CG at the finest level > > > Thanks. I missed something earlier in the KSPView > > using UNPRECONDITIONED norm type for convergence test >> > > Please add the options > > -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual >> >> > It is using the unpreconditioned residual norms for convergence testing > but we are printing the preconditioned norms. > > Barry > > > On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena < > Elena.Moral.Sanchez at ipp.mpg.de> wrote: > > This is the output: > > Residual norms for mg_levels_1_ solve. 
> 0 KSP Residual norm 2.249726733143e+00 > 1 KSP Residual norm 1.433120400946e+00 > 2 KSP Residual norm 1.169262560123e+00 > 3 KSP Residual norm 1.323528716607e+00 > 4 KSP Residual norm 5.006323254234e-01 > 5 KSP Residual norm 3.569836784785e-01 > 6 KSP Residual norm 2.493182937513e-01 > 7 KSP Residual norm 3.038202502298e-01 > 8 KSP Residual norm 2.780214194402e-01 > 9 KSP Residual norm 1.676826341491e-01 > 10 KSP Residual norm 1.209985378713e-01 > 11 KSP Residual norm 9.445076689969e-02 > 12 KSP Residual norm 8.308555284580e-02 > 13 KSP Residual norm 5.472865592585e-02 > 14 KSP Residual norm 4.357870564398e-02 > 15 KSP Residual norm 5.079681292439e-02 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 5.079681292439e-02 > 1 KSP Residual norm 2.934938644003e-02 > 2 KSP Residual norm 3.257065831294e-02 > 3 KSP Residual norm 4.143063876867e-02 > 4 KSP Residual norm 4.822471409489e-02 > 5 KSP Residual norm 3.197538246153e-02 > 6 KSP Residual norm 3.461217019835e-02 > 7 KSP Residual norm 3.410193775327e-02 > 8 KSP Residual norm 4.690424294464e-02 > 9 KSP Residual norm 3.366148892800e-02 > 10 KSP Residual norm 4.068015727689e-02 > 11 KSP Residual norm 2.658836123104e-02 > 12 KSP Residual norm 2.826244186003e-02 > 13 KSP Residual norm 2.981793619508e-02 > 14 KSP Residual norm 3.525455091450e-02 > 15 KSP Residual norm 2.331539121838e-02 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 2.421498365806e-02 > 1 KSP Residual norm 1.761072112362e-02 > 2 KSP Residual norm 1.400842489042e-02 > 3 KSP Residual norm 1.419665483348e-02 > 4 KSP Residual norm 1.617590701667e-02 > 5 KSP Residual norm 1.354824081005e-02 > 6 KSP Residual norm 1.387252917475e-02 > 7 KSP Residual norm 1.514043102087e-02 > 8 KSP Residual norm 1.275811124745e-02 > 9 KSP Residual norm 1.241039155981e-02 > 10 KSP Residual norm 9.585207801652e-03 > 11 KSP Residual norm 9.022641230732e-03 > 12 KSP Residual norm 1.187709152046e-02 > 13 KSP Residual norm 1.084880112494e-02 > 14 KSP Residual norm 8.194750346781e-03 > 15 KSP Residual norm 7.614246199165e-03 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > Residual norms for mg_levels_1_ solve. > 0 KSP Residual norm 7.614246199165e-03 > 1 KSP Residual norm 5.620014684145e-03 > 2 KSP Residual norm 6.643368363907e-03 > 3 KSP Residual norm 8.708642393659e-03 > 4 KSP Residual norm 6.401852907459e-03 > 5 KSP Residual norm 7.230576215262e-03 > 6 KSP Residual norm 6.204081601285e-03 > 7 KSP Residual norm 7.038656665944e-03 > 8 KSP Residual norm 7.194079694050e-03 > 9 KSP Residual norm 6.353576889135e-03 > 10 KSP Residual norm 7.313589502731e-03 > 11 KSP Residual norm 6.643320423193e-03 > 12 KSP Residual norm 7.235443182108e-03 > 13 KSP Residual norm 4.971292307201e-03 > 14 KSP Residual norm 5.357933842147e-03 > 15 KSP Residual norm 5.841682994497e-03 > Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 > > > ------------------------------ > *From:* Barry Smith > *Sent:* 29 September 2025 15:56:33 > *To:* Moral Sanchez, Elena > *Cc:* Mark Adams; petsc-users > *Subject:* Re: [petsc-users] setting correct tolerances for MG smoother > CG at the finest level > > > I asked you to run with > > -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason >>> -mg_levels_ksp_converged_reason >> >> > you chose not to, delaying the process of understanding what is happening. 
> > Please run with those options and send the output. My guess is that you > are computing the "residual norms" in your own monitor code, and it is > doing so differently than what PETSc does, thus resulting in the appearance > of a sufficiently small residual norm, whereas PETSc may not have > calculated something that small. > > Barry > > > On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena < > Elena.Moral.Sanchez at ipp.mpg.de> wrote: > > Thanks for the hint. I agree that the coarse solve should be much more > "accurate". However, for the moment I am just trying to understand what the > MG is doing exactly. > > I am puzzled to see that the fine grid smoother ("lvl 0") does not stop > when the residual becomes less than 1e-1. It should converge due to the > atol. > > ------------------------------ > *From:* Mark Adams > *Sent:* 29 September 2025 14:20:56 > *To:* Moral Sanchez, Elena > *Cc:* Barry Smith; petsc-users > *Subject:* Re: [petsc-users] setting correct tolerances for MG smoother > CG at the finest level > > Oh I see the coarse grid solver in your full solver output now. > You still want an accurate coarse grid solve. Usually (the default in > GAMG) you use a direct solver on one process, and cousin until the coarse > grid is small enough to make that cheap. > > On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena < > Elena.Moral.Sanchez at ipp.mpg.de> wrote: > >> Hi, I doubled the system size and changed the tolerances just to show a >> better example of the problem. This is the output of the callbacks in the >> first iteration: >> CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s >> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s >> ConvergedReason MG lvl 0: 4 >> MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s >> ConvergedReason MG lvl -1: 3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 
10/15 | res = 4.07e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl 0: 4 >> CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s >> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.02e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s >> ConvergedReason MG lvl 0: 4 >> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s >> ConvergedReason MG lvl -1: 3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s >> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s >> ConvergedReason MG lvl 0: 4 >> CG ConvergedReason: -3 >> >> For completeness, I add here the -ksp_view of the whole solver: >> KSP Object: 1 MPI process >> type: cg >> variant HERMITIAN >> maximum iterations=1, nonzero initial guess >> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
>> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI process >> type: mg >> type is MULTIPLICATIVE, levels=2 cycles=v >> Cycles per PCApply=1 >> Not using Galerkin computed coarse grid matrices >> Coarse grid solver -- level 0 ------------------------------- >> KSP Object: (mg_coarse_) 1 MPI process >> type: cg >> variant HERMITIAN >> maximum iterations=15, nonzero initial guess >> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: (mg_coarse_) 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: python >> rows=524, cols=524 >> Python: Solver_petsc.LeastSquaresOperator >> Down solver (pre-smoother) on level 1 >> ------------------------------- >> KSP Object: (mg_levels_1_) 1 MPI process >> type: cg >> variant HERMITIAN >> maximum iterations=15, nonzero initial guess >> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >> left preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: (mg_levels_1_) 1 MPI process >> type: none >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: python >> rows=884, cols=884 >> Python: Solver_petsc.LeastSquaresOperator >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: 1 MPI process >> type: python >> rows=884, cols=884 >> Python: Solver_petsc.LeastSquaresOperator >> >> Regarding Mark's Email: What do you mean with "the whole solver doesn't >> have a coarse grid"? I am using my own Restriction and Interpolation >> operators. >> Thanks for the help, >> Elena >> >> ------------------------------ >> *From:* Mark Adams >> *Sent:* 28 September 2025 20:13:54 >> *To:* Barry Smith >> *Cc:* Moral Sanchez, Elena; petsc-users >> *Subject:* Re: [petsc-users] setting correct tolerances for MG smoother >> CG at the finest level >> >> Not sure why your "whole"solver does not have a coarse grid but this is >> wrong: >> >> KSP Object: (mg_coarse_) 1 MPI process >> type: cg >> variant HERMITIAN >> maximum iterations=100, initial guess is zero >> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >> >> The coarse grid has to be accurate. The defaults are a good place to >> start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) >> >> >> On Fri, Sep 26, 2025 at 3:21?PM Barry Smith wrote: >> >>> Looks reasonable. 
Send the output running with >>> >>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason >>> -mg_levels_ksp_converged_reason >>> >>> On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena < >>> Elena.Moral.Sanchez at ipp.mpg.de> wrote: >>> >>> Dear Barry, >>> >>> This is -ksp_view for the smoother at the finest level: >>> >>> KSP Object: (mg_levels_1_) 1 MPI process >>> type: cg >>> variant HERMITIAN >>> maximum iterations=10, nonzero initial guess >>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: (mg_levels_1_) 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: python >>> rows=524, cols=524 >>> Python: Solver_petsc.LeastSquaresOperator >>> >>> And at the coarsest level: >>> >>> KSP Object: (mg_coarse_) 1 MPI process >>> type: cg >>> variant HERMITIAN >>> maximum iterations=100, initial guess is zero >>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: (mg_coarse_) 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: python >>> rows=344, cols=344 >>> Python: Solver_petsc.LeastSquaresOperator >>> >>> And for the whole solver: >>> >>> KSP Object: 1 MPI process >>> type: cg >>> variant HERMITIAN >>> maximum iterations=100, nonzero initial guess >>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: mg >>> type is MULTIPLICATIVE, levels=2 cycles=v >>> Cycles per PCApply=1 >>> Not using Galerkin computed coarse grid matrices >>> Coarse grid solver -- level 0 ------------------------------- >>> KSP Object: (mg_coarse_) 1 MPI process >>> type: cg >>> variant HERMITIAN >>> maximum iterations=100, initial guess is zero >>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: (mg_coarse_) 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: python >>> rows=344, cols=344 >>> Python: Solver_petsc.LeastSquaresOperator >>> Down solver (pre-smoother) on level 1 ------------------------------- >>> KSP Object: (mg_levels_1_) 1 MPI process >>> type: cg >>> variant HERMITIAN >>> maximum iterations=10, nonzero initial guess >>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>> left preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: (mg_levels_1_) 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: python >>> rows=524, cols=524 >>> Python: Solver_petsc.LeastSquaresOperator >>> Up solver (post-smoother) same as down solver (pre-smoother) >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: python >>> rows=524, cols=524 >>> Python: Solver_petsc.LeastSquaresOperator >>> >>> Best, >>> Elena >>> >>> ------------------------------ >>> >>> *From:* Barry Smith >>> *Sent:* 26 September 2025 19:05:02 >>> *To:* Moral Sanchez, Elena >>> *Cc:* petsc-users at mcs.anl.gov >>> *Subject:* Re: [petsc-users] setting correct tolerances for MG smoother >>> CG at the finest level >>> >>> >>> Send the output using -ksp_view >>> >>> Normally one uses a fixed number of iterations 
of smoothing on level >>> with multigrid rather than a tolerance, but yes PETSc should respect such a >>> tolerance. >>> >>> Barry >>> >>> >>> On Sep 26, 2025, at 12:49?PM, Moral Sanchez, Elena < >>> Elena.Moral.Sanchez at ipp.mpg.de> wrote: >>> >>> Hi, >>> I am using multigrid (multiplicative) as a preconditioner with a V-cycle >>> of two levels. At each level, I am setting CG as the smoother with certain >>> tolerance. >>> >>> What I observe is that in the finest level the CG continues iterating >>> after the residual norm reaches the tolerance (atol) and it only stops when >>> reaching the maximum number of iterations at that level. At the coarsest >>> level this does not occur and the CG stops when the tolerance is reached. >>> >>> I double-checked that the smoother at the finest level has the right >>> tolerance. And I am using a Monitor function to track the residual. >>> >>> Do you know how to make the smoother at the finest level stop when >>> reaching the tolerance? >>> >>> Cheers, >>> Elena. >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Wed Oct 1 09:32:42 2025 From: lzou at anl.gov (Zou, Ling) Date: Wed, 1 Oct 2025 14:32:42 +0000 Subject: [petsc-users] Proper way for exception handling Message-ID: Hi, After updating to PETSc 3.23 (from a quite old version, ~3.8), I found that my old way of exception handling not working any more, on my Mac. I would like to learn the proper way to handle exceptions in PETSc code. Here is my pseudo code: ================================================================ PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); double my_function(); ================================================================ double my_function() { if (all_good) return 1; else { throw 199; return 0; } } PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) { double my_value = my_function(); // compute residuals } int main(int argc, char **argv) { Initialize_PETSc(); double dt = 1, dt_min = 0.001; while (dt > dt_min) { try { SNESSolve(AppCtx.snes, NULL, AppCtx.u); } catch (int err) { dt *= 0.5; } } PetscFinalize(); } ================================================================ This piece of logic used to work well, but now giving me the following error: Current time (the starting time of this time step) = 0.. NL step = 0, SNES Function norm = 8.60984E+03 libc++abi: terminating due to uncaught exception of type int Abort trap: 6 Q1: why this exception catch logic not working any more? Q2: is there any good example of PETSc exception handling I can follow? Best, -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Wed Oct 1 10:46:48 2025 From: lzou at anl.gov (Zou, Ling) Date: Wed, 1 Oct 2025 15:46:48 +0000 Subject: [petsc-users] Proper way for exception handling In-Reply-To: References: Message-ID: Although I haven?t tried yet. Does it make sense and should it work to change the code this way? 
================================================================ PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); double my_function(); ================================================================ double my_function() { if (all_good) return 1; else { throw 199; return 0; } } PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) { Try { my_value = my_function(); } Catch (int err) { return PETSC_ERR_NOT_CONVERGED; } // compute residuals return PETSC_SUCCESS; } int main(int argc, char **argv) { Initialize_PETSc(); double dt = 1, dt_min = 0.001; while (dt > dt_min) { SNESSolve(AppCtx.snes, NULL, AppCtx.u); SNESGetConvergedReason(AppCtx.snes, &(AppCtx.snes_converged_reason)); if (not_converged) dt *= 0.5; } } PetscFinalize(); } ================================================================ From: petsc-users on behalf of Zou, Ling via petsc-users Date: Wednesday, October 1, 2025 at 9:33?AM To: PETSc Subject: [petsc-users] Proper way for exception handling Hi, After updating to PETSc 3.23 (from a quite old version, ~3.8), I found that my old way of exception handling not working any more, on my Mac. I would like to learn the proper way to handle exceptions in PETSc code. Here is my pseudo code: ================================================================ PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); double my_function(); ================================================================ double my_function() { if (all_good) return 1; else { throw 199; return 0; } } PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) { double my_value = my_function(); // compute residuals } int main(int argc, char **argv) { Initialize_PETSc(); double dt = 1, dt_min = 0.001; while (dt > dt_min) { try { SNESSolve(AppCtx.snes, NULL, AppCtx.u); } catch (int err) { dt *= 0.5; } } PetscFinalize(); } ================================================================ This piece of logic used to work well, but now giving me the following error: Current time (the starting time of this time step) = 0.. NL step = 0, SNES Function norm = 8.60984E+03 libc++abi: terminating due to uncaught exception of type int Abort trap: 6 Q1: why this exception catch logic not working any more? Q2: is there any good example of PETSc exception handling I can follow? Best, -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Oct 1 11:34:53 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 1 Oct 2025 12:34:53 -0400 Subject: [petsc-users] Proper way for exception handling In-Reply-To: References: Message-ID: <8990AD06-7221-48C2-BCCB-45C7D736760B@petsc.dev> No. Your callback function should not be returning a PETSC_ERR_NOT_CONVERGED since your function doesn't know anything about the status of the nonlinear solver and is not in the business of deciding if SNES is converging or not. What are you trying to convey back to PETSc if your function is not "all good"? There are generally two possibilities, 1) just give up and stop the entire program immediately, then return PETSC_ERR_USER 2) indicate that SNES solve is asking your function to be evaluated at a point that is not in the domain of your function. For example, say your function is the square root and x is -1. In this situation, depending on the exact nonlinear solver being used SNES may be able to continue to try to solve the nonlinear system by changing its x value. 
To indicate this call SNESSetFunctionDomainError(snes); and then do a usual return PETSC_SUCCESS; If SNES can continue it will, if it cannot then IT will generate a negative SNESConvergedReason indicating that SNES did not converge. Barry > On Oct 1, 2025, at 11:46?AM, Zou, Ling via petsc-users wrote: > > Although I haven?t tried yet. Does it make sense and should it work to change the code this way? > > ================================================================ > PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); > double my_function(); > ================================================================ > > double my_function() > { > if (all_good) > return 1; > else > { > throw 199; > return 0; > } > } > > PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) > { > Try > { > my_value = my_function(); > } > Catch (int err) > { > return PETSC_ERR_NOT_CONVERGED; > } > > // compute residuals > > return PETSC_SUCCESS; > } > > int main(int argc, char **argv) > { > Initialize_PETSc(); > > double dt = 1, dt_min = 0.001; > while (dt > dt_min) > { > SNESSolve(AppCtx.snes, NULL, AppCtx.u); > > SNESGetConvergedReason(AppCtx.snes, &(AppCtx.snes_converged_reason)); > > if (not_converged) > dt *= 0.5; > } > } > > PetscFinalize(); > } > ================================================================ > > > > From: petsc-users > on behalf of Zou, Ling via petsc-users > > Date: Wednesday, October 1, 2025 at 9:33?AM > To: PETSc > > Subject: [petsc-users] Proper way for exception handling > > Hi, > > After updating to PETSc 3.23 (from a quite old version, ~3.8), I found that my old way of exception handling not working any more, on my Mac. > I would like to learn the proper way to handle exceptions in PETSc code. > > Here is my pseudo code: > > ================================================================ > PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); > double my_function(); > ================================================================ > > double my_function() > { > if (all_good) > return 1; > else > { > throw 199; > return 0; > } > } > > PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) > { > double my_value = my_function(); > > // compute residuals > } > > int main(int argc, char **argv) > { > Initialize_PETSc(); > > double dt = 1, dt_min = 0.001; > while (dt > dt_min) > { > try > { > SNESSolve(AppCtx.snes, NULL, AppCtx.u); > } > catch (int err) > { > dt *= 0.5; > } > } > > PetscFinalize(); > } > ================================================================ > > This piece of logic used to work well, but now giving me the following error: > > Current time (the starting time of this time step) = 0.. > NL step = 0, SNES Function norm = 8.60984E+03 > libc++abi: terminating due to uncaught exception of type int > Abort trap: 6 > > Q1: why this exception catch logic not working any more? > Q2: is there any good example of PETSc exception handling I can follow? > > Best, > -Ling -------------- next part -------------- An HTML attachment was scrubbed... 
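A minimal compilable sketch of the domain-error pattern Barry describes above, assuming a C++ residual callback with interleaved (p, T) unknowns; the eos_density helper, the placeholder residuals, and the 300 K reference value are illustrative assumptions, not code from this thread:
================================================================
#include <petscsnes.h>
#include <stdexcept>

/* Placeholder property call: throws when the state leaves the physical domain. */
static double eos_density(double p, double T)
{
  if (p <= 0.0 || T <= 0.0) throw std::domain_error("p or T out of physical range");
  return p / (287.0 * T); /* stand-in expression only */
}

static PetscErrorCode FormFunction(SNES snes, Vec u, Vec r, void *ctx)
{
  const PetscScalar *x;
  PetscScalar       *f;
  PetscInt           n;
  PetscBool          domain_error = PETSC_FALSE;

  PetscFunctionBeginUser;
  (void)ctx; /* unused in this sketch */
  PetscCall(VecGetLocalSize(u, &n));
  PetscCall(VecGetArrayRead(u, &x));
  PetscCall(VecGetArray(r, &f));
  try {
    /* assumes interleaved (p, T) unknowns, so n is even */
    for (PetscInt i = 0; i < n; i += 2) {
      double rho = eos_density(PetscRealPart(x[i]), PetscRealPart(x[i + 1]));
      f[i]     = rho - 1.0;       /* placeholder residuals */
      f[i + 1] = x[i + 1] - 300.0;
    }
  } catch (...) {
    domain_error = PETSC_TRUE; /* do NOT return an error code from the callback */
  }
  PetscCall(VecRestoreArrayRead(u, &x));
  PetscCall(VecRestoreArray(r, &f));
  if (domain_error) PetscCall(SNESSetFunctionDomainError(snes));
  PetscFunctionReturn(PETSC_SUCCESS);
}
================================================================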
URL: From bsmith at petsc.dev Wed Oct 1 11:48:37 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 1 Oct 2025 12:48:37 -0400 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> Message-ID: I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!YRnufR4azDdpfJngalubC7i6AYidmMy5jzcDkedwqFk6a7Mrf60Lo_nAXsczsIY6DHwZzmPwpii1ARUQLJiC6Q8$ Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. Barry > On Jul 31, 2025, at 6:46?AM, LEDAC Pierre wrote: > > Thanks Barry, I agree but didn't dare asking for that. > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?41 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > De : Barry Smith > > Envoy? : mercredi 30 juillet 2025 20:34:26 > ? : Junchao Zhang > Cc : LEDAC Pierre; petsc-users at mcs.anl.gov > Objet : Re: [petsc-users] [GPU] Jacobi preconditioner > > > We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. > > I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!YRnufR4azDdpfJngalubC7i6AYidmMy5jzcDkedwqFk6a7Mrf60Lo_nAXsczsIY6DHwZzmPwpii1ARUQ1vbEovg$ > > Barry > > > > >> On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: >> >> Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. >> If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. >> >> --Junchao Zhang >> >> >> On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: >>> Hello all, >>> >>> We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). >>> >>> The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. >>> >>> It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). >>> >>> Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? >>> NB: Gmres is running well on device. >>> >>> I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. >>> >>> Thanks, >>> >>> >>> >>> >>> Pierre LEDAC >>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>> Centre de SACLAY >>> DES/ISAS/DM2S/SGLS/LCAN >>> B?timent 451 ? 
point courrier n°41 >>> F-91191 Gif-sur-Yvette >>> +33 1 69 08 04 03 >>> +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Wed Oct 1 13:42:26 2025 From: lzou at anl.gov (Zou, Ling) Date: Wed, 1 Oct 2025 18:42:26 +0000 Subject: [petsc-users] Proper way for exception handling In-Reply-To: <8990AD06-7221-48C2-BCCB-45C7D736760B@petsc.dev> References: <8990AD06-7221-48C2-BCCB-45C7D736760B@petsc.dev> Message-ID: Thank you, Barry. I got a working code following your suggestion. See below. -Ling From: Barry Smith Date: Wednesday, October 1, 2025 at 11:36 AM To: Zou, Ling Cc: PETSc Subject: Re: [petsc-users] Proper way for exception handling No. Your callback function should not be returning a PETSC_ERR_NOT_CONVERGED since your function doesn't know anything about the status of the nonlinear solver and is not in the business of deciding if SNES is converging or not. <- You are right. I implemented the code as proposed in the email. It did not go the way I expected. What are you trying to convey back to PETSc if your function is not "all good"? There are generally two possibilities, 1) just give up and stop the entire program immediately, then return PETSC_ERR_USER <- Not what I am looking for. 2) indicate that SNES solve is asking your function to be evaluated at a point that is not in the domain of your function. For example, say your function is the square root and x is -1. In this situation, depending on the exact nonlinear solver being used, SNES may be able to continue to try to solve the nonlinear system by changing its x value. To indicate this, call SNESSetFunctionDomainError(snes); and then do a usual return PETSC_SUCCESS; If SNES can continue it will; if it cannot, then it will generate a negative SNESConvergedReason indicating that SNES did not converge. <- Yes. This is what I am looking for. To provide some context, it is a flow problem solver that solves for pressure (p) and temperature (T) nonlinear variables. Density and other properties are evaluated based on p and T. In some cases, p and T go out of the physical domain, e.g., take a negative value; the equation of state package then throws an exception instead of erroring out, because the code wants to try, for example, a smaller time step size. Following your suggestion, the following code works now, thank you!
================================================================ PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); double my_function(); ================================================================ double my_function() { if (all_good) // e.g., pressure > 0 return 1; else // e.g., pressure < 0 { throw 199; return 0; } } PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) { Try { my_value = my_function(); } Catch (int err) { SNESSetFunctionDomainError(snes); return PETSC_SUCCESS; } // compute residuals return PETSC_SUCCESS; } int main(int argc, char **argv) { Initialize_PETSc(); double dt = 1, dt_min = 0.001; while (dt > dt_min) { SNESSolve(AppCtx.snes, NULL, AppCtx.u); SNESGetConvergedReason(AppCtx.snes, &(AppCtx.snes_converged_reason)); if (not_converged) dt *= 0.5; } } PetscFinalize(); } ================================================================ Barry On Oct 1, 2025, at 11:46?AM, Zou, Ling via petsc-users wrote: Although I haven?t tried yet. Does it make sense and should it work to change the code this way? ================================================================ PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); double my_function(); ================================================================ double my_function() { if (all_good) return 1; else { throw 199; return 0; } } PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) { Try { my_value = my_function(); } Catch (int err) { return PETSC_ERR_NOT_CONVERGED; } // compute residuals return PETSC_SUCCESS; } int main(int argc, char **argv) { Initialize_PETSc(); double dt = 1, dt_min = 0.001; while (dt > dt_min) { SNESSolve(AppCtx.snes, NULL, AppCtx.u); SNESGetConvergedReason(AppCtx.snes, &(AppCtx.snes_converged_reason)); if (not_converged) dt *= 0.5; } } PetscFinalize(); } ================================================================ From: petsc-users > on behalf of Zou, Ling via petsc-users > Date: Wednesday, October 1, 2025 at 9:33?AM To: PETSc > Subject: [petsc-users] Proper way for exception handling Hi, After updating to PETSc 3.23 (from a quite old version, ~3.8), I found that my old way of exception handling not working any more, on my Mac. I would like to learn the proper way to handle exceptions in PETSc code. Here is my pseudo code: ================================================================ PetscErrorCode SNESFormFunction(SNES, Vec, Vec, void*); double my_function(); ================================================================ double my_function() { if (all_good) return 1; else { throw 199; return 0; } } PetscErrorCode SNESFormFunction(SNES snes, Vec u, Vec r, void* ctx) { double my_value = my_function(); // compute residuals } int main(int argc, char **argv) { Initialize_PETSc(); double dt = 1, dt_min = 0.001; while (dt > dt_min) { try { SNESSolve(AppCtx.snes, NULL, AppCtx.u); } catch (int err) { dt *= 0.5; } } PetscFinalize(); } ================================================================ This piece of logic used to work well, but now giving me the following error: Current time (the starting time of this time step) = 0.. NL step = 0, SNES Function norm = 8.60984E+03 libc++abi: terminating due to uncaught exception of type int Abort trap: 6 Q1: why this exception catch logic not working any more? Q2: is there any good example of PETSc exception handling I can follow? Best, -Ling -------------- next part -------------- An HTML attachment was scrubbed... 
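The "if (not_converged)" check in the code above can be made concrete as in the following sketch; it assumes the SNES, the solution Vec u, and the time-step variables already exist, and only illustrates cutting dt on a negative SNESConvergedReason (typically SNES_DIVERGED_FUNCTION_DOMAIN when the callback has called SNESSetFunctionDomainError):
================================================================
SNESConvergedReason reason;
double dt = 1.0, dt_min = 0.001;
while (dt > dt_min) {
  PetscCall(SNESSolve(snes, NULL, u));
  PetscCall(SNESGetConvergedReason(snes, &reason));
  if (reason < 0) {
    /* diverged, e.g. SNES_DIVERGED_FUNCTION_DOMAIN from the domain-error path:
       cut the time step and try the step again (resetting the initial guess as needed) */
    dt *= 0.5;
  } else {
    break; /* converged: accept the step and move on */
  }
}
================================================================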
URL: From Pierre.LEDAC at cea.fr Wed Oct 1 14:46:00 2025 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Wed, 1 Oct 2025 19:46:00 +0000 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr>, Message-ID: <79361faf1a834649a802772418106a78@cea.fr> Hi all, Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith Envoy? : mercredi 1 octobre 2025 18:48:37 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!ZTyB5izDEpePiYqGPLDkZ7olxZMfhLSZgmtW6Z5jYG98PpLglVDuUL4OH8YigoaUPs0y3gSSuvLiguVkFOreDMJKuleN$ Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. Barry On Jul 31, 2025, at 6:46?AM, LEDAC Pierre wrote: Thanks Barry, I agree but didn't dare asking for that. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 30 juillet 2025 20:34:26 ? : Junchao Zhang Cc : LEDAC Pierre; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!ZTyB5izDEpePiYqGPLDkZ7olxZMfhLSZgmtW6Z5jYG98PpLglVDuUL4OH8YigoaUPs0y3gSSuvLiguVkFOreDN9FB-qQ$ Barry On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. --Junchao Zhang On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: Hello all, We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). 
The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? NB: Gmres is running well on device. I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Pierre.LEDAC at cea.fr Wed Oct 1 14:47:34 2025 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Wed, 1 Oct 2025 19:47:34 +0000 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: <79361faf1a834649a802772418106a78@cea.fr> References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr>, , <79361faf1a834649a802772418106a78@cea.fr> Message-ID: <48fcd36fec154100b888af547764ef20@cea.fr> Sorry the correct error is: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : LEDAC Pierre Envoy? : mercredi 1 octobre 2025 21:46:00 ? : Barry Smith Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : RE: [petsc-users] [GPU] Jacobi preconditioner Hi all, Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith Envoy? : mercredi 1 octobre 2025 18:48:37 ? 
: LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!cxXLKRMlq3jTKtYbL1bBEwdTciFj7dVMRMOMgKNUaazN0ROkbDhYnon9tkYk5I9H_UG-wNbimZps5qhKfHovE1oEaF3z$ Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. Barry On Jul 31, 2025, at 6:46?AM, LEDAC Pierre wrote: Thanks Barry, I agree but didn't dare asking for that. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 30 juillet 2025 20:34:26 ? : Junchao Zhang Cc : LEDAC Pierre; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!cxXLKRMlq3jTKtYbL1bBEwdTciFj7dVMRMOMgKNUaazN0ROkbDhYnon9tkYk5I9H_UG-wNbimZps5qhKfHovE6QghXLS$ Barry On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. --Junchao Zhang On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: Hello all, We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? NB: Gmres is running well on device. I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Oct 1 19:16:40 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 1 Oct 2025 20:16:40 -0400 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: <48fcd36fec154100b888af547764ef20@cea.fr> References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr> Message-ID: <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. Barry > On Oct 1, 2025, at 3:47?PM, LEDAC Pierre wrote: > > Sorry the correct error is: > > /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" > GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); > > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?41 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > De : LEDAC Pierre > Envoy? : mercredi 1 octobre 2025 21:46:00 > ? : Barry Smith > Cc : Junchao Zhang; petsc-users at mcs.anl.gov > Objet : RE: [petsc-users] [GPU] Jacobi preconditioner > > Hi all, > > Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: > > /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" > GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); > > Thanks, > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?41 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > De : Barry Smith > > Envoy? : mercredi 1 octobre 2025 18:48:37 > ? : LEDAC Pierre > Cc : Junchao Zhang; petsc-users at mcs.anl.gov > Objet : Re: [petsc-users] [GPU] Jacobi preconditioner > > > I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!dzuxOoOUatxToc7TIXOe_lHWwFJf4p7OkCsU3kBG-UwhwYKWsW7nmw9g3sZmJq9UcS1KgnPvlIFK27vJGYCgWDc$ > > Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. > > Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. > > Barry > >> On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: >> >> Thanks Barry, I agree but didn't dare asking for that. >> >> Pierre LEDAC >> Commissariat ? l??nergie atomique et aux ?nergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> B?timent 451 ? 
point courrier n?41 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> >> De : Barry Smith > >> Envoy? : mercredi 30 juillet 2025 20:34:26 >> ? : Junchao Zhang >> Cc : LEDAC Pierre; petsc-users at mcs.anl.gov >> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >> >> >> We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. >> >> I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!dzuxOoOUatxToc7TIXOe_lHWwFJf4p7OkCsU3kBG-UwhwYKWsW7nmw9g3sZmJq9UcS1KgnPvlIFK27vJar78Rs8$ >> >> Barry >> >> >> >> >>> On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: >>> >>> Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. >>> If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. >>> >>> --Junchao Zhang >>> >>> >>> On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: >>>> Hello all, >>>> >>>> We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). >>>> >>>> The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. >>>> >>>> It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). >>>> >>>> Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? >>>> NB: Gmres is running well on device. >>>> >>>> I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. >>>> >>>> Thanks, >>>> >>>> >>>> >>>> >>>> Pierre LEDAC >>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>> Centre de SACLAY >>>> DES/ISAS/DM2S/SGLS/LCAN >>>> B?timent 451 ? point courrier n?41 >>>> F-91191 Gif-sur-Yvette >>>> +33 1 69 08 04 03 >>>> +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Pierre.LEDAC at cea.fr Thu Oct 2 00:16:32 2025 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Thu, 2 Oct 2025 05:16:32 +0000 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr>, <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> Message-ID: Yes, probably the reason I saw also a crash in my test case after a quick fix of the integer conversion. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith Envoy? : jeudi 2 octobre 2025 02:16:40 ? 
: LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. Barry On Oct 1, 2025, at 3:47?PM, LEDAC Pierre wrote: Sorry the correct error is: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : LEDAC Pierre Envoy? : mercredi 1 octobre 2025 21:46:00 ? : Barry Smith Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : RE: [petsc-users] [GPU] Jacobi preconditioner Hi all, Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 1 octobre 2025 18:48:37 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!fS_NaATd0yWvGRWmTdx4ZaKnDWoTB7UYN9nBRUu1UJ-BSQCojQOwpfljbSbTYpVhB_OyoTabvdWrs8rHUf4vlfDW6FlX$ Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. Barry On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: Thanks Barry, I agree but didn't dare asking for that. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 30 juillet 2025 20:34:26 ? : Junchao Zhang Cc : LEDAC Pierre; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. 
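As an illustration only (this is not the code from the MR; the kernel name, headers and launch configuration are made up), a diagonal-extraction kernel for a CSR matrix can be as simple as the following. Declaring the index arguments as const PetscInt * rather than int * is what avoids the type mismatch reported above when PETSc is configured --with-64-bit-indices:

  static __global__ void GetDiagonalCSR_Sketch(const PetscInt *rowoffsets, const PetscInt *cols, const PetscScalar *vals, PetscInt n, PetscScalar *diag)
  {
    PetscInt row = blockIdx.x * (PetscInt)blockDim.x + threadIdx.x;
    if (row < n) {
      diag[row] = 0.0;                                        /* rows with no stored diagonal entry get zero */
      for (PetscInt j = rowoffsets[row]; j < rowoffsets[row + 1]; j++) {
        if (cols[j] == row) { diag[row] = vals[j]; break; }
      }
    }
  }

  /* launched over the n local rows, e.g. */
  GetDiagonalCSR_Sketch<<<(int)((n + 255) / 256), 256>>>(rowoffsets, cols, vals, n, diag);

Such a kernel only touches the row offsets, column indices and values that the cuSPARSE matrix already keeps in device memory, so building the Jacobi preconditioner needs no host copy of the matrix.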
I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!fS_NaATd0yWvGRWmTdx4ZaKnDWoTB7UYN9nBRUu1UJ-BSQCojQOwpfljbSbTYpVhB_OyoTabvdWrs8rHUf4vlfrA6pwj$ Barry On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. --Junchao Zhang On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: Hello all, We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? NB: Gmres is running well on device. I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Oct 3 20:53:18 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 3 Oct 2025 21:53:18 -0400 Subject: [petsc-users] PETSc 3.24 release is out Message-ID: <2B773416-C27A-4478-B705-6959F8F415C6@petsc.dev> We are pleased to announce the release of PETSc version 3.24.0 at https://urldefense.us/v3/__https://petsc.org/release/download/__;!!G_uCfscf7eWS!et_Z7k9VCtNV7Z0dgSBR_u96qZ2otELZG48Epnaybl-q7DjANzso__JV20TF-Ujqsk1AS1-CjEH6Wz4HHkJysOI$ This release introduces `PetscRegressor`, https://urldefense.us/v3/__https://petsc.org/release/manual/regressor/__;!!G_uCfscf7eWS!et_Z7k9VCtNV7Z0dgSBR_u96qZ2otELZG48Epnaybl-q7DjANzso__JV20TF-Ujqsk1AS1-CjEH6Wz4HazgEayc$ which provides some basic infrastructure and a general API for supervised machine learning tasks at a higher level of abstraction than a purely algebraic ?solvers? view. A list of the major changes and updates can be found at https://urldefense.us/v3/__https://petsc.org/release/changes/324/__;!!G_uCfscf7eWS!et_Z7k9VCtNV7Z0dgSBR_u96qZ2otELZG48Epnaybl-q7DjANzso__JV20TF-Ujqsk1AS1-CjEH6Wz4HtZ34Lq4$ The final update to petsc-3.23 i.e petsc-3.23.7 is also available We recommend upgrading to PETSc 3.24.0 soon. 
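As a rough illustration of the kind of interface the new component exposes (the function names and argument lists below are assumptions based on the Fit/Predict style described in the manual chapter linked above; please check them against that chapter before use. X, Xnew, y and yhat are a samples-by-features Mat, a Mat of new samples, and two Vecs assumed to exist already):

  PetscRegressor regressor;

  PetscRegressorCreate(PETSC_COMM_WORLD, &regressor);
  PetscRegressorSetType(regressor, PETSCREGRESSORLINEAR);   /* assumed type name for linear regression */
  PetscRegressorSetFromOptions(regressor);
  PetscRegressorFit(regressor, X, y);                       /* train on the observed data   */
  PetscRegressorPredict(regressor, Xnew, yhat);             /* evaluate on unseen samples   */
  PetscRegressorDestroy(&regressor);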
As always, please report problems to petsc-maint at mcs.anl.gov and ask questions at petsc-users at mcs.anl.gov This release includes contributions from Albert Cowie Alex Lindsay aszaboa Barry Smith Brad Aagaard Connor Ward Daniel Otto de Mentock danofinn Dave Martin David Salac David Wells Davi Yan Hansol Suh James Wright Jared Frazier Jasper Hatton Jed Brown Jeremy L Thompson Jonas Heinzmann josephpu Jose Roman Junchao Zhang Koki Sagiyama Lisandro Dalcin Mark Adams Martin Diehl Matthew Knepley Min RK Mr. Hong Zhang Nuno Nobre Pablo Brubeck Patrick Farrell Paul T. K?hner Pierre Jolivet Pierre Marchand Raphael Zanella Rezgar Shakeri Richard Tran Mills Rylanor Salzman Alexis Satish Balay Stefano Zampini Tapashree Pradhan Toby Isaac Victor A. P. Magri Wisang Sugiarta Zach Atkins and bug reports/proposed improvements received from Aldo Bonfiglioli Alexis Salzman Ali Ahmad Christiaan Klaij Eric Chamberland F?lix Kwok Hammond, Glenn E Jose Roman Juan Franco Lawrence Mitchell Lichtner, Peter Lisandro Dalcin PIGNET Nicolas Sebastien Gilles Steven Dargaville Victor Eijkhout Volker Jacht Zongze Yang As always, thanks for your support, Barry -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Mon Oct 6 12:11:19 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 6 Oct 2025 19:11:19 +0200 Subject: [petsc-users] Advice on creating vectors defined on lower dimensional manifolds of a DMPlex Message-ID: Dear all, what is the best approach for defining vectors that "sit" on the (vertices and/or faces) of a given stratum of the "Face Sets" of a DMPlex? > DM Object: 3D plex 1 MPI process > ?type: plex > 3D plex in 3 dimensions: > ?Number of 0-cells per rank: 9261 > ?Number of 1-cells per rank: 59660 > ?Number of 2-cells per rank: 98400 > ?Number of 3-cells per rank: 48000 > Labels: > ?marker: 1 strata with value/size (1 (14402)) > ?celltype: 4 strata with value/size (0 (9261), 1 (59660), 3 (98400), 6 > (48000)) > ?depth: 4 strata with value/size (0 (9261), 1 (59660), 2 (98400), 3 > (48000)) > ?Face Sets: 6 strata with value/size (1 (800), 2 (800), 3 (800), 4 > (800), 5 (800), 6 (800)) > These vectors are going to be used (for example) to store stresses and heat flux on solid surfaces. To be more specific: suppose stratum 3 of the "Face Sets" is a solid wall. I want to create a vector that that stores quantities computed on the (800) faces of that wall OR the vertices of that wall. Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!dx5g28NqJW34oxLLKP1Fjtp65c0KkvUjelPzjza0lBJtf6uu5ROFqpa2GTX5Cle8L7S_YjHssSDqe6szXd2PEYvYVHHq5mtW8EU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 6 13:10:00 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 6 Oct 2025 14:10:00 -0400 Subject: [petsc-users] Advice on creating vectors defined on lower dimensional manifolds of a DMPlex In-Reply-To: References: Message-ID: On Mon, Oct 6, 2025 at 1:11?PM Aldo Bonfiglioli wrote: > Dear all, > > what is the best approach for defining vectors that "sit" on the (vertices > and/or faces) of a given stratum of the "Face Sets" of a DMPlex? 
> > DM Object: 3D plex 1 MPI process > type: plex > 3D plex in 3 dimensions: > Number of 0-cells per rank: 9261 > Number of 1-cells per rank: 59660 > Number of 2-cells per rank: 98400 > Number of 3-cells per rank: 48000 > Labels: > marker: 1 strata with value/size (1 (14402)) > celltype: 4 strata with value/size (0 (9261), 1 (59660), 3 (98400), 6 > (48000)) > depth: 4 strata with value/size (0 (9261), 1 (59660), 2 (98400), 3 > (48000)) > Face Sets: 6 strata with value/size (1 (800), 2 (800), 3 (800), 4 (800), > 5 (800), 6 (800)) > > These vectors are going to be used (for example) to store stresses and > heat flux on solid surfaces. > > To be more specific: suppose stratum 3 of the "Face Sets" is a solid wall. > > I want to create a vector that that stores quantities computed on the > (800) faces of that wall OR the vertices of that wall. > It should be simple to just create such vectors. You request a submesh using that label DM subdm; DMLabel label, sublabel; DMGetLabel(dm, "Face Sets", &label); DMLabelDuplicate(label, &sublabel); DMPlexLabelComplete(dm, sublabel); DMPlexCreateSubmesh(dm, sublabel, 3, PETSC_TRUE, &subdm) DMLabelDestroy(&sublabel); Now you can define a PetscFE over this submesh in the same way as any other mesh. Moreover, the subdm contains a mapping back to the original DM, from which you can create a mapping of dofs, so that you can inject the subvector into a larger field if you wish. If you want to use fields on submeshes inside a PetscDS, so that the Plex manages the solve, the procedure is slightly different, but I can detail it if you want. Thanks, Matt > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!agxuKTesyLfH3uplL5jGjn6KTo6lQTqlAtD7aN9x4Nao7Q1B1yi933LeiaBDdRvOl-Eusg-tJFIjzfmXLiuH$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!agxuKTesyLfH3uplL5jGjn6KTo6lQTqlAtD7aN9x4Nao7Q1B1yi933LeiaBDdRvOl-Eusg-tJFIjzVxLgYfM$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From chonglin.zhang at und.edu Mon Oct 6 14:39:22 2025 From: chonglin.zhang at und.edu (Zhang, Chonglin) Date: Mon, 6 Oct 2025 19:39:22 +0000 Subject: [petsc-users] Help with DMPlexCreateFromCellListParallelPetsc Message-ID: Dear PETSc developers, I have some questions on the proper use of DMPlexCreateFromCellListParallelPetsc function (https://urldefense.us/v3/__https://petsc.org/release/manualpages/DMPlex/DMPlexCreateFromCellListParallelPetsc/__;!!G_uCfscf7eWS!YJgdfNvYGp76m3zAsH07tAaJCaPnEyuyhZdGILGu2oyhW0RSJZeX4J8-IUCVNBupO7efr7Ju5zA_5ayjZ_YkwEeCuXjIJQ2K$ ). I am upgrading my code?s PETSc dependency from v3.16.6 to v3.24.0 (and v3.23.3, v3.23.6). I encountered crash with DMPlexCreateFromCellListParallelPetsc function: * To show the crash, I modified the following test: src/dm/impls/plex/tests/ex18.c * Is there anything I am doing wrong when creating DMPlex using the below mesh with this test? * What is the order of vertex index (of own element) going into DMPlexCreateFromCellListParallelPetsc function? 
* What is the order of own vertex coordinates going into DMPlexCreateFromCellListParallelPetsc function? Here is a detailed descriptions of what I did with ex18.c to show my issue: src/dm/impls/plex/tests/ex18.c * A simple 2D square mesh with 5 vertices and 4 triangles, shared by 2 MPI ranks (see the below image for the mesh, also attached file). * Each MPI rank owns 2 elements. * MPI rank 0 owns vertex indexed as: 0, 1, 3, 4; rank 1 owns vertex indexed as: 2. * Modified code (with updated mesh information) is attached. * The DM view output is also attached. * Note: the original test with 2 triangles and 2 elements was running fine. [cid:image001.jpg at 01DC36CC.D62D6600] Using this new mesh, the test crashed with the following error message: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Mesh cell 1 of type triangle is inverted, |J| = 0. [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!YJgdfNvYGp76m3zAsH07tAaJCaPnEyuyhZdGILGu2oyhW0RSJZeX4J8-IUCVNBupO7efr7Ju5zA_5ayjZ_YkwEeCuaawmeuG$ for trouble shooting. [0]PETSC ERROR: PETSc Release Version 3.24.0, unknown [0]PETSC ERROR: ./ex18 with 2 MPI process(es) and PETSC_ARCH arch-centos_kokkos on boltzmann2 by zhangc20 Mon Oct 6 14:00:48 2025 [0]PETSC ERROR: Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-shared-libraries=1 --with-debugging=no --COPTFLAGS="-g -O2 -fPIC" --CXXOPTFLAGS="-g -O2 -fPIC" --FOPTFLAGS="-g -O2 -fPIC" --with-cuda=1 --with-cuda-arch=86 --with-cudac=nvcc --with-kokkos=1 --with-kokkos-dir=./../install/kokkos/install/ --with-kokkos-kernels=1 --with-kokkos-kernels-dir=./../install/kokkos-kernels/install/ --download-metis --download-parmetis --download-fblaslapack=1 --download-triangle --with-make-np=8 PETSC_ARCH=arch-centos_kokkos [0]PETSC ERROR: #1 DMPlexCheckGeometry() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9630 [0]PETSC ERROR: #2 DMPlexCheck() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9807 [0]PETSC ERROR: #3 DMSetFromOptions_NonRefinement_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5121 [0]PETSC ERROR: #4 DMSetFromOptions_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5601 [0]PETSC ERROR: #5 DMSetFromOptions() at /hdd1/dsmc/comet/comet/petsc/src/dm/interface/dm.c:907 [0]PETSC ERROR: #6 CreateMesh() at ex18.c:811 [0]PETSC ERROR: #7 main() at ex18.c:1527 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -dm_plex_check_all 1 (source: command line) [0]PETSC ERROR: -dm_view ascii:dm_mesh.txt:ascii_info_detail (source: command line) [0]PETSC ERROR: -interpolate create (source: command line) [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 0 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Mesh cell 0 of type triangle is inverted, |J| = -0.25 [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!YJgdfNvYGp76m3zAsH07tAaJCaPnEyuyhZdGILGu2oyhW0RSJZeX4J8-IUCVNBupO7efr7Ju5zA_5ayjZ_YkwEeCuaawmeuG$ for trouble shooting. 
[1]PETSC ERROR: PETSc Release Version 3.24.0, unknown [1]PETSC ERROR: ./ex18 with 2 MPI process(es) and PETSC_ARCH arch-centos_kokkos on boltzmann2 by zhangc20 Mon Oct 6 14:00:48 2025 [1]PETSC ERROR: Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-shared-libraries=1 --with-debugging=no --COPTFLAGS="-g -O2 -fPIC" --CXXOPTFLAGS="-g -O2 -fPIC" --FOPTFLAGS="-g -O2 -fPIC" --with-cuda=1 --with-cuda-arch=86 --with-cudac=nvcc --with-kokkos=1 --with-kokkos-dir=./../install/kokkos/install/ --with-kokkos-kernels=1 --with-kokkos-kernels-dir=./../install/kokkos-kernels/install/ --download-metis --download-parmetis --download-fblaslapack=1 --download-triangle --with-make-np=8 PETSC_ARCH=arch-centos_kokkos [1]PETSC ERROR: #1 DMPlexCheckGeometry() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9630 [1]PETSC ERROR: #2 DMPlexCheck() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9807 [1]PETSC ERROR: #3 DMSetFromOptions_NonRefinement_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5121 [1]PETSC ERROR: #4 DMSetFromOptions_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5601 [1]PETSC ERROR: #5 DMSetFromOptions() at /hdd1/dsmc/comet/comet/petsc/src/dm/interface/dm.c:907 [1]PETSC ERROR: #6 CreateMesh() at ex18.c:811 [1]PETSC ERROR: #7 main() at ex18.c:1527 [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -dm_plex_check_all 1 (source: command line) [1]PETSC ERROR: -dm_view ascii:dm_mesh.txt:ascii_info_detail (source: command line) [1]PETSC ERROR: -interpolate create (source: command line) [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 1 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 Run script used: mpirun -np 2 ./ex18 -dm_plex_check_all 1 -dm_view ascii:dm_mesh.txt:ascii_info_detail -interpolate create Thanks, Chonglin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 454644 bytes Desc: image001.jpg URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ex18.c URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: dm_mesh_wrong.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PETSc_DM.jpg Type: image/jpeg Size: 454644 bytes Desc: PETSc_DM.jpg URL: From mfadams at lbl.gov Mon Oct 6 17:26:52 2025 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 6 Oct 2025 18:26:52 -0400 Subject: [petsc-users] Help with DMPlexCreateFromCellListParallelPetsc In-Reply-To: References: Message-ID: I think you need to reorder your points according to each processor. If process 0 has 4 vertices, they will be numbered (0, 1, 2, 3), and if proc 1 has one vertex it will be 4. So, to fix this, switch vertex numbering for 2 and 4. Mark On Mon, Oct 6, 2025 at 3:40?PM Zhang, Chonglin wrote: > Dear PETSc developers, > > > > I have some questions on the proper use of > DMPlexCreateFromCellListParallelPetsc function ( > https://urldefense.us/v3/__https://petsc.org/release/manualpages/DMPlex/DMPlexCreateFromCellListParallelPetsc/__;!!G_uCfscf7eWS!Z4nwRjrsrMEicB_kYCgaYAZ-kDyLwUVzbEsc7URFCva6KyGkrF5e8yjb_GiisPeXrchhomiHO0uaDLPIzn_nbBU$ > ). 
> I am upgrading my code?s PETSc dependency from v3.16.6 to v3.24.0 (and > v3.23.3, v3.23.6). I encountered crash with > DMPlexCreateFromCellListParallelPetsc function: > > - To show the crash, I modified the following test: > src/dm/impls/plex/tests/ex18.c > - Is there anything I am doing wrong when creating DMPlex using the > below mesh with this test? > - What is the order of vertex index (of own element) going into > DMPlexCreateFromCellListParallelPetsc function? > - What is the order of own vertex coordinates going into > DMPlexCreateFromCellListParallelPetsc function? > > > > Here is a detailed descriptions of what I did with ex18.c to show my > issue: src/dm/impls/plex/tests/ex18.c > > - A simple 2D square mesh with 5 vertices and 4 triangles, shared by 2 > MPI ranks (see the below image for the mesh, also attached file). > - Each MPI rank owns 2 elements. > - MPI rank 0 owns vertex indexed as: 0, 1, 3, 4; rank 1 owns vertex > indexed as: 2. > - Modified code (with updated mesh information) is attached. > - The DM view output is also attached. > - Note: the original test with 2 triangles and 2 elements was running > fine. > > > > > > Using this new mesh, the test crashed with the following error message: > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Mesh cell 1 of type triangle is inverted, |J| = 0. > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!Z4nwRjrsrMEicB_kYCgaYAZ-kDyLwUVzbEsc7URFCva6KyGkrF5e8yjb_GiisPeXrchhomiHO0uaDLPI91Szcc0$ > > for trouble shooting. > [0]PETSC ERROR: PETSc Release Version 3.24.0, unknown > [0]PETSC ERROR: ./ex18 with 2 MPI process(es) and PETSC_ARCH > arch-centos_kokkos on boltzmann2 by zhangc20 Mon Oct 6 14:00:48 2025 > [0]PETSC ERROR: Configure options: --with-cc=mpicc --with-cxx=mpicxx > --with-fc=mpif90 --with-shared-libraries=1 --with-debugging=no > --COPTFLAGS="-g -O2 -fPIC" --CXXOPTFLAGS="-g -O2 -fPIC" --FOPTFLAGS="-g -O2 > -fPIC" --with-cuda=1 --with-cuda-arch=86 --with-cudac=nvcc --with-kokkos=1 > --with-kokkos-dir=./../install/kokkos/install/ --with-kokkos-kernels=1 > --with-kokkos-kernels-dir=./../install/kokkos-kernels/install/ > --download-metis --download-parmetis --download-fblaslapack=1 > --download-triangle --with-make-np=8 PETSC_ARCH=arch-centos_kokkos > [0]PETSC ERROR: #1 DMPlexCheckGeometry() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9630 > [0]PETSC ERROR: #2 DMPlexCheck() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9807 > [0]PETSC ERROR: #3 DMSetFromOptions_NonRefinement_Plex() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5121 > [0]PETSC ERROR: #4 DMSetFromOptions_Plex() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5601 > [0]PETSC ERROR: #5 DMSetFromOptions() at > /hdd1/dsmc/comet/comet/petsc/src/dm/interface/dm.c:907 > [0]PETSC ERROR: #6 CreateMesh() at ex18.c:811 > [0]PETSC ERROR: #7 main() at ex18.c:1527 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_plex_check_all 1 (source: command line) > [0]PETSC ERROR: -dm_view ascii:dm_mesh.txt:ascii_info_detail (source: > command line) > [0]PETSC ERROR: -interpolate create (source: command line) > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > Abort(62) on node 0 (rank 0 in comm 16): application called > MPI_Abort(MPI_COMM_SELF, 62) - 
process 0 > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Invalid argument > [1]PETSC ERROR: Mesh cell 0 of type triangle is inverted, |J| = -0.25 > [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!Z4nwRjrsrMEicB_kYCgaYAZ-kDyLwUVzbEsc7URFCva6KyGkrF5e8yjb_GiisPeXrchhomiHO0uaDLPI91Szcc0$ > > for trouble shooting. > [1]PETSC ERROR: PETSc Release Version 3.24.0, unknown > [1]PETSC ERROR: ./ex18 with 2 MPI process(es) and PETSC_ARCH > arch-centos_kokkos on boltzmann2 by zhangc20 Mon Oct 6 14:00:48 2025 > [1]PETSC ERROR: Configure options: --with-cc=mpicc --with-cxx=mpicxx > --with-fc=mpif90 --with-shared-libraries=1 --with-debugging=no > --COPTFLAGS="-g -O2 -fPIC" --CXXOPTFLAGS="-g -O2 -fPIC" --FOPTFLAGS="-g -O2 > -fPIC" --with-cuda=1 --with-cuda-arch=86 --with-cudac=nvcc --with-kokkos=1 > --with-kokkos-dir=./../install/kokkos/install/ --with-kokkos-kernels=1 > --with-kokkos-kernels-dir=./../install/kokkos-kernels/install/ > --download-metis --download-parmetis --download-fblaslapack=1 > --download-triangle --with-make-np=8 PETSC_ARCH=arch-centos_kokkos > [1]PETSC ERROR: #1 DMPlexCheckGeometry() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9630 > [1]PETSC ERROR: #2 DMPlexCheck() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9807 > [1]PETSC ERROR: #3 DMSetFromOptions_NonRefinement_Plex() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5121 > [1]PETSC ERROR: #4 DMSetFromOptions_Plex() at > /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5601 > [1]PETSC ERROR: #5 DMSetFromOptions() at > /hdd1/dsmc/comet/comet/petsc/src/dm/interface/dm.c:907 > [1]PETSC ERROR: #6 CreateMesh() at ex18.c:811 > [1]PETSC ERROR: #7 main() at ex18.c:1527 > [1]PETSC ERROR: PETSc Option Table entries: > [1]PETSC ERROR: -dm_plex_check_all 1 (source: command line) > [1]PETSC ERROR: -dm_view ascii:dm_mesh.txt:ascii_info_detail (source: > command line) > [1]PETSC ERROR: -interpolate create (source: command line) > [1]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > Abort(62) on node 1 (rank 0 in comm 16): application called > MPI_Abort(MPI_COMM_SELF, 62) - process 0 > > > > Run script used: > > mpirun -np 2 ./ex18 -dm_plex_check_all 1 -dm_view > ascii:dm_mesh.txt:ascii_info_detail -interpolate create > > > > Thanks, > > Chonglin > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 454644 bytes Desc: not available URL: From chonglin.zhang at und.edu Mon Oct 6 19:38:27 2025 From: chonglin.zhang at und.edu (Zhang, Chonglin) Date: Tue, 7 Oct 2025 00:38:27 +0000 Subject: [petsc-users] Help with DMPlexCreateFromCellListParallelPetsc In-Reply-To: References: Message-ID: Hi Mark, Thank you for your help with this missing information. This solved the issue. 
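For the record, a minimal sketch of renumbered input that follows this rule on exactly two ranks (the connectivity and coordinates below are invented for illustration, a unit square with a centre vertex, since the attached figure is not reproduced in this archive; the argument list should be checked against the DMPlexCreateFromCellListParallelPetsc manual page for the PETSc version in use):

  /* Global vertex numbering chosen so each rank owns a consecutive chunk:
     rank 0 owns vertices 0..3, rank 1 owns vertex 4.  Cells list global vertex numbers. */
  PetscMPIInt rank;
  DM          dm;

  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);

  const PetscInt  cells0[]  = {0, 1, 3,  1, 4, 3};                     /* 2 triangles on rank 0 */
  const PetscReal coords0[] = {0., 0.,  1., 0.,  0., 1.,  0.5, 0.5};   /* vertices 0..3         */
  const PetscInt  cells1[]  = {4, 2, 3,  2, 0, 3};                     /* 2 triangles on rank 1 */
  const PetscReal coords1[] = {1., 1.};                                /* vertex 4              */

  DMPlexCreateFromCellListParallelPetsc(PETSC_COMM_WORLD, 2 /* dim */, 2 /* numCells */,
                                        rank ? 1 : 4 /* owned numVertices */, 5 /* NVertices */,
                                        3 /* numCorners */, PETSC_TRUE /* interpolate */,
                                        rank ? cells1 : cells0, 2 /* spaceDim */,
                                        rank ? coords1 : coords0, NULL, NULL, &dm);

Each rank may reference any global vertex in its cells; only the coordinates of the consecutively numbered vertices it owns are supplied.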
Looking at the manual again for DMPlexBuildFromCellListParallel (internal function being called, https://urldefense.us/v3/__https://petsc.org/release/manualpages/DMPlex/DMPlexBuildFromCellListParallel/__;!!G_uCfscf7eWS!fmyMzLlbHCS_e22XlTJPzQcOaVEWo_GQ0mCMiB_G5VBovdxOfFs0mqg7g_0rox9TP8iRN_VBCeL74-xIX_Zf1aPcEnI03ePX$ ), it did state that: ?Each process owns a chunk of numVertices consecutive vertices.?, which may be implying what you said, and I missed that important information. Thanks, Chonglin From: Mark Adams Date: Monday, October 6, 2025 at 5:27?PM To: Zhang, Chonglin Cc: PETSc User Mailing List Subject: Re: [petsc-users] Help with DMPlexCreateFromCellListParallelPetsc I think you need to reorder your points according to each processor. If process 0 has 4 vertices, they will be numbered (0, 1, 2, 3), and if proc 1 has one vertex it will be 4. So, to fix this, switch vertex numbering for 2 and 4. Mark On Mon, Oct 6, 2025 at 3:40?PM Zhang, Chonglin > wrote: Dear PETSc developers, I have some questions on the proper use of DMPlexCreateFromCellListParallelPetsc function (https://urldefense.us/v3/__https://petsc.org/release/manualpages/DMPlex/DMPlexCreateFromCellListParallelPetsc/__;!!G_uCfscf7eWS!fmyMzLlbHCS_e22XlTJPzQcOaVEWo_GQ0mCMiB_G5VBovdxOfFs0mqg7g_0rox9TP8iRN_VBCeL74-xIX_Zf1aPcEqZ4IMoI$ ). I am upgrading my code?s PETSc dependency from v3.16.6 to v3.24.0 (and v3.23.3, v3.23.6). I encountered crash with DMPlexCreateFromCellListParallelPetsc function: ? To show the crash, I modified the following test: src/dm/impls/plex/tests/ex18.c ? Is there anything I am doing wrong when creating DMPlex using the below mesh with this test? ? What is the order of vertex index (of own element) going into DMPlexCreateFromCellListParallelPetsc function? ? What is the order of own vertex coordinates going into DMPlexCreateFromCellListParallelPetsc function? Here is a detailed descriptions of what I did with ex18.c to show my issue: src/dm/impls/plex/tests/ex18.c ? A simple 2D square mesh with 5 vertices and 4 triangles, shared by 2 MPI ranks (see the below image for the mesh, also attached file). ? Each MPI rank owns 2 elements. ? MPI rank 0 owns vertex indexed as: 0, 1, 3, 4; rank 1 owns vertex indexed as: 2. ? Modified code (with updated mesh information) is attached. ? The DM view output is also attached. ? Note: the original test with 2 triangles and 2 elements was running fine. [cid:ii_199bb9d3c9b4ce8e91] Using this new mesh, the test crashed with the following error message: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Mesh cell 1 of type triangle is inverted, |J| = 0. [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!fmyMzLlbHCS_e22XlTJPzQcOaVEWo_GQ0mCMiB_G5VBovdxOfFs0mqg7g_0rox9TP8iRN_VBCeL74-xIX_Zf1aPcEix_gsiw$ for trouble shooting. 
[0]PETSC ERROR: PETSc Release Version 3.24.0, unknown [0]PETSC ERROR: ./ex18 with 2 MPI process(es) and PETSC_ARCH arch-centos_kokkos on boltzmann2 by zhangc20 Mon Oct 6 14:00:48 2025 [0]PETSC ERROR: Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-shared-libraries=1 --with-debugging=no --COPTFLAGS="-g -O2 -fPIC" --CXXOPTFLAGS="-g -O2 -fPIC" --FOPTFLAGS="-g -O2 -fPIC" --with-cuda=1 --with-cuda-arch=86 --with-cudac=nvcc --with-kokkos=1 --with-kokkos-dir=./../install/kokkos/install/ --with-kokkos-kernels=1 --with-kokkos-kernels-dir=./../install/kokkos-kernels/install/ --download-metis --download-parmetis --download-fblaslapack=1 --download-triangle --with-make-np=8 PETSC_ARCH=arch-centos_kokkos [0]PETSC ERROR: #1 DMPlexCheckGeometry() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9630 [0]PETSC ERROR: #2 DMPlexCheck() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9807 [0]PETSC ERROR: #3 DMSetFromOptions_NonRefinement_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5121 [0]PETSC ERROR: #4 DMSetFromOptions_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5601 [0]PETSC ERROR: #5 DMSetFromOptions() at /hdd1/dsmc/comet/comet/petsc/src/dm/interface/dm.c:907 [0]PETSC ERROR: #6 CreateMesh() at ex18.c:811 [0]PETSC ERROR: #7 main() at ex18.c:1527 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -dm_plex_check_all 1 (source: command line) [0]PETSC ERROR: -dm_view ascii:dm_mesh.txt:ascii_info_detail (source: command line) [0]PETSC ERROR: -interpolate create (source: command line) [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 0 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Mesh cell 0 of type triangle is inverted, |J| = -0.25 [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!fmyMzLlbHCS_e22XlTJPzQcOaVEWo_GQ0mCMiB_G5VBovdxOfFs0mqg7g_0rox9TP8iRN_VBCeL74-xIX_Zf1aPcEix_gsiw$ for trouble shooting. 
[1]PETSC ERROR: PETSc Release Version 3.24.0, unknown [1]PETSC ERROR: ./ex18 with 2 MPI process(es) and PETSC_ARCH arch-centos_kokkos on boltzmann2 by zhangc20 Mon Oct 6 14:00:48 2025 [1]PETSC ERROR: Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-shared-libraries=1 --with-debugging=no --COPTFLAGS="-g -O2 -fPIC" --CXXOPTFLAGS="-g -O2 -fPIC" --FOPTFLAGS="-g -O2 -fPIC" --with-cuda=1 --with-cuda-arch=86 --with-cudac=nvcc --with-kokkos=1 --with-kokkos-dir=./../install/kokkos/install/ --with-kokkos-kernels=1 --with-kokkos-kernels-dir=./../install/kokkos-kernels/install/ --download-metis --download-parmetis --download-fblaslapack=1 --download-triangle --with-make-np=8 PETSC_ARCH=arch-centos_kokkos [1]PETSC ERROR: #1 DMPlexCheckGeometry() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9630 [1]PETSC ERROR: #2 DMPlexCheck() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plex.c:9807 [1]PETSC ERROR: #3 DMSetFromOptions_NonRefinement_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5121 [1]PETSC ERROR: #4 DMSetFromOptions_Plex() at /hdd1/dsmc/comet/comet/petsc/src/dm/impls/plex/plexcreate.c:5601 [1]PETSC ERROR: #5 DMSetFromOptions() at /hdd1/dsmc/comet/comet/petsc/src/dm/interface/dm.c:907 [1]PETSC ERROR: #6 CreateMesh() at ex18.c:811 [1]PETSC ERROR: #7 main() at ex18.c:1527 [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -dm_plex_check_all 1 (source: command line) [1]PETSC ERROR: -dm_view ascii:dm_mesh.txt:ascii_info_detail (source: command line) [1]PETSC ERROR: -interpolate create (source: command line) [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 1 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 Run script used: mpirun -np 2 ./ex18 -dm_plex_check_all 1 -dm_view ascii:dm_mesh.txt:ascii_info_detail -interpolate create Thanks, Chonglin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 454644 bytes Desc: image001.jpg URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Oct 7 03:12:44 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 7 Oct 2025 08:12:44 +0000 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev>, Message-ID: <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. Elena ________________________________ From: Mark Adams Sent: 01 October 2025 13:25:14 To: Barry Smith Cc: Moral Sanchez, Elena; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. On Tue, Sep 30, 2025 at 9:27?AM Barry Smith > wrote: Would you be able to share your code? I'm at a loss as to why we are seeing this behavior and can much more quickly figure it out by running the code in a debugger. 
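(For completeness: the level solvers accept the usual KSP tolerance options under their own prefix, so the intended smoother tolerances can be stated explicitly on the command line, for example

  -mg_levels_ksp_max_it 15 -mg_levels_ksp_rtol 1e-1 -mg_levels_ksp_atol 1e-9
  -mg_levels_ksp_norm_type unpreconditioned -mg_levels_ksp_converged_reason

Whether the smoother then stops early on those tolerances, or always reports CONVERGED_ITS after max_it as in the output above, depends on the convergence test the PCMG level KSP has been set up with, which is what the -ksp_view and -mg_levels_ksp_converged_reason output should make visible.)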
Barry You can send the code petsc-maint at mcs.anl.gov if you don't want to share the code with everyone, On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena > wrote: This is what I get: Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.249726733143e+00 Residual norms for mg_levels_1_ solve. 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.433120400946e+00 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 2 KSP Residual norm 1.169262560123e+00 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 3 KSP Residual norm 1.323528716607e+00 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 4 KSP Residual norm 5.006323254234e-01 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 5 KSP Residual norm 3.569836784785e-01 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 6 KSP Residual norm 2.493182937513e-01 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 7 KSP Residual norm 3.038202502298e-01 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 8 KSP Residual norm 2.780214194402e-01 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 9 KSP Residual norm 1.676826341491e-01 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 10 KSP Residual norm 1.209985378713e-01 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 11 KSP Residual norm 9.445076689969e-02 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 12 KSP Residual norm 8.308555284580e-02 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 13 KSP Residual norm 5.472865592585e-02 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 14 KSP Residual norm 4.357870564398e-02 14 KSP unpreconditioned resid norm 4.357870564398e-02 true resid norm 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 15 KSP Residual norm 5.079681292439e-02 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 5.079681292439e-02 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 1 KSP Residual norm 2.934938644003e-02 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 2 KSP Residual norm 3.257065831294e-02 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 3 KSP Residual norm 4.143063876867e-02 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 4 KSP Residual norm 4.822471409489e-02 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 5 KSP Residual norm 3.197538246153e-02 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 6 KSP Residual norm 3.461217019835e-02 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 7 KSP Residual norm 3.410193775327e-02 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 8 KSP Residual norm 4.690424294464e-02 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 9 KSP Residual norm 3.366148892800e-02 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 10 KSP Residual norm 4.068015727689e-02 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 11 KSP Residual norm 2.658836123104e-02 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 12 KSP Residual norm 2.826244186003e-02 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 13 KSP Residual norm 2.981793619508e-02 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 14 KSP Residual norm 3.525455091450e-02 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 15 KSP Residual norm 2.331539121838e-02 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm 2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.421498365806e-02 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.761072112362e-02 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 2 KSP Residual norm 1.400842489042e-02 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 3 KSP Residual norm 1.419665483348e-02 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 4 KSP Residual norm 1.617590701667e-02 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 5 KSP Residual norm 1.354824081005e-02 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 6 KSP Residual norm 1.387252917475e-02 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 7 KSP Residual norm 1.514043102087e-02 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 8 KSP Residual norm 1.275811124745e-02 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 9 KSP Residual norm 1.241039155981e-02 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 10 KSP Residual norm 9.585207801652e-03 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 11 KSP Residual norm 9.022641230732e-03 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 12 KSP Residual norm 1.187709152046e-02 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 13 KSP Residual norm 1.084880112494e-02 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 14 KSP Residual norm 8.194750346781e-03 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 15 KSP Residual norm 7.614246199165e-03 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 7.614246199165e-03 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 1 KSP Residual norm 5.620014684145e-03 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 2 KSP Residual norm 6.643368363907e-03 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 3 KSP Residual norm 8.708642393659e-03 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 4 KSP Residual norm 6.401852907459e-03 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 5 KSP Residual norm 7.230576215262e-03 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 6 KSP Residual norm 6.204081601285e-03 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 7 KSP Residual norm 7.038656665944e-03 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 8 KSP Residual norm 7.194079694050e-03 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 9 KSP Residual norm 6.353576889135e-03 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 10 KSP Residual norm 7.313589502731e-03 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 11 KSP Residual norm 6.643320423193e-03 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 12 KSP Residual norm 7.235443182108e-03 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 13 KSP Residual norm 4.971292307201e-03 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 14 KSP Residual norm 5.357933842147e-03 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 15 KSP Residual norm 5.841682994497e-03 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Cheers, Elena ________________________________ From: Barry Smith > Sent: 29 September 2025 20:31:26 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Thanks. I missed something earlier in the KSPView using UNPRECONDITIONED norm type for convergence test Please add the options -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual It is using the unpreconditioned residual norms for convergence testing but we are printing the preconditioned norms. Barry On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena > wrote: This is the output: Residual norms for mg_levels_1_ solve. 
0 KSP Residual norm 2.249726733143e+00 1 KSP Residual norm 1.433120400946e+00 2 KSP Residual norm 1.169262560123e+00 3 KSP Residual norm 1.323528716607e+00 4 KSP Residual norm 5.006323254234e-01 5 KSP Residual norm 3.569836784785e-01 6 KSP Residual norm 2.493182937513e-01 7 KSP Residual norm 3.038202502298e-01 8 KSP Residual norm 2.780214194402e-01 9 KSP Residual norm 1.676826341491e-01 10 KSP Residual norm 1.209985378713e-01 11 KSP Residual norm 9.445076689969e-02 12 KSP Residual norm 8.308555284580e-02 13 KSP Residual norm 5.472865592585e-02 14 KSP Residual norm 4.357870564398e-02 15 KSP Residual norm 5.079681292439e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 5.079681292439e-02 1 KSP Residual norm 2.934938644003e-02 2 KSP Residual norm 3.257065831294e-02 3 KSP Residual norm 4.143063876867e-02 4 KSP Residual norm 4.822471409489e-02 5 KSP Residual norm 3.197538246153e-02 6 KSP Residual norm 3.461217019835e-02 7 KSP Residual norm 3.410193775327e-02 8 KSP Residual norm 4.690424294464e-02 9 KSP Residual norm 3.366148892800e-02 10 KSP Residual norm 4.068015727689e-02 11 KSP Residual norm 2.658836123104e-02 12 KSP Residual norm 2.826244186003e-02 13 KSP Residual norm 2.981793619508e-02 14 KSP Residual norm 3.525455091450e-02 15 KSP Residual norm 2.331539121838e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.421498365806e-02 1 KSP Residual norm 1.761072112362e-02 2 KSP Residual norm 1.400842489042e-02 3 KSP Residual norm 1.419665483348e-02 4 KSP Residual norm 1.617590701667e-02 5 KSP Residual norm 1.354824081005e-02 6 KSP Residual norm 1.387252917475e-02 7 KSP Residual norm 1.514043102087e-02 8 KSP Residual norm 1.275811124745e-02 9 KSP Residual norm 1.241039155981e-02 10 KSP Residual norm 9.585207801652e-03 11 KSP Residual norm 9.022641230732e-03 12 KSP Residual norm 1.187709152046e-02 13 KSP Residual norm 1.084880112494e-02 14 KSP Residual norm 8.194750346781e-03 15 KSP Residual norm 7.614246199165e-03 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 7.614246199165e-03 1 KSP Residual norm 5.620014684145e-03 2 KSP Residual norm 6.643368363907e-03 3 KSP Residual norm 8.708642393659e-03 4 KSP Residual norm 6.401852907459e-03 5 KSP Residual norm 7.230576215262e-03 6 KSP Residual norm 6.204081601285e-03 7 KSP Residual norm 7.038656665944e-03 8 KSP Residual norm 7.194079694050e-03 9 KSP Residual norm 6.353576889135e-03 10 KSP Residual norm 7.313589502731e-03 11 KSP Residual norm 6.643320423193e-03 12 KSP Residual norm 7.235443182108e-03 13 KSP Residual norm 4.971292307201e-03 14 KSP Residual norm 5.357933842147e-03 15 KSP Residual norm 5.841682994497e-03 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 ________________________________ From: Barry Smith > Sent: 29 September 2025 15:56:33 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level I asked you to run with -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason you chose not to, delaying the process of understanding what is happening. Please run with those options and send the output. 
My guess is that you are computing the "residual norms" in your own monitor code, and it is doing so differently than what PETSc does, thus resulting in the appearance of a sufficiently small residual norm, whereas PETSc may not have calculated something that small. Barry On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena > wrote: Thanks for the hint. I agree that the coarse solve should be much more "accurate". However, for the moment I am just trying to understand what the MG is doing exactly. I am puzzled to see that the fine grid smoother ("lvl 0") does not stop when the residual becomes less than 1e-1. It should converge due to the atol. ________________________________ From: Mark Adams > Sent: 29 September 2025 14:20:56 To: Moral Sanchez, Elena Cc: Barry Smith; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Oh I see the coarse grid solver in your full solver output now. You still want an accurate coarse grid solve. Usually (the default in GAMG) you use a direct solver on one process, and cousin until the coarse grid is small enough to make that cheap. On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena > wrote: Hi, I doubled the system size and changed the tolerances just to show a better example of the problem. This is the output of the callbacks in the first iteration: CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 4.07e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s MG 
lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 4 CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 9.02e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 CG ConvergedReason: -3 For completeness, I add here the -ksp_view of the whole solver: KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=1, nonzero initial guess tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: mg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level 0 ------------------------------- KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=15, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=15, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=884, cols=884 Python: Solver_petsc.LeastSquaresOperator Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=884, cols=884 Python: Solver_petsc.LeastSquaresOperator Regarding Mark's Email: What do you mean with "the whole solver doesn't have a coarse grid"? I am using my own Restriction and Interpolation operators. Thanks for the help, Elena ________________________________ From: Mark Adams > Sent: 28 September 2025 20:13:54 To: Barry Smith Cc: Moral Sanchez, Elena; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Not sure why your "whole"solver does not have a coarse grid but this is wrong: KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 The coarse grid has to be accurate. The defaults are a good place to start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) On Fri, Sep 26, 2025 at 3:21?PM Barry Smith > wrote: Looks reasonable. 
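As a concrete illustration of Mark's point about the coarse solve (these are not options taken from this thread, only an indicative sketch): with the mg_coarse_ prefix the coarse KSP can simply be given tolerances close to the PETSc defaults, e.g.

   -mg_coarse_ksp_rtol 1e-5
   -mg_coarse_ksp_atol 1e-30
   -mg_coarse_ksp_max_it 10000

or, if an assembled coarse matrix were available instead of the python shell operator used here, a direct coarse solve with -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu. The exact values are illustrative only.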
Send the output running with -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena > wrote: Dear Barry, This is -ksp_view for the smoother at the finest level: KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=10, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator And at the coarsest level: KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=344, cols=344 Python: Solver_petsc.LeastSquaresOperator And for the whole solver: KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=100, nonzero initial guess tolerances: relative=1e-08, absolute=1e-09, divergence=10000. left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: mg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level 0 ------------------------------- KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=344, cols=344 Python: Solver_petsc.LeastSquaresOperator Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=10, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Best, Elena ________________________________ From: Barry Smith > Sent: 26 September 2025 19:05:02 To: Moral Sanchez, Elena Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Send the output using -ksp_view Normally one uses a fixed number of iterations of smoothing on level with multigrid rather than a tolerance, but yes PETSc should respect such a tolerance. Barry On Sep 26, 2025, at 12:49?PM, Moral Sanchez, Elena > wrote: Hi, I am using multigrid (multiplicative) as a preconditioner with a V-cycle of two levels. At each level, I am setting CG as the smoother with certain tolerance. 
What I observe is that in the finest level the CG continues iterating after the residual norm reaches the tolerance (atol) and it only stops when reaching the maximum number of iterations at that level. At the coarsest level this does not occur and the CG stops when the tolerance is reached. I double-checked that the smoother at the finest level has the right tolerance. And I am using a Monitor function to track the residual. Do you know how to make the smoother at the finest level stop when reaching the tolerance? Cheers, Elena. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Tue Oct 7 08:24:06 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Tue, 7 Oct 2025 15:24:06 +0200 Subject: [petsc-users] Advice on creating vectors defined on lower dimensional manifolds of a DMPlex In-Reply-To: References: Message-ID: <05c6b4b6-2da0-4060-82b0-9c8b55da35fd@unibas.it> On 10/6/25 20:10, Matthew Knepley wrote: > On Mon, Oct 6, 2025 at 1:11?PM Aldo Bonfiglioli > wrote: > > Dear all, > > what is the best approach for defining vectors that "sit" on the > (vertices and/or faces) of a given stratum of the "Face Sets" of a > DMPlex? > >> DM Object: 3D plex 1 MPI process >> ?type: plex >> 3D plex in 3 dimensions: >> ?Number of 0-cells per rank: 9261 >> ?Number of 1-cells per rank: 59660 >> ?Number of 2-cells per rank: 98400 >> ?Number of 3-cells per rank: 48000 >> Labels: >> ?marker: 1 strata with value/size (1 (14402)) >> ?celltype: 4 strata with value/size (0 (9261), 1 (59660), 3 >> (98400), 6 (48000)) >> ?depth: 4 strata with value/size (0 (9261), 1 (59660), 2 (98400), >> 3 (48000)) >> ?Face Sets: 6 strata with value/size (1 (800), 2 (800), 3 (800), >> 4 (800), 5 (800), 6 (800)) >> > These vectors are going to be used (for example) to store stresses > and heat flux on solid surfaces. > > To be more specific: suppose stratum 3 of the "Face Sets" is a > solid wall. > > I want to create a vector that that stores quantities computed on > the (800) faces of that wall OR the vertices of that wall. > > > It should be simple to just create such vectors. You request a submesh > using that label > > ? DM subdm; > ? DMLabel label, sublabel; > > ? DMGetLabel(dm, "Face Sets", &label); > ? DMLabelDuplicate(label, &sublabel); > ? DMPlexLabelComplete(dm, sublabel); > ? DMPlexCreateSubmesh(dm, sublabel, 3, PETSC_TRUE, &subdm) > ? DMLabelDestroy(&sublabel); > > Now you can define a PetscFE?over this submesh in the same way as any > other mesh. Moreover, the subdm contains a mapping back to the > original DM, from which you can create a mapping of dofs, so that you > can inject the subvector into a larger field if you wish. > > If you want to use fields on submeshes inside a PetscDS, so that the > Plex manages the solve, the procedure is slightly different, but I can > detail it if you want. > > ? Thanks, > > ? ? ?Matt > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Yk-uGKnWGfd4b87jjp07cII-ta6Rld0p_kmImii1ZuYnG2DCFwjpUQkeQ8z3mrWOHPd6-oKoBkZsW8rkoKSWFblHNxPyamZ0CHI$ > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yk-uGKnWGfd4b87jjp07cII-ta6Rld0p_kmImii1ZuYnG2DCFwjpUQkeQ8z3mrWOHPd6-oKoBkZsW8rkoKSWFblHNxPykk2-vZU$ > Matthew, I followed your suggestions, but I face a deadlock when the enclosed demonstrator is run in parallel (for the attached dotfile, deadlock occurs when nproc = 3). I suspect it might be due to the fact that in a parallel environment the various strata of the "Face Sets" are not necessarily available to all processes. Therefore, not all processes are going to call DMPlexCreateSubmesh with a given "value". The attached piece of code links with petsc-3.24.0. Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Yk-uGKnWGfd4b87jjp07cII-ta6Rld0p_kmImii1ZuYnG2DCFwjpUQkeQ8z3mrWOHPd6-oKoBkZsW8rkoKSWFblHNxPyamZ0CHI$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: rgmsh.F90 Type: text/x-fortran Size: 8893 bytes Desc: not available URL: -------------- next part -------------- -dm_plex_dim 3 -dm_plex_shape box -dm_plex_box_faces 20,20,20 -dm_plex_box_lower 0.,0.,0. -dm_plex_box_upper 1.,1.,1. ##-dm_plex_filename cube6.msh -dm_plex_simplex true -dm_plex_interpolate -dm_plex_check_all #-dm_view ##-dm_plex_view_labels "marker" ##-dm_plex_view_labels "Face Sets" -petscpartitioner_view ####-dm_petscsection_view -options_left From knepley at gmail.com Tue Oct 7 09:23:51 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 7 Oct 2025 10:23:51 -0400 Subject: [petsc-users] Advice on creating vectors defined on lower dimensional manifolds of a DMPlex In-Reply-To: <05c6b4b6-2da0-4060-82b0-9c8b55da35fd@unibas.it> References: <05c6b4b6-2da0-4060-82b0-9c8b55da35fd@unibas.it> Message-ID: On Tue, Oct 7, 2025 at 9:24?AM Aldo Bonfiglioli wrote: > On 10/6/25 20:10, Matthew Knepley wrote: > > On Mon, Oct 6, 2025 at 1:11?PM Aldo Bonfiglioli < > aldo.bonfiglioli at unibas.it> wrote: > >> Dear all, >> >> what is the best approach for defining vectors that "sit" on the >> (vertices and/or faces) of a given stratum of the "Face Sets" of a DMPlex? >> >> DM Object: 3D plex 1 MPI process >> type: plex >> 3D plex in 3 dimensions: >> Number of 0-cells per rank: 9261 >> Number of 1-cells per rank: 59660 >> Number of 2-cells per rank: 98400 >> Number of 3-cells per rank: 48000 >> Labels: >> marker: 1 strata with value/size (1 (14402)) >> celltype: 4 strata with value/size (0 (9261), 1 (59660), 3 (98400), 6 >> (48000)) >> depth: 4 strata with value/size (0 (9261), 1 (59660), 2 (98400), 3 >> (48000)) >> Face Sets: 6 strata with value/size (1 (800), 2 (800), 3 (800), 4 (800), >> 5 (800), 6 (800)) >> >> These vectors are going to be used (for example) to store stresses and >> heat flux on solid surfaces. >> >> To be more specific: suppose stratum 3 of the "Face Sets" is a solid >> wall. >> >> I want to create a vector that that stores quantities computed on the >> (800) faces of that wall OR the vertices of that wall. >> > > It should be simple to just create such vectors. 
You request a submesh > using that label > > DM subdm; > DMLabel label, sublabel; > > DMGetLabel(dm, "Face Sets", &label); > DMLabelDuplicate(label, &sublabel); > DMPlexLabelComplete(dm, sublabel); > DMPlexCreateSubmesh(dm, sublabel, 3, PETSC_TRUE, &subdm) > DMLabelDestroy(&sublabel); > > Now you can define a PetscFE over this submesh in the same way as any > other mesh. Moreover, the subdm contains a mapping back to the original DM, > from which you can create a mapping of dofs, so that you can inject the > subvector into a larger field if you wish. > > If you want to use fields on submeshes inside a PetscDS, so that the Plex > manages the solve, the procedure is slightly different, but I can detail it > if you want. > > Thanks, > > Matt > >> Thanks, >> >> Aldo >> >> -- >> Dr. Aldo Bonfiglioli >> Associate professor of Fluid Mechanics >> Dipartimento di Ingegneria >> Universita' della Basilicata >> V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY >> tel:+39.0971.205203 fax:+39.0971.205215 >> web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!dqmn8J_g7VGPOtEwYJpwJ4cJ8KKVW3j_qflz1ied-mBFEqfnmodJ-OkWqbHntOdvfcELJcGln4S5B_zmPsvc$ >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dqmn8J_g7VGPOtEwYJpwJ4cJ8KKVW3j_qflz1ied-mBFEqfnmodJ-OkWqbHntOdvfcELJcGln4S5B9LrRWeW$ > > > Matthew, > > I followed your suggestions, but I face a deadlock when the enclosed > demonstrator is run in parallel (for the attached dotfile, deadlock occurs > when nproc = 3). > > I suspect it might be due to the fact that in a parallel environment the > various strata of the "Face Sets" are not necessarily available to all > processes. > > Therefore, not all processes are going to call DMPlexCreateSubmesh with a > given "value". > > DMPlexCreateSubmesh() is collective. You have to call it with identical arguments, even if the result is empty on some process. Thanks, Matt > The attached piece of code links with petsc-3.24.0. > > Thanks, > > Aldo > > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!dqmn8J_g7VGPOtEwYJpwJ4cJ8KKVW3j_qflz1ied-mBFEqfnmodJ-OkWqbHntOdvfcELJcGln4S5B_zmPsvc$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dqmn8J_g7VGPOtEwYJpwJ4cJ8KKVW3j_qflz1ied-mBFEqfnmodJ-OkWqbHntOdvfcELJcGln4S5B9LrRWeW$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Oct 7 09:53:26 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 7 Oct 2025 10:53:26 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> Message-ID: I have to apologize again. What you are doing is so out of the ordinary (but there is nothing wrong with you doing it) that I totally lost this line of code PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); Please try the following, add the options -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned Barry > On Oct 7, 2025, at 4:12?AM, Moral Sanchez, Elena wrote: > > The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. > > Elena > > From: Mark Adams > > Sent: 01 October 2025 13:25:14 > To: Barry Smith > Cc: Moral Sanchez, Elena; petsc-users > Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level > > Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. > > On Tue, Sep 30, 2025 at 9:27?AM Barry Smith > wrote: >> >> Would you be able to share your code? I'm at a loss as to why we are seeing this behavior and can much more quickly figure it out by running the code in a debugger. >> >> Barry >> >> You can send the code petsc-maint at mcs.anl.gov if you don't want to share the code with everyone, >> >>> On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena > wrote: >>> >>> This is what I get: >>> Residual norms for mg_levels_1_ solve. >>> 0 KSP Residual norm 2.249726733143e+00 >>> Residual norms for mg_levels_1_ solve. 
>>> 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP Residual norm 1.433120400946e+00 >>> 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 >>> 2 KSP Residual norm 1.169262560123e+00 >>> 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 >>> 3 KSP Residual norm 1.323528716607e+00 >>> 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 >>> 4 KSP Residual norm 5.006323254234e-01 >>> 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 >>> 5 KSP Residual norm 3.569836784785e-01 >>> 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 >>> 6 KSP Residual norm 2.493182937513e-01 >>> 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 >>> 7 KSP Residual norm 3.038202502298e-01 >>> 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 >>> 8 KSP Residual norm 2.780214194402e-01 >>> 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 >>> 9 KSP Residual norm 1.676826341491e-01 >>> 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 >>> 10 KSP Residual norm 1.209985378713e-01 >>> 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 >>> 11 KSP Residual norm 9.445076689969e-02 >>> 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 >>> 12 KSP Residual norm 8.308555284580e-02 >>> 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 >>> 13 KSP Residual norm 5.472865592585e-02 >>> 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 >>> 14 KSP Residual norm 4.357870564398e-02 >>> 14 KSP unpreconditioned resid norm 4.357870564398e-02 true resid norm 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 >>> 15 KSP Residual norm 5.079681292439e-02 >>> 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 >>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>> Residual norms for mg_levels_1_ solve. >>> 0 KSP Residual norm 5.079681292439e-02 >>> Residual norms for mg_levels_1_ solve. 
>>> 0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 >>> 1 KSP Residual norm 2.934938644003e-02 >>> 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 >>> 2 KSP Residual norm 3.257065831294e-02 >>> 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 >>> 3 KSP Residual norm 4.143063876867e-02 >>> 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 >>> 4 KSP Residual norm 4.822471409489e-02 >>> 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 >>> 5 KSP Residual norm 3.197538246153e-02 >>> 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 >>> 6 KSP Residual norm 3.461217019835e-02 >>> 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 >>> 7 KSP Residual norm 3.410193775327e-02 >>> 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 >>> 8 KSP Residual norm 4.690424294464e-02 >>> 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 >>> 9 KSP Residual norm 3.366148892800e-02 >>> 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 >>> 10 KSP Residual norm 4.068015727689e-02 >>> 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 >>> 11 KSP Residual norm 2.658836123104e-02 >>> 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 >>> 12 KSP Residual norm 2.826244186003e-02 >>> 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 >>> 13 KSP Residual norm 2.981793619508e-02 >>> 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 >>> 14 KSP Residual norm 3.525455091450e-02 >>> 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 >>> 15 KSP Residual norm 2.331539121838e-02 >>> 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm 2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 >>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>> Residual norms for mg_levels_1_ solve. >>> 0 KSP Residual norm 2.421498365806e-02 >>> Residual norms for mg_levels_1_ solve. 
>>> 0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP Residual norm 1.761072112362e-02 >>> 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 >>> 2 KSP Residual norm 1.400842489042e-02 >>> 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 >>> 3 KSP Residual norm 1.419665483348e-02 >>> 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 >>> 4 KSP Residual norm 1.617590701667e-02 >>> 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 >>> 5 KSP Residual norm 1.354824081005e-02 >>> 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 >>> 6 KSP Residual norm 1.387252917475e-02 >>> 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 >>> 7 KSP Residual norm 1.514043102087e-02 >>> 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 >>> 8 KSP Residual norm 1.275811124745e-02 >>> 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 >>> 9 KSP Residual norm 1.241039155981e-02 >>> 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 >>> 10 KSP Residual norm 9.585207801652e-03 >>> 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 >>> 11 KSP Residual norm 9.022641230732e-03 >>> 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 >>> 12 KSP Residual norm 1.187709152046e-02 >>> 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 >>> 13 KSP Residual norm 1.084880112494e-02 >>> 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 >>> 14 KSP Residual norm 8.194750346781e-03 >>> 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 >>> 15 KSP Residual norm 7.614246199165e-03 >>> 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 >>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>> Residual norms for mg_levels_1_ solve. >>> 0 KSP Residual norm 7.614246199165e-03 >>> Residual norms for mg_levels_1_ solve. 
>>> 0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 >>> 1 KSP Residual norm 5.620014684145e-03 >>> 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 >>> 2 KSP Residual norm 6.643368363907e-03 >>> 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 >>> 3 KSP Residual norm 8.708642393659e-03 >>> 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 >>> 4 KSP Residual norm 6.401852907459e-03 >>> 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 >>> 5 KSP Residual norm 7.230576215262e-03 >>> 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 >>> 6 KSP Residual norm 6.204081601285e-03 >>> 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 >>> 7 KSP Residual norm 7.038656665944e-03 >>> 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 >>> 8 KSP Residual norm 7.194079694050e-03 >>> 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 >>> 9 KSP Residual norm 6.353576889135e-03 >>> 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 >>> 10 KSP Residual norm 7.313589502731e-03 >>> 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 >>> 11 KSP Residual norm 6.643320423193e-03 >>> 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 >>> 12 KSP Residual norm 7.235443182108e-03 >>> 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 >>> 13 KSP Residual norm 4.971292307201e-03 >>> 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 >>> 14 KSP Residual norm 5.357933842147e-03 >>> 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 >>> 15 KSP Residual norm 5.841682994497e-03 >>> 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 >>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>> Cheers, >>> Elena >>> From: Barry Smith > >>> Sent: 29 September 2025 20:31:26 >>> To: Moral Sanchez, Elena >>> Cc: Mark Adams; petsc-users >>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>> >>> >>> Thanks. I missed something earlier in the KSPView >>> >>>>> using UNPRECONDITIONED norm type for convergence test >>> >>> Please add the options >>> >>>>>>> -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual >>> >>> It is using the unpreconditioned residual norms for convergence testing but we are printing the preconditioned norms. >>> >>> Barry >>> >>> >>>> On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena > wrote: >>>> >>>> This is the output: >>>> Residual norms for mg_levels_1_ solve. 
>>>> 0 KSP Residual norm 2.249726733143e+00 >>>> 1 KSP Residual norm 1.433120400946e+00 >>>> 2 KSP Residual norm 1.169262560123e+00 >>>> 3 KSP Residual norm 1.323528716607e+00 >>>> 4 KSP Residual norm 5.006323254234e-01 >>>> 5 KSP Residual norm 3.569836784785e-01 >>>> 6 KSP Residual norm 2.493182937513e-01 >>>> 7 KSP Residual norm 3.038202502298e-01 >>>> 8 KSP Residual norm 2.780214194402e-01 >>>> 9 KSP Residual norm 1.676826341491e-01 >>>> 10 KSP Residual norm 1.209985378713e-01 >>>> 11 KSP Residual norm 9.445076689969e-02 >>>> 12 KSP Residual norm 8.308555284580e-02 >>>> 13 KSP Residual norm 5.472865592585e-02 >>>> 14 KSP Residual norm 4.357870564398e-02 >>>> 15 KSP Residual norm 5.079681292439e-02 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP Residual norm 5.079681292439e-02 >>>> 1 KSP Residual norm 2.934938644003e-02 >>>> 2 KSP Residual norm 3.257065831294e-02 >>>> 3 KSP Residual norm 4.143063876867e-02 >>>> 4 KSP Residual norm 4.822471409489e-02 >>>> 5 KSP Residual norm 3.197538246153e-02 >>>> 6 KSP Residual norm 3.461217019835e-02 >>>> 7 KSP Residual norm 3.410193775327e-02 >>>> 8 KSP Residual norm 4.690424294464e-02 >>>> 9 KSP Residual norm 3.366148892800e-02 >>>> 10 KSP Residual norm 4.068015727689e-02 >>>> 11 KSP Residual norm 2.658836123104e-02 >>>> 12 KSP Residual norm 2.826244186003e-02 >>>> 13 KSP Residual norm 2.981793619508e-02 >>>> 14 KSP Residual norm 3.525455091450e-02 >>>> 15 KSP Residual norm 2.331539121838e-02 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP Residual norm 2.421498365806e-02 >>>> 1 KSP Residual norm 1.761072112362e-02 >>>> 2 KSP Residual norm 1.400842489042e-02 >>>> 3 KSP Residual norm 1.419665483348e-02 >>>> 4 KSP Residual norm 1.617590701667e-02 >>>> 5 KSP Residual norm 1.354824081005e-02 >>>> 6 KSP Residual norm 1.387252917475e-02 >>>> 7 KSP Residual norm 1.514043102087e-02 >>>> 8 KSP Residual norm 1.275811124745e-02 >>>> 9 KSP Residual norm 1.241039155981e-02 >>>> 10 KSP Residual norm 9.585207801652e-03 >>>> 11 KSP Residual norm 9.022641230732e-03 >>>> 12 KSP Residual norm 1.187709152046e-02 >>>> 13 KSP Residual norm 1.084880112494e-02 >>>> 14 KSP Residual norm 8.194750346781e-03 >>>> 15 KSP Residual norm 7.614246199165e-03 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Residual norms for mg_levels_1_ solve. 
>>>> 0 KSP Residual norm 7.614246199165e-03 >>>> 1 KSP Residual norm 5.620014684145e-03 >>>> 2 KSP Residual norm 6.643368363907e-03 >>>> 3 KSP Residual norm 8.708642393659e-03 >>>> 4 KSP Residual norm 6.401852907459e-03 >>>> 5 KSP Residual norm 7.230576215262e-03 >>>> 6 KSP Residual norm 6.204081601285e-03 >>>> 7 KSP Residual norm 7.038656665944e-03 >>>> 8 KSP Residual norm 7.194079694050e-03 >>>> 9 KSP Residual norm 6.353576889135e-03 >>>> 10 KSP Residual norm 7.313589502731e-03 >>>> 11 KSP Residual norm 6.643320423193e-03 >>>> 12 KSP Residual norm 7.235443182108e-03 >>>> 13 KSP Residual norm 4.971292307201e-03 >>>> 14 KSP Residual norm 5.357933842147e-03 >>>> 15 KSP Residual norm 5.841682994497e-03 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> >>>> From: Barry Smith > >>>> Sent: 29 September 2025 15:56:33 >>>> To: Moral Sanchez, Elena >>>> Cc: Mark Adams; petsc-users >>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>> >>>> >>>> I asked you to run with >>>> >>>>>>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason >>>> >>>> you chose not to, delaying the process of understanding what is happening. >>>> >>>> Please run with those options and send the output. My guess is that you are computing the "residual norms" in your own monitor code, and it is doing so differently than what PETSc does, thus resulting in the appearance of a sufficiently small residual norm, whereas PETSc may not have calculated something that small. >>>> >>>> Barry >>>> >>>> >>>>> On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena > wrote: >>>>> >>>>> Thanks for the hint. I agree that the coarse solve should be much more "accurate". However, for the moment I am just trying to understand what the MG is doing exactly. >>>>> >>>>> I am puzzled to see that the fine grid smoother ("lvl 0") does not stop when the residual becomes less than 1e-1. It should converge due to the atol. >>>>> >>>>> From: Mark Adams > >>>>> Sent: 29 September 2025 14:20:56 >>>>> To: Moral Sanchez, Elena >>>>> Cc: Barry Smith; petsc-users >>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>> >>>>> Oh I see the coarse grid solver in your full solver output now. >>>>> You still want an accurate coarse grid solve. Usually (the default in GAMG) you use a direct solver on one process, and cousin until the coarse grid is small enough to make that cheap. >>>>> >>>>> On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena > wrote: >>>>>> Hi, I doubled the system size and changed the tolerances just to show a better example of the problem. 
This is the output of the callbacks in the first iteration: >>>>>> CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s >>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s >>>>>> ConvergedReason MG lvl 0: 4 >>>>>> MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s >>>>>> ConvergedReason MG lvl -1: 3 >>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s >>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s >>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 4.07e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s >>>>>> ConvergedReason MG lvl 0: 4 >>>>>> CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s >>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.02e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG 
Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s >>>>>> ConvergedReason MG lvl 0: 4 >>>>>> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s >>>>>> ConvergedReason MG lvl -1: 3 >>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s >>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s >>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s >>>>>> ConvergedReason MG lvl 0: 4 >>>>>> CG ConvergedReason: -3 >>>>>> >>>>>> For completeness, I add here the -ksp_view of the whole solver: >>>>>> KSP Object: 1 MPI process >>>>>> type: cg >>>>>> variant HERMITIAN >>>>>> maximum iterations=1, nonzero initial guess >>>>>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
>>>>>> left preconditioning >>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI process >>>>>> type: mg >>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>> Cycles per PCApply=1 >>>>>> Not using Galerkin computed coarse grid matrices >>>>>> Coarse grid solver -- level 0 ------------------------------- >>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>> type: cg >>>>>> variant HERMITIAN >>>>>> maximum iterations=15, nonzero initial guess >>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>> left preconditioning >>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>> type: none >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: python >>>>>> rows=524, cols=524 >>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>> type: cg >>>>>> variant HERMITIAN >>>>>> maximum iterations=15, nonzero initial guess >>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>> left preconditioning >>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>> type: none >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: python >>>>>> rows=884, cols=884 >>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI process >>>>>> type: python >>>>>> rows=884, cols=884 >>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>> >>>>>> Regarding Mark's Email: What do you mean with "the whole solver doesn't have a coarse grid"? I am using my own Restriction and Interpolation operators. >>>>>> Thanks for the help, >>>>>> Elena >>>>>> >>>>>> From: Mark Adams > >>>>>> Sent: 28 September 2025 20:13:54 >>>>>> To: Barry Smith >>>>>> Cc: Moral Sanchez, Elena; petsc-users >>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>> >>>>>> Not sure why your "whole"solver does not have a coarse grid but this is wrong: >>>>>> >>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>> type: cg >>>>>>> variant HERMITIAN >>>>>>> maximum iterations=100, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>> >>>>>>> The coarse grid has to be accurate. The defaults are a good place to start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) >>>>>> >>>>>> On Fri, Sep 26, 2025 at 3:21?PM Barry Smith > wrote: >>>>>>> Looks reasonable. 
Send the output running with >>>>>>> >>>>>>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason >>>>>>> >>>>>>>> On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena > wrote: >>>>>>>> >>>>>>>> Dear Barry, >>>>>>>> >>>>>>>> This is -ksp_view for the smoother at the finest level: >>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>> type: cg >>>>>>>> variant HERMITIAN >>>>>>>> maximum iterations=10, nonzero initial guess >>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>> left preconditioning >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>> type: none >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: python >>>>>>>> rows=524, cols=524 >>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>> And at the coarsest level: >>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>> type: cg >>>>>>>> variant HERMITIAN >>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>> left preconditioning >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>> type: none >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: python >>>>>>>> rows=344, cols=344 >>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>> And for the whole solver: >>>>>>>> KSP Object: 1 MPI process >>>>>>>> type: cg >>>>>>>> variant HERMITIAN >>>>>>>> maximum iterations=100, nonzero initial guess >>>>>>>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>> PC Object: 1 MPI process >>>>>>>> type: mg >>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>>> Cycles per PCApply=1 >>>>>>>> Not using Galerkin computed coarse grid matrices >>>>>>>> Coarse grid solver -- level 0 ------------------------------- >>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>> type: cg >>>>>>>> variant HERMITIAN >>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>> left preconditioning >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>> type: none >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: python >>>>>>>> rows=344, cols=344 >>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>> type: cg >>>>>>>> variant HERMITIAN >>>>>>>> maximum iterations=10, nonzero initial guess >>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>> left preconditioning >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>> type: none >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: python >>>>>>>> rows=524, cols=524 >>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI process >>>>>>>> type: python >>>>>>>> rows=524, cols=524 >>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>> Best, >>>>>>>> Elena 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> From: Barry Smith >
>>>>>>>> Sent: 26 September 2025 19:05:02
>>>>>>>> To: Moral Sanchez, Elena
>>>>>>>> Cc: petsc-users at mcs.anl.gov
>>>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Send the output using -ksp_view
>>>>>>>> 
>>>>>>>> Normally one uses a fixed number of iterations of smoothing on each level with multigrid rather than a tolerance, but yes, PETSc should respect such a tolerance.
>>>>>>>> 
>>>>>>>> Barry
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Sep 26, 2025, at 12:49 PM, Moral Sanchez, Elena > wrote:
>>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> I am using multigrid (multiplicative) as a preconditioner with a V-cycle of two levels. At each level, I am setting CG as the smoother with a certain tolerance.
>>>>>>>>> 
>>>>>>>>> What I observe is that at the finest level the CG continues iterating after the residual norm reaches the tolerance (atol), and it only stops when reaching the maximum number of iterations at that level. At the coarsest level this does not occur and the CG stops when the tolerance is reached.
>>>>>>>>> 
>>>>>>>>> I double-checked that the smoother at the finest level has the right tolerance. And I am using a Monitor function to track the residual.
>>>>>>>>> 
>>>>>>>>> Do you know how to make the smoother at the finest level stop when reaching the tolerance?
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Elena.

From matteo.semplice at uninsubria.it Fri Oct 10 08:48:35 2025
From: matteo.semplice at uninsubria.it (Matteo Semplice)
Date: Fri, 10 Oct 2025 15:48:35 +0200
Subject: [petsc-users] extract arbitrary subset of a DMDA
Message-ID: 

Dear all,

I am wondering if there is a way to extract a subset of a DMDA and use it as a mesh. The use case is to program a finite-difference method in which the domain is defined by a levelset function: if I could completely ignore the parts of the background DMDA that are "far away" from the object, I guess I would avoid some cores having almost no workload. I figure that I could set up a DMDA, load/compute the levelset on the entire box, then mark the nodes to be retained, extract the submesh and repartition it. I would also need a means to transfer some Vec data from the DMDA to the new mesh.

I guess that the extracted mesh would then become a DMPlex and it would not retain any DMDA flavour (like notions of which are the grid nodes sitting on top/bottom, left/right of a given node), right?

Thanks

    Matteo

-- 
Prof. Matteo Semplice
Università degli Studi dell'Insubria
Dipartimento di Scienza e Alta Tecnologia - DiSAT
Professore Associato
Via Valleggio, 11 - 22100 Como (CO) - Italia
tel.: +39 031 2386316

From rlmackie862 at gmail.com Fri Oct 10 09:25:24 2025
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Fri, 10 Oct 2025 07:25:24 -0700
Subject: [petsc-users] extract arbitrary subset of a DMDA
In-Reply-To: 
References: 
Message-ID: <3CA3DAB1-41DE-4DB5-B49C-2308BDC8537F@gmail.com>

Hi Matteo,

Take a look at these posts from a few years ago and see if they will help you:

https://lists.mcs.anl.gov/pipermail/petsc-users/2021-January/043037.html
https://lists.mcs.anl.gov/pipermail/petsc-users/2021-January/043043.html

We were able to use this approach to extract a sub-region of a DMDA, and it should be possible for you to do so as well.

Good luck,

Randy M.
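A generic way to pull the values of a box-shaped sub-region out of a DMDA global vector (a minimal sketch, not necessarily the code from the posts linked above; it assumes a dof=1 DMDA, and xs0/xe0, ys0/ye0, zs0/ze0 are hypothetical global index bounds of the wanted box, to be set by the caller) is to collect the natural indices you want, translate them through the DMDA's AO, and scatter:

  /* da is the DMDA, g a global vector on it; gather the box into a smaller parallel vector.
     xs0..ze0: global index bounds of the wanted box (set by the caller). */
  PetscInt   i, j, k, n = 0, Mx, My, xs, ys, zs, xm, ym, zm, *idx;
  AO         ao;
  IS         is;
  Vec        sub;
  VecScatter scat;

  PetscCall(DMDAGetInfo(da, NULL, &Mx, &My, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL));
  PetscCall(DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm));
  PetscCall(PetscMalloc1(xm * ym * zm, &idx));
  for (k = zs; k < zs + zm; k++)
    for (j = ys; j < ys + ym; j++)
      for (i = xs; i < xs + xm; i++)
        if (i >= xs0 && i < xe0 && j >= ys0 && j < ye0 && k >= zs0 && k < ze0)
          idx[n++] = i + j * Mx + k * Mx * My;           /* natural index of an owned node inside the box */
  PetscCall(DMDAGetAO(da, &ao));
  PetscCall(AOApplicationToPetsc(ao, n, idx));           /* natural -> PETSc global indices */
  PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, n, idx, PETSC_COPY_VALUES, &is));
  PetscCall(VecCreateMPI(PETSC_COMM_WORLD, n, PETSC_DETERMINE, &sub));
  PetscCall(VecScatterCreate(g, is, sub, NULL, &scat));  /* NULL: fill sub contiguously, rank by rank */
  PetscCall(VecScatterBegin(scat, g, sub, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(scat, g, sub, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterDestroy(&scat));
  PetscCall(ISDestroy(&is));
  PetscCall(PetscFree(idx));

All ranks execute this even when their owned portion of the box is empty (n = 0), since the IS, Vec, and scatter creations are collective.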
> On Oct 10, 2025, at 6:48?AM, Matteo Semplice via petsc-users wrote: > > Dear all, > > I am wondering if there is a way to extract a subset of a DMDA and use it as a mesh. The use case is to program a finite-difference method in which the domain is defined by a levelset function: if I could completely ignore the parts of the background DMDA that are "far away" from the object, I guess I would avoid some cores having almost no workload. I figure that I could setup a DMDA, load/compute the levelset on the entire box, then mark the nodes to be retained, extract the submesh and repartition it. I would also need a mean to transfer some Vec data from the DMDA to the new mesh. > > I guess that the extracted mesh would then become a DMPlex and it would not retain any DMDA flavour (like notions of which are the grid nodes sitting on top/bottom, left/right of a given node), right? > > Thanks > > Matteo > > -- > Prof. Matteo Semplice > Universit? degli Studi dell?Insubria > Dipartimento di Scienza e Alta Tecnologia ? DiSAT > Professore Associato > Via Valleggio, 11 ? 22100 Como (CO) ? Italia > tel.: +39 031 2386316 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Oct 10 09:39:23 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Oct 2025 10:39:23 -0400 Subject: [petsc-users] extract arbitrary subset of a DMDA In-Reply-To: References: Message-ID: On Fri, Oct 10, 2025 at 9:48?AM Matteo Semplice via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear all, > > I am wondering if there is a way to extract a subset of a DMDA and > use it as a mesh. The use case is to program a finite-difference method > in which the domain is defined by a levelset function: if I could > completely ignore the parts of the background DMDA that are "far away" > from the object, I guess I would avoid some cores having almost no > workload. I figure that I could setup a DMDA, load/compute the levelset > on the entire box, then mark the nodes to be retained, extract the > submesh and repartition it. I would also need a mean to transfer some > Vec data from the DMDA to the new mesh. > > I guess that the extracted mesh would then become a DMPlex and it would > not retain any DMDA flavour (like notions of which are the grid nodes > sitting on top/bottom, left/right of a given node), right? > If you are planning on extracting a Plex anyway, I think it would be easier to just start with a Cartesian Plex, instead of a DA, and use DMPlexCreateSubmesh(). Thanks, Matt > Thanks > > Matteo > > -- > Prof. Matteo Semplice > Universit? degli Studi dell?Insubria > Dipartimento di Scienza e Alta Tecnologia ? DiSAT > Professore Associato > Via Valleggio, 11 ? 22100 Como (CO) ? Italia > tel.: +39 031 2386316 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aTBS1B2YLKRIG_VAvtEwS0a40kgk7MzMmM_K5S1XBiDFyLWFu4yXTeH6Rhx4N3TyI18v_Z0k4cSxEq72EWVS$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matteo.semplice at uninsubria.it Fri Oct 10 10:14:57 2025 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Fri, 10 Oct 2025 17:14:57 +0200 Subject: [petsc-users] extract arbitrary subset of a DMDA In-Reply-To: References: Message-ID: On 10/10/2025 16:39, Matthew Knepley wrote: > On Fri, Oct 10, 2025 at 9:48?AM Matteo Semplice via petsc-users > wrote: > > Dear all, > > ???? I am wondering if there is a way to extract a subset of a > DMDA and > use it as a mesh. The use case is to program a finite-difference > method > in which the domain is defined by a levelset function: if I could > completely ignore the parts of the background DMDA that are "far > away" > from the object, I guess I would avoid some cores having almost no > workload. I figure that I could setup a DMDA, load/compute the > levelset > on the entire box, then mark the nodes to be retained, extract the > submesh and repartition it. I would also need a mean to transfer some > Vec data from the DMDA to the new mesh. > > I guess that the extracted mesh would then become a DMPlex and it > would > not retain any DMDA flavour (like notions of which are the grid nodes > sitting on top/bottom, left/right of a given node), right? > > > If you are planning on extracting a Plex anyway, I think it would be > easier to just > start with a Cartesian Plex, instead of a DA, and use > DMPlexCreateSubmesh(). Hmmm... doable, but I have a couple of questions. By Cartesian Plex you mean a Plex created by DMPlexCreateBoxMesh with simplex=false, right? And, could you point me to the routines that can perform data tranfer from Vecs associated to the DM to the ones asscoiated to the subDM? Is DMPlexGetSubpointIS the way to go? Next, I will load the levelsets from the output of another code that is DA-based and that I'd really like to reuse some code in the setup phase which relies on the DA indexing. So maybe I'd rather, create the DMDA and the associated Vecs, do the setup phase, then DMConvert the DMDA to a "large" DMPlex that covers the entire box, transfer the DA Vecs to the "large" Plex vectors and then extract the submesh. Would this be feasible? If so, can you point me to the routines to transfer the vecs from the dmda to the large plex? Thanks ??? Matteo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Oct 10 13:16:33 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 10 Oct 2025 14:16:33 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> Message-ID: Elana, Were you able to try the options below? Thanks for reporting the problem, since this is a problem others will face I have attempted to update/fix the PETSc code to make it absolutely clear when no convergence testing is done with https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8777__;!!G_uCfscf7eWS!cBKXt0ucAIZvI8HD3NFbO2mlaAdzaq8mXJAxfoKCfdA7c7UuepziIB8b5w2WRgc--hWKMr9KCgbyr_iua55hPIU$ Barry > On Oct 7, 2025, at 10:53?AM, Barry Smith wrote: > > > I have to apologize again. 
What you are doing is so out of the ordinary (but there is nothing wrong with you doing it) that I totally lost this line of code > > PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); > > Please try the following, add the options > > -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned > > Barry > > > > > > >> On Oct 7, 2025, at 4:12?AM, Moral Sanchez, Elena wrote: >> >> The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. >> >> Elena >> >> From: Mark Adams > >> Sent: 01 October 2025 13:25:14 >> To: Barry Smith >> Cc: Moral Sanchez, Elena; petsc-users >> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >> >> Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. >> >> On Tue, Sep 30, 2025 at 9:27?AM Barry Smith > wrote: >>> >>> Would you be able to share your code? I'm at a loss as to why we are seeing this behavior and can much more quickly figure it out by running the code in a debugger. >>> >>> Barry >>> >>> You can send the code petsc-maint at mcs.anl.gov if you don't want to share the code with everyone, >>> >>>> On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena > wrote: >>>> >>>> This is what I get: >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP Residual norm 2.249726733143e+00 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>> 1 KSP Residual norm 1.433120400946e+00 >>>> 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 >>>> 2 KSP Residual norm 1.169262560123e+00 >>>> 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 >>>> 3 KSP Residual norm 1.323528716607e+00 >>>> 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 >>>> 4 KSP Residual norm 5.006323254234e-01 >>>> 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 >>>> 5 KSP Residual norm 3.569836784785e-01 >>>> 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 >>>> 6 KSP Residual norm 2.493182937513e-01 >>>> 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 >>>> 7 KSP Residual norm 3.038202502298e-01 >>>> 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 >>>> 8 KSP Residual norm 2.780214194402e-01 >>>> 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 >>>> 9 KSP Residual norm 1.676826341491e-01 >>>> 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 >>>> 10 KSP Residual norm 1.209985378713e-01 >>>> 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 >>>> 11 KSP Residual norm 9.445076689969e-02 >>>> 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 >>>> 12 
KSP Residual norm 8.308555284580e-02 >>>> 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 >>>> 13 KSP Residual norm 5.472865592585e-02 >>>> 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 >>>> 14 KSP Residual norm 4.357870564398e-02 >>>> 14 KSP unpreconditioned resid norm 4.357870564398e-02 true resid norm 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 >>>> 15 KSP Residual norm 5.079681292439e-02 >>>> 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP Residual norm 5.079681292439e-02 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 >>>> 1 KSP Residual norm 2.934938644003e-02 >>>> 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 >>>> 2 KSP Residual norm 3.257065831294e-02 >>>> 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 >>>> 3 KSP Residual norm 4.143063876867e-02 >>>> 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 >>>> 4 KSP Residual norm 4.822471409489e-02 >>>> 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 >>>> 5 KSP Residual norm 3.197538246153e-02 >>>> 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 >>>> 6 KSP Residual norm 3.461217019835e-02 >>>> 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 >>>> 7 KSP Residual norm 3.410193775327e-02 >>>> 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 >>>> 8 KSP Residual norm 4.690424294464e-02 >>>> 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 >>>> 9 KSP Residual norm 3.366148892800e-02 >>>> 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 >>>> 10 KSP Residual norm 4.068015727689e-02 >>>> 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 >>>> 11 KSP Residual norm 2.658836123104e-02 >>>> 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 >>>> 12 KSP Residual norm 2.826244186003e-02 >>>> 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 >>>> 13 KSP Residual norm 2.981793619508e-02 >>>> 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 >>>> 14 KSP Residual norm 3.525455091450e-02 >>>> 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 >>>> 15 KSP Residual norm 2.331539121838e-02 >>>> 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm 
2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP Residual norm 2.421498365806e-02 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 >>>> 1 KSP Residual norm 1.761072112362e-02 >>>> 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 >>>> 2 KSP Residual norm 1.400842489042e-02 >>>> 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 >>>> 3 KSP Residual norm 1.419665483348e-02 >>>> 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 >>>> 4 KSP Residual norm 1.617590701667e-02 >>>> 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 >>>> 5 KSP Residual norm 1.354824081005e-02 >>>> 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 >>>> 6 KSP Residual norm 1.387252917475e-02 >>>> 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 >>>> 7 KSP Residual norm 1.514043102087e-02 >>>> 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 >>>> 8 KSP Residual norm 1.275811124745e-02 >>>> 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 >>>> 9 KSP Residual norm 1.241039155981e-02 >>>> 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 >>>> 10 KSP Residual norm 9.585207801652e-03 >>>> 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 >>>> 11 KSP Residual norm 9.022641230732e-03 >>>> 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 >>>> 12 KSP Residual norm 1.187709152046e-02 >>>> 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 >>>> 13 KSP Residual norm 1.084880112494e-02 >>>> 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 >>>> 14 KSP Residual norm 8.194750346781e-03 >>>> 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 >>>> 15 KSP Residual norm 7.614246199165e-03 >>>> 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Residual norms for mg_levels_1_ solve. >>>> 0 KSP Residual norm 7.614246199165e-03 >>>> Residual norms for mg_levels_1_ solve. 
>>>> 0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 >>>> 1 KSP Residual norm 5.620014684145e-03 >>>> 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 >>>> 2 KSP Residual norm 6.643368363907e-03 >>>> 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 >>>> 3 KSP Residual norm 8.708642393659e-03 >>>> 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 >>>> 4 KSP Residual norm 6.401852907459e-03 >>>> 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 >>>> 5 KSP Residual norm 7.230576215262e-03 >>>> 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 >>>> 6 KSP Residual norm 6.204081601285e-03 >>>> 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 >>>> 7 KSP Residual norm 7.038656665944e-03 >>>> 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 >>>> 8 KSP Residual norm 7.194079694050e-03 >>>> 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 >>>> 9 KSP Residual norm 6.353576889135e-03 >>>> 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 >>>> 10 KSP Residual norm 7.313589502731e-03 >>>> 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 >>>> 11 KSP Residual norm 6.643320423193e-03 >>>> 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 >>>> 12 KSP Residual norm 7.235443182108e-03 >>>> 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 >>>> 13 KSP Residual norm 4.971292307201e-03 >>>> 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 >>>> 14 KSP Residual norm 5.357933842147e-03 >>>> 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 >>>> 15 KSP Residual norm 5.841682994497e-03 >>>> 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 >>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>> Cheers, >>>> Elena >>>> From: Barry Smith > >>>> Sent: 29 September 2025 20:31:26 >>>> To: Moral Sanchez, Elena >>>> Cc: Mark Adams; petsc-users >>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>> >>>> >>>> Thanks. I missed something earlier in the KSPView >>>> >>>>>> using UNPRECONDITIONED norm type for convergence test >>>> >>>> Please add the options >>>> >>>>>>>> -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual >>>> >>>> It is using the unpreconditioned residual norms for convergence testing but we are printing the preconditioned norms. 
>>>> >>>> Barry >>>> >>>> >>>>> On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena > wrote: >>>>> >>>>> This is the output: >>>>> Residual norms for mg_levels_1_ solve. >>>>> 0 KSP Residual norm 2.249726733143e+00 >>>>> 1 KSP Residual norm 1.433120400946e+00 >>>>> 2 KSP Residual norm 1.169262560123e+00 >>>>> 3 KSP Residual norm 1.323528716607e+00 >>>>> 4 KSP Residual norm 5.006323254234e-01 >>>>> 5 KSP Residual norm 3.569836784785e-01 >>>>> 6 KSP Residual norm 2.493182937513e-01 >>>>> 7 KSP Residual norm 3.038202502298e-01 >>>>> 8 KSP Residual norm 2.780214194402e-01 >>>>> 9 KSP Residual norm 1.676826341491e-01 >>>>> 10 KSP Residual norm 1.209985378713e-01 >>>>> 11 KSP Residual norm 9.445076689969e-02 >>>>> 12 KSP Residual norm 8.308555284580e-02 >>>>> 13 KSP Residual norm 5.472865592585e-02 >>>>> 14 KSP Residual norm 4.357870564398e-02 >>>>> 15 KSP Residual norm 5.079681292439e-02 >>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>> Residual norms for mg_levels_1_ solve. >>>>> 0 KSP Residual norm 5.079681292439e-02 >>>>> 1 KSP Residual norm 2.934938644003e-02 >>>>> 2 KSP Residual norm 3.257065831294e-02 >>>>> 3 KSP Residual norm 4.143063876867e-02 >>>>> 4 KSP Residual norm 4.822471409489e-02 >>>>> 5 KSP Residual norm 3.197538246153e-02 >>>>> 6 KSP Residual norm 3.461217019835e-02 >>>>> 7 KSP Residual norm 3.410193775327e-02 >>>>> 8 KSP Residual norm 4.690424294464e-02 >>>>> 9 KSP Residual norm 3.366148892800e-02 >>>>> 10 KSP Residual norm 4.068015727689e-02 >>>>> 11 KSP Residual norm 2.658836123104e-02 >>>>> 12 KSP Residual norm 2.826244186003e-02 >>>>> 13 KSP Residual norm 2.981793619508e-02 >>>>> 14 KSP Residual norm 3.525455091450e-02 >>>>> 15 KSP Residual norm 2.331539121838e-02 >>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>> Residual norms for mg_levels_1_ solve. >>>>> 0 KSP Residual norm 2.421498365806e-02 >>>>> 1 KSP Residual norm 1.761072112362e-02 >>>>> 2 KSP Residual norm 1.400842489042e-02 >>>>> 3 KSP Residual norm 1.419665483348e-02 >>>>> 4 KSP Residual norm 1.617590701667e-02 >>>>> 5 KSP Residual norm 1.354824081005e-02 >>>>> 6 KSP Residual norm 1.387252917475e-02 >>>>> 7 KSP Residual norm 1.514043102087e-02 >>>>> 8 KSP Residual norm 1.275811124745e-02 >>>>> 9 KSP Residual norm 1.241039155981e-02 >>>>> 10 KSP Residual norm 9.585207801652e-03 >>>>> 11 KSP Residual norm 9.022641230732e-03 >>>>> 12 KSP Residual norm 1.187709152046e-02 >>>>> 13 KSP Residual norm 1.084880112494e-02 >>>>> 14 KSP Residual norm 8.194750346781e-03 >>>>> 15 KSP Residual norm 7.614246199165e-03 >>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>> Residual norms for mg_levels_1_ solve. 
>>>>> 0 KSP Residual norm 7.614246199165e-03 >>>>> 1 KSP Residual norm 5.620014684145e-03 >>>>> 2 KSP Residual norm 6.643368363907e-03 >>>>> 3 KSP Residual norm 8.708642393659e-03 >>>>> 4 KSP Residual norm 6.401852907459e-03 >>>>> 5 KSP Residual norm 7.230576215262e-03 >>>>> 6 KSP Residual norm 6.204081601285e-03 >>>>> 7 KSP Residual norm 7.038656665944e-03 >>>>> 8 KSP Residual norm 7.194079694050e-03 >>>>> 9 KSP Residual norm 6.353576889135e-03 >>>>> 10 KSP Residual norm 7.313589502731e-03 >>>>> 11 KSP Residual norm 6.643320423193e-03 >>>>> 12 KSP Residual norm 7.235443182108e-03 >>>>> 13 KSP Residual norm 4.971292307201e-03 >>>>> 14 KSP Residual norm 5.357933842147e-03 >>>>> 15 KSP Residual norm 5.841682994497e-03 >>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>> >>>>> From: Barry Smith > >>>>> Sent: 29 September 2025 15:56:33 >>>>> To: Moral Sanchez, Elena >>>>> Cc: Mark Adams; petsc-users >>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>> >>>>> >>>>> I asked you to run with >>>>> >>>>>>>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason >>>>> >>>>> you chose not to, delaying the process of understanding what is happening. >>>>> >>>>> Please run with those options and send the output. My guess is that you are computing the "residual norms" in your own monitor code, and it is doing so differently than what PETSc does, thus resulting in the appearance of a sufficiently small residual norm, whereas PETSc may not have calculated something that small. >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena > wrote: >>>>>> >>>>>> Thanks for the hint. I agree that the coarse solve should be much more "accurate". However, for the moment I am just trying to understand what the MG is doing exactly. >>>>>> >>>>>> I am puzzled to see that the fine grid smoother ("lvl 0") does not stop when the residual becomes less than 1e-1. It should converge due to the atol. >>>>>> >>>>>> From: Mark Adams > >>>>>> Sent: 29 September 2025 14:20:56 >>>>>> To: Moral Sanchez, Elena >>>>>> Cc: Barry Smith; petsc-users >>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>> >>>>>> Oh I see the coarse grid solver in your full solver output now. >>>>>> You still want an accurate coarse grid solve. Usually (the default in GAMG) you use a direct solver on one process, and cousin until the coarse grid is small enough to make that cheap. >>>>>> >>>>>> On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena > wrote: >>>>>>> Hi, I doubled the system size and changed the tolerances just to show a better example of the problem. 
This is the output of the callbacks in the first iteration: >>>>>>> CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s >>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s >>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>> MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s >>>>>>> ConvergedReason MG lvl -1: 3 >>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s >>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s >>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 4.07e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s >>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>> CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s >>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 
9.02e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s >>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s >>>>>>> ConvergedReason MG lvl -1: 3 >>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s >>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s >>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s >>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>> CG ConvergedReason: -3 >>>>>>> >>>>>>> For completeness, I add here the -ksp_view of the whole solver: >>>>>>> KSP Object: 1 MPI process >>>>>>> type: cg >>>>>>> variant HERMITIAN >>>>>>> maximum iterations=1, nonzero initial guess >>>>>>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
>>>>>>> left preconditioning >>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: mg >>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>> Cycles per PCApply=1 >>>>>>> Not using Galerkin computed coarse grid matrices >>>>>>> Coarse grid solver -- level 0 ------------------------------- >>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>> type: cg >>>>>>> variant HERMITIAN >>>>>>> maximum iterations=15, nonzero initial guess >>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>> left preconditioning >>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: python >>>>>>> rows=524, cols=524 >>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>> type: cg >>>>>>> variant HERMITIAN >>>>>>> maximum iterations=15, nonzero initial guess >>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>> left preconditioning >>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: python >>>>>>> rows=884, cols=884 >>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: python >>>>>>> rows=884, cols=884 >>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>> >>>>>>> Regarding Mark's Email: What do you mean with "the whole solver doesn't have a coarse grid"? I am using my own Restriction and Interpolation operators. >>>>>>> Thanks for the help, >>>>>>> Elena >>>>>>> >>>>>>> From: Mark Adams > >>>>>>> Sent: 28 September 2025 20:13:54 >>>>>>> To: Barry Smith >>>>>>> Cc: Moral Sanchez, Elena; petsc-users >>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>>> >>>>>>> Not sure why your "whole"solver does not have a coarse grid but this is wrong: >>>>>>> >>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>> type: cg >>>>>>>> variant HERMITIAN >>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>> >>>>>>>> The coarse grid has to be accurate. The defaults are a good place to start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) >>>>>>> >>>>>>> On Fri, Sep 26, 2025 at 3:21?PM Barry Smith > wrote: >>>>>>>> Looks reasonable. 
Send the output running with >>>>>>>> >>>>>>>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason >>>>>>>> >>>>>>>>> On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena > wrote: >>>>>>>>> >>>>>>>>> Dear Barry, >>>>>>>>> >>>>>>>>> This is -ksp_view for the smoother at the finest level: >>>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=10, nonzero initial guess >>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=524, cols=524 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> And at the coarsest level: >>>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=344, cols=344 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> And for the whole solver: >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=100, nonzero initial guess >>>>>>>>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: mg >>>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>>>> Cycles per PCApply=1 >>>>>>>>> Not using Galerkin computed coarse grid matrices >>>>>>>>> Coarse grid solver -- level 0 ------------------------------- >>>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=344, cols=344 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=10, nonzero initial guess >>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=524, cols=524 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=524, cols=524 
>>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> Best, >>>>>>>>> Elena >>>>>>>>> >>>>>>>>> >>>>>>>>> From: Barry Smith > >>>>>>>>> Sent: 26 September 2025 19:05:02 >>>>>>>>> To: Moral Sanchez, Elena >>>>>>>>> Cc: petsc-users at mcs.anl.gov >>>>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>>>>> >>>>>>>>> >>>>>>>>> Send the output using -ksp_view >>>>>>>>> >>>>>>>>> Normally one uses a fixed number of iterations of smoothing on level with multigrid rather than a tolerance, but yes PETSc should respect such a tolerance. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Sep 26, 2025, at 12:49?PM, Moral Sanchez, Elena > wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> I am using multigrid (multiplicative) as a preconditioner with a V-cycle of two levels. At each level, I am setting CG as the smoother with certain tolerance. >>>>>>>>>> >>>>>>>>>> What I observe is that in the finest level the CG continues iterating after the residual norm reaches the tolerance (atol) and it only stops when reaching the maximum number of iterations at that level. At the coarsest level this does not occur and the CG stops when the tolerance is reached. >>>>>>>>>> >>>>>>>>>> I double-checked that the smoother at the finest level has the right tolerance. And I am using a Monitor function to track the residual. >>>>>>>>>> >>>>>>>>>> Do you know how to make the smoother at the finest level stop when reaching the tolerance? >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Elena. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Oct 11 16:19:05 2025 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 11 Oct 2025 17:19:05 -0400 Subject: [petsc-users] extract arbitrary subset of a DMDA In-Reply-To: References: Message-ID: On Fri, Oct 10, 2025 at 11:15?AM Matteo Semplice < matteo.semplice at uninsubria.it> wrote: > On 10/10/2025 16:39, Matthew Knepley wrote: > > On Fri, Oct 10, 2025 at 9:48?AM Matteo Semplice via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear all, >> >> I am wondering if there is a way to extract a subset of a DMDA and >> use it as a mesh. The use case is to program a finite-difference method >> in which the domain is defined by a levelset function: if I could >> completely ignore the parts of the background DMDA that are "far away" >> from the object, I guess I would avoid some cores having almost no >> workload. I figure that I could setup a DMDA, load/compute the levelset >> on the entire box, then mark the nodes to be retained, extract the >> submesh and repartition it. I would also need a mean to transfer some >> Vec data from the DMDA to the new mesh. >> >> I guess that the extracted mesh would then become a DMPlex and it would >> not retain any DMDA flavour (like notions of which are the grid nodes >> sitting on top/bottom, left/right of a given node), right? >> > > If you are planning on extracting a Plex anyway, I think it would be > easier to just > start with a Cartesian Plex, instead of a DA, and use > DMPlexCreateSubmesh(). > > Hmmm... doable, but I have a couple of questions. > > By Cartesian Plex you mean a Plex created by DMPlexCreateBoxMesh with > simplex=false, right? > Yes, exactly > And, could you point me to the routines that can perform data tranfer from > Vecs associated to the DM to the ones asscoiated to the subDM? Is > DMPlexGetSubpointIS the way to go? > Yes. 
You get the IS and then you can use https://urldefense.us/v3/__https://petsc.org/main/manualpages/Vec/VecISCopy/__;!!G_uCfscf7eWS!ZuU3KyjiAxEurYfyBPAqLwse3wiBnKnWg7CgVwR0lJDOM2Rp3loC13vFnsgCTd7FcZMhSe0CAY8ingyIUWji$ > Next, I will load the levelsets from the output of another code that is > DA-based and that I'd really like to reuse some code in the setup phase > which relies on the DA indexing. So maybe I'd rather, create the DMDA and > the associated Vecs, do the setup phase, then DMConvert the DMDA to a > "large" DMPlex that covers the entire box, transfer the DA Vecs to the > "large" Plex vectors and then extract the submesh. Would this be feasible? > If so, can you point me to the routines to transfer the vecs from the dmda > to the large plex? > Yes, this sounds doable, and once we write it, we should just put it in the library. In serial this is completely trivial. The DA has a known ordering and the Plex has a known ordering. We can use an AO or just a VecScatter to permute the Vec. In parallel, the DA is partitioned geometrically, whereas the Plex, by default, is partitioned using a graph partitioner, like ParMetis. To me, if this project is important, it seems worth it to build a simple partitioner for the Plex that mimics the DA. Then we are back to the trivial remapping. Finally, we probably want the Plex dual to the DA due to the way that DAs partition things. Thanks, Matt > Thanks > > Matteo > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZuU3KyjiAxEurYfyBPAqLwse3wiBnKnWg7CgVwR0lJDOM2Rp3loC13vFnsgCTd7FcZMhSe0CAY8inhPGfPNx$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel.salazar at corintis.com Mon Oct 13 01:52:12 2025 From: miguel.salazar at corintis.com (Miguel Salazar) Date: Mon, 13 Oct 2025 06:52:12 +0000 Subject: [petsc-users] Solver/Preconditioner for advection diffusion equation References: <0e58dc19-e943-4d42-9a05-4cabbcbc8185.d0d10b7c-e93e-4ca5-8394-d4339a327c55.3c5a0b5e-a722-4f6c-b0c9-6ffb2aef105d@emailsignatures365.codetwo.com> Message-ID: Hi, Is there an off-the-shelf solver or preconditioner in PETSc to handle advection-diffusion equations with large Peclet numbers (Pe > 2000)? The equations are solved with the FEM and ideally, one could just use the solver through the PETSc settings without having to set up any mesh hierarchy or partitioning (no geometric multigrid or domain decomposition). I read that the Approximate Ideal Restriction method is appropriate for this kind of problems, but the documentation is scarce. Any pointers are welcome. Thanks, Miguel MIGUEL ANGEL SALAZAR DE TROYA Head of Software Engineering miguel.salazar at corintis.com Corintis SA EPFL Innovation Park Building C 1015 Lausanne [https://urldefense.us/v3/__https://storcor.s3.eu-central-1.amazonaws.com/logos/Logo-black.png__;!!G_uCfscf7eWS!ffVhBy46zB5AmzYedi4T90u57oUykOFrmwLZYb73Ui8o5yP_F4JlTA49SFau-ljj-92S4CrYA30sIJI2MR2FTJbt0oZDr8V1$ ] Here at Corintis we care for your privacy. That is why we have taken appropriate measures to ensure that the data you have provided to us is always secure -------------- next part -------------- An HTML attachment was scrubbed... 
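[To make the serial remapping mentioned above concrete: once one has, by whatever means, an array giving for each DA entry its position in the Plex-ordered vector, the permutation itself is a single VecScatter. The map used below is a made-up toy example; building the real map between the two numberings is the actual work. A petsc4py sketch for one process:]

import numpy as np
from petsc4py import PETSc

# toy stand-ins: an 8-entry "DA-ordered" vector and an empty "Plex-ordered" one
v_da = PETSc.Vec().createSeq(8)
v_da.setValues(np.arange(8, dtype=PETSc.IntType), np.arange(8.0))
v_da.assemble()
v_plex = PETSc.Vec().createSeq(8)

# hypothetical map: da_to_plex[i] = position of DA entry i in the Plex ordering
da_to_plex = np.array([0, 4, 1, 5, 2, 6, 3, 7], dtype=PETSc.IntType)

src = PETSc.IS().createStride(8, first=0, step=1, comm=PETSc.COMM_SELF)
dst = PETSc.IS().createGeneral(da_to_plex, comm=PETSc.COMM_SELF)
perm = PETSc.Scatter().create(v_da, src, v_plex, dst)
perm.scatter(v_da, v_plex, mode=PETSc.ScatterMode.FORWARD)   # v_plex = permuted v_da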
URL: From knepley at gmail.com Mon Oct 13 07:19:03 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 13 Oct 2025 08:19:03 -0400 Subject: [petsc-users] Solver/Preconditioner for advection diffusion equation In-Reply-To: References: <0e58dc19-e943-4d42-9a05-4cabbcbc8185.d0d10b7c-e93e-4ca5-8394-d4339a327c55.3c5a0b5e-a722-4f6c-b0c9-6ffb2aef105d@emailsignatures365.codetwo.com> Message-ID: On Mon, Oct 13, 2025 at 2:52?AM Miguel Salazar wrote: > Hi, > > > > Is there an off-the-shelf solver or preconditioner in PETSc to handle > advection-diffusion equations with large Peclet numbers (Pe > 2000)? The > equations are solved with the FEM and ideally, one could just use the > solver through the PETSc settings without having to set up any mesh > hierarchy or partitioning (no geometric multigrid or domain decomposition). > I read that the Approximate Ideal Restriction method is appropriate for > this kind of problems, but the documentation is scarce. Any pointers are > welcome. > Hi Miguel, We are in the process of integrating PFLARE ( https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8776__;!!G_uCfscf7eWS!ZJ-UU8WRuA_VDt1Dm2hWiknuosHNCBYsXeE5bRhQvYUKO5f9hYqWqO0EPybEBT8sF5mV_wS98QMoE9AUh7nc$ ), which is an AIR package from Steve Dargaville at Imperial. Hopefully in the next few days. You could try out the branch until then and let us know if you have problems. Thanks, Matt > Thanks, > > Miguel > > > > > *MIGUEL ANGEL SALAZAR DE TROYA*Head of Software Engineering > *miguel.salazar at corintis.com* > > Corintis SA > EPFL Innovation Park Building C > 1015 Lausanne > > > > [image: Logo-black.png] > Here at Corintis we care for your privacy. That is why we have taken > appropriate measures to ensure that the data you have provided to us is > always secure > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZJ-UU8WRuA_VDt1Dm2hWiknuosHNCBYsXeE5bRhQvYUKO5f9hYqWqO0EPybEBT8sF5mV_wS98QMoE-9w9dE-$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Mon Oct 13 10:17:39 2025 From: matteo.semplice at uninsubria.it (Semplice Matteo) Date: Mon, 13 Oct 2025 15:17:39 +0000 Subject: [petsc-users] extract arbitrary subset of a DMDA In-Reply-To: References: Message-ID: Inviato da Outlook per Android ________________________________ Da: Matthew Knepley Inviato: Sabato, Ottobre 11, 2025 11:19:23 PM A: Semplice Matteo Cc: PETSc Oggetto: Re: [petsc-users] extract arbitrary subset of a DMDA On Fri, Oct 10, 2025 at 11:15?AM Matteo Semplice > wrote: On 10/10/2025 16:39, Matthew Knepley wrote: On Fri, Oct 10, 2025 at 9:48?AM Matteo Semplice via petsc-users > wrote: Dear all, I am wondering if there is a way to extract a subset of a DMDA and use it as a mesh. The use case is to program a finite-difference method in which the domain is defined by a levelset function: if I could completely ignore the parts of the background DMDA that are "far away" from the object, I guess I would avoid some cores having almost no workload. I figure that I could setup a DMDA, load/compute the levelset on the entire box, then mark the nodes to be retained, extract the submesh and repartition it. I would also need a mean to transfer some Vec data from the DMDA to the new mesh. 
I guess that the extracted mesh would then become a DMPlex and it would not retain any DMDA flavour (like notions of which are the grid nodes sitting on top/bottom, left/right of a given node), right? If you are planning on extracting a Plex anyway, I think it would be easier to just start with a Cartesian Plex, instead of a DA, and use DMPlexCreateSubmesh(). Hmmm... doable, but I have a couple of questions. By Cartesian Plex you mean a Plex created by DMPlexCreateBoxMesh with simplex=false, right? Yes, exactly And, could you point me to the routines that can perform data tranfer from Vecs associated to the DM to the ones asscoiated to the subDM? Is DMPlexGetSubpointIS the way to go? Yes. You get the IS and then you can use https://urldefense.us/v3/__https://petsc.org/main/manualpages/Vec/VecISCopy/__;!!G_uCfscf7eWS!fqtSeAyod1b2Xvs6xIj79ibX2IOgMwkxFyGY_V3CUj0CvphYw7SOJfqGyHEwfjbE1bLwzgUgz8wZRvDWCT0gByX2LMcN2kP-fgVdgg$ Next, I will load the levelsets from the output of another code that is DA-based and that I'd really like to reuse some code in the setup phase which relies on the DA indexing. So maybe I'd rather, create the DMDA and the associated Vecs, do the setup phase, then DMConvert the DMDA to a "large" DMPlex that covers the entire box, transfer the DA Vecs to the "large" Plex vectors and then extract the submesh. Would this be feasible? If so, can you point me to the routines to transfer the vecs from the dmda to the large plex? Yes, this sounds doable, and once we write it, we should just put it in the library. In serial this is completely trivial. The DA has a known ordering and the Plex has a known ordering. We can use an AO or just a VecScatter to permute the Vec. In parallel, the DA is partitioned geometrically, whereas the Plex, by default, is partitioned using a graph partitioner, like ParMetis. To me, if this project is important, it seems worth it to build a simple partitioner for the Plex that mimics the DA. Then we are back to the trivial remapping. Finally, we probably want the Plex dual to the DA due to the way that DAs partition things. ____ Hi Matt, doing some research on parallel Plex I found this thread on the list from 2020 https://lists.mcs.anl.gov/pipermail/petsc-users/2020-August/041934.html I will explore this way to create a basic cartesian Plex from my distributed da. I think that there are enough info there, but I will come back if I run into troubles. Or do you suggest that I should instead study the private methods used in DMPlexCreateBoxMesh? Thanks Matteo -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 13 10:29:52 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 13 Oct 2025 11:29:52 -0400 Subject: [petsc-users] extract arbitrary subset of a DMDA In-Reply-To: References: Message-ID: On Mon, Oct 13, 2025 at 11:17?AM Semplice Matteo < matteo.semplice at uninsubria.it> wrote: > > > Inviato da Outlook per Android > > ------------------------------ > *Da:* Matthew Knepley > *Inviato:* Sabato, Ottobre 11, 2025 11:19:23 PM > *A:* Semplice Matteo > *Cc:* PETSc > *Oggetto:* Re: [petsc-users] extract arbitrary subset of a DMDA > > On Fri, Oct 10, 2025 at 11:15?AM Matteo Semplice < > matteo.semplice at uninsubria.it> wrote: > > On 10/10/2025 16:39, Matthew Knepley wrote: > > On Fri, Oct 10, 2025 at 9:48?AM Matteo Semplice via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Dear all, > > I am wondering if there is a way to extract a subset of a DMDA and > use it as a mesh. 
The use case is to program a finite-difference method > in which the domain is defined by a levelset function: if I could > completely ignore the parts of the background DMDA that are "far away" > from the object, I guess I would avoid some cores having almost no > workload. I figure that I could setup a DMDA, load/compute the levelset > on the entire box, then mark the nodes to be retained, extract the > submesh and repartition it. I would also need a mean to transfer some > Vec data from the DMDA to the new mesh. > > I guess that the extracted mesh would then become a DMPlex and it would > not retain any DMDA flavour (like notions of which are the grid nodes > sitting on top/bottom, left/right of a given node), right? > > > If you are planning on extracting a Plex anyway, I think it would be > easier to just > start with a Cartesian Plex, instead of a DA, and use > DMPlexCreateSubmesh(). > > Hmmm... doable, but I have a couple of questions. > > By Cartesian Plex you mean a Plex created by DMPlexCreateBoxMesh with > simplex=false, right? > > Yes, exactly > > And, could you point me to the routines that can perform data tranfer from > Vecs associated to the DM to the ones asscoiated to the subDM? Is > DMPlexGetSubpointIS the way to go? > > Yes. You get the IS and then you can use > > https://urldefense.us/v3/__https://petsc.org/main/manualpages/Vec/VecISCopy/__;!!G_uCfscf7eWS!bufWgyONsvb66zmOv6LgtJsh0Yr4dFXIovj8jn6nt7Ugtd0-U5Tthdyj_F0vRS7qG9mtw4BzudiTCi5Y9fxK$ > > Next, I will load the levelsets from the output of another code that is > DA-based and that I'd really like to reuse some code in the setup phase > which relies on the DA indexing. So maybe I'd rather, create the DMDA and > the associated Vecs, do the setup phase, then DMConvert the DMDA to a > "large" DMPlex that covers the entire box, transfer the DA Vecs to the > "large" Plex vectors and then extract the submesh. Would this be feasible? > If so, can you point me to the routines to transfer the vecs from the dmda > to the large plex? > > Yes, this sounds doable, and once we write it, we should just put it in > the library. > > In serial this is completely trivial. The DA has a known ordering and the > Plex has a known ordering. We can use an AO > or just a VecScatter to permute the Vec. In parallel, the DA is > partitioned geometrically, whereas the Plex, by default, is partitioned > using a graph partitioner, like ParMetis. To me, if this project is > important, it seems worth it to build a simple partitioner for the Plex > that mimics the DA. Then we are back to the trivial remapping. Finally, we > probably want the Plex dual to the DA due to the way that DAs partition > things. > > ____ > Hi Matt, > doing some research on parallel Plex I found this thread on the list > from 2020 > > https://lists.mcs.anl.gov/pipermail/petsc-users/2020-August/041934.html > > I will explore this way to create a basic cartesian Plex from my > distributed da. I think that there are enough info there, but I will come > back if I run into troubles. > Sure this is doable. Creating the SF is morally the same thing as creating the partition I was asking for with CreateBoxMesh. As I mentioned at the end of the last post, DA partitions "vertices" and shares "edges", which is not how we normally think in finite element land. If you are using the Plex for finite difference, then I would just match the DA. If you are using it for finite elements, I would create the dual mesh instead. 
Thanks, Matt > Or do you suggest that I should instead study the private methods used in > DMPlexCreateBoxMesh? > > Thanks > Matteo > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bufWgyONsvb66zmOv6LgtJsh0Yr4dFXIovj8jn6nt7Ugtd0-U5Tthdyj_F0vRS7qG9mtw4BzudiTCoQl4j2n$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Oct 14 09:03:22 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 14 Oct 2025 14:03:22 +0000 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> , Message-ID: <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> Dear Barry, Sorry for the delay in my answer. I am using petsc4py. I tried to find an equivalent KSP method to replicate your line: PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); I tried to use KSP.setConvergenceTest ( https://urldefense.us/v3/__https://petsc.org/main/petsc4py/reference/petsc4py.PETSc.KSP.html*petsc4py.PETSc.KSP.setConvergenceTest__;Iw!!G_uCfscf7eWS!Zxdpnb4L9QkJUCsuoqXVnZ25qEV4PdyJ0nC2HkFbzqden5Sf4lzzlHJWRyXCYcEgiDsp9xN6NtRSmnO7n-F_EvXIpkMqg7ab-Gu8$ ). The first argument must be a callable but I am uncertain at what to pass to replicate KSPConvergedSkip. Cheers, Elena ________________________________ From: Barry Smith Sent: 10 October 2025 20:16:33 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Elana, Were you able to try the options below? Thanks for reporting the problem, since this is a problem others will face I have attempted to update/fix the PETSc code to make it absolutely clear when no convergence testing is done with https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8777__;!!G_uCfscf7eWS!Zxdpnb4L9QkJUCsuoqXVnZ25qEV4PdyJ0nC2HkFbzqden5Sf4lzzlHJWRyXCYcEgiDsp9xN6NtRSmnO7n-F_EvXIpkMqg4x-xP43$ Barry On Oct 7, 2025, at 10:53?AM, Barry Smith wrote: I have to apologize again. What you are doing is so out of the ordinary (but there is nothing wrong with you doing it) that I totally lost this line of code PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); Please try the following, add the options -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned Barry On Oct 7, 2025, at 4:12?AM, Moral Sanchez, Elena wrote: The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. Elena ________________________________ From: Mark Adams > Sent: 01 October 2025 13:25:14 To: Barry Smith Cc: Moral Sanchez, Elena; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. On Tue, Sep 30, 2025 at 9:27?AM Barry Smith > wrote: Would you be able to share your code? 
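[Putting the suggestions in this thread together, here is a petsc4py sketch of the marking step on a Cartesian (tensor-cell) Plex: build the box mesh, walk the cells, evaluate the level set at each centroid, and record the cells to keep in a DMLabel. The level set is a placeholder circle, and the sketch assumes your petsc4py build wraps DMPlex.computeCellGeometryFVM; the extraction itself (DMPlexCreateSubmesh()/DMPlexFilter() at the C level) is left out because its argument list differs between releases, so check the man pages of your build.]

from petsc4py import PETSc

plex = PETSc.DMPlex().createBoxMesh([64, 64], lower=(0.0, 0.0), upper=(1.0, 1.0),
                                    simplex=False)           # quadrilateral cells

def phi(x):                                                   # placeholder level set: a circle
    return (x[0] - 0.5)**2 + (x[1] - 0.5)**2 - 0.3**2

plex.createLabel("keep")
cStart, cEnd = plex.getHeightStratum(0)                       # height 0 = cells
for c in range(cStart, cEnd):
    _vol, centroid, _normal = plex.computeCellGeometryFVM(c)
    if phi(centroid) < 0.0:                                   # cell centroid inside the object
        plex.setLabelValue("keep", c, 1)
# next step (not shown): extract the labelled cells into a new DM and repartition it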
I'm at a loss as to why we are seeing this behavior and can much more quickly figure it out by running the code in a debugger. Barry You can send the code petsc-maint at mcs.anl.gov if you don't want to share the code with everyone, On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena > wrote: This is what I get: Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.249726733143e+00 Residual norms for mg_levels_1_ solve. 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.433120400946e+00 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 2 KSP Residual norm 1.169262560123e+00 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 3 KSP Residual norm 1.323528716607e+00 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 4 KSP Residual norm 5.006323254234e-01 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 5 KSP Residual norm 3.569836784785e-01 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 6 KSP Residual norm 2.493182937513e-01 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 7 KSP Residual norm 3.038202502298e-01 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 8 KSP Residual norm 2.780214194402e-01 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 9 KSP Residual norm 1.676826341491e-01 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 10 KSP Residual norm 1.209985378713e-01 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 11 KSP Residual norm 9.445076689969e-02 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 12 KSP Residual norm 8.308555284580e-02 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 13 KSP Residual norm 5.472865592585e-02 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 14 KSP Residual norm 4.357870564398e-02 14 KSP unpreconditioned resid norm 4.357870564398e-02 true resid norm 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 15 KSP Residual norm 5.079681292439e-02 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 5.079681292439e-02 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 1 KSP Residual norm 2.934938644003e-02 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 2 KSP Residual norm 3.257065831294e-02 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 3 KSP Residual norm 4.143063876867e-02 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 4 KSP Residual norm 4.822471409489e-02 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 5 KSP Residual norm 3.197538246153e-02 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 6 KSP Residual norm 3.461217019835e-02 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 7 KSP Residual norm 3.410193775327e-02 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 8 KSP Residual norm 4.690424294464e-02 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 9 KSP Residual norm 3.366148892800e-02 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 10 KSP Residual norm 4.068015727689e-02 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 11 KSP Residual norm 2.658836123104e-02 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 12 KSP Residual norm 2.826244186003e-02 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 13 KSP Residual norm 2.981793619508e-02 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 14 KSP Residual norm 3.525455091450e-02 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 15 KSP Residual norm 2.331539121838e-02 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm 2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.421498365806e-02 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.761072112362e-02 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 2 KSP Residual norm 1.400842489042e-02 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 3 KSP Residual norm 1.419665483348e-02 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 4 KSP Residual norm 1.617590701667e-02 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 5 KSP Residual norm 1.354824081005e-02 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 6 KSP Residual norm 1.387252917475e-02 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 7 KSP Residual norm 1.514043102087e-02 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 8 KSP Residual norm 1.275811124745e-02 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 9 KSP Residual norm 1.241039155981e-02 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 10 KSP Residual norm 9.585207801652e-03 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 11 KSP Residual norm 9.022641230732e-03 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 12 KSP Residual norm 1.187709152046e-02 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 13 KSP Residual norm 1.084880112494e-02 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 14 KSP Residual norm 8.194750346781e-03 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 15 KSP Residual norm 7.614246199165e-03 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 7.614246199165e-03 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 1 KSP Residual norm 5.620014684145e-03 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 2 KSP Residual norm 6.643368363907e-03 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 3 KSP Residual norm 8.708642393659e-03 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 4 KSP Residual norm 6.401852907459e-03 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 5 KSP Residual norm 7.230576215262e-03 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 6 KSP Residual norm 6.204081601285e-03 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 7 KSP Residual norm 7.038656665944e-03 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 8 KSP Residual norm 7.194079694050e-03 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 9 KSP Residual norm 6.353576889135e-03 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 10 KSP Residual norm 7.313589502731e-03 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 11 KSP Residual norm 6.643320423193e-03 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 12 KSP Residual norm 7.235443182108e-03 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 13 KSP Residual norm 4.971292307201e-03 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 14 KSP Residual norm 5.357933842147e-03 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 15 KSP Residual norm 5.841682994497e-03 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Cheers, Elena ________________________________ From: Barry Smith > Sent: 29 September 2025 20:31:26 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Thanks. I missed something earlier in the KSPView using UNPRECONDITIONED norm type for convergence test Please add the options -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual It is using the unpreconditioned residual norms for convergence testing but we are printing the preconditioned norms. Barry On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena > wrote: This is the output: Residual norms for mg_levels_1_ solve. 
0 KSP Residual norm 2.249726733143e+00 1 KSP Residual norm 1.433120400946e+00 2 KSP Residual norm 1.169262560123e+00 3 KSP Residual norm 1.323528716607e+00 4 KSP Residual norm 5.006323254234e-01 5 KSP Residual norm 3.569836784785e-01 6 KSP Residual norm 2.493182937513e-01 7 KSP Residual norm 3.038202502298e-01 8 KSP Residual norm 2.780214194402e-01 9 KSP Residual norm 1.676826341491e-01 10 KSP Residual norm 1.209985378713e-01 11 KSP Residual norm 9.445076689969e-02 12 KSP Residual norm 8.308555284580e-02 13 KSP Residual norm 5.472865592585e-02 14 KSP Residual norm 4.357870564398e-02 15 KSP Residual norm 5.079681292439e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 5.079681292439e-02 1 KSP Residual norm 2.934938644003e-02 2 KSP Residual norm 3.257065831294e-02 3 KSP Residual norm 4.143063876867e-02 4 KSP Residual norm 4.822471409489e-02 5 KSP Residual norm 3.197538246153e-02 6 KSP Residual norm 3.461217019835e-02 7 KSP Residual norm 3.410193775327e-02 8 KSP Residual norm 4.690424294464e-02 9 KSP Residual norm 3.366148892800e-02 10 KSP Residual norm 4.068015727689e-02 11 KSP Residual norm 2.658836123104e-02 12 KSP Residual norm 2.826244186003e-02 13 KSP Residual norm 2.981793619508e-02 14 KSP Residual norm 3.525455091450e-02 15 KSP Residual norm 2.331539121838e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.421498365806e-02 1 KSP Residual norm 1.761072112362e-02 2 KSP Residual norm 1.400842489042e-02 3 KSP Residual norm 1.419665483348e-02 4 KSP Residual norm 1.617590701667e-02 5 KSP Residual norm 1.354824081005e-02 6 KSP Residual norm 1.387252917475e-02 7 KSP Residual norm 1.514043102087e-02 8 KSP Residual norm 1.275811124745e-02 9 KSP Residual norm 1.241039155981e-02 10 KSP Residual norm 9.585207801652e-03 11 KSP Residual norm 9.022641230732e-03 12 KSP Residual norm 1.187709152046e-02 13 KSP Residual norm 1.084880112494e-02 14 KSP Residual norm 8.194750346781e-03 15 KSP Residual norm 7.614246199165e-03 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 7.614246199165e-03 1 KSP Residual norm 5.620014684145e-03 2 KSP Residual norm 6.643368363907e-03 3 KSP Residual norm 8.708642393659e-03 4 KSP Residual norm 6.401852907459e-03 5 KSP Residual norm 7.230576215262e-03 6 KSP Residual norm 6.204081601285e-03 7 KSP Residual norm 7.038656665944e-03 8 KSP Residual norm 7.194079694050e-03 9 KSP Residual norm 6.353576889135e-03 10 KSP Residual norm 7.313589502731e-03 11 KSP Residual norm 6.643320423193e-03 12 KSP Residual norm 7.235443182108e-03 13 KSP Residual norm 4.971292307201e-03 14 KSP Residual norm 5.357933842147e-03 15 KSP Residual norm 5.841682994497e-03 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 ________________________________ From: Barry Smith > Sent: 29 September 2025 15:56:33 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level I asked you to run with -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason you chose not to, delaying the process of understanding what is happening. Please run with those options and send the output. 
My guess is that you are computing the "residual norms" in your own monitor code, and it is doing so differently than what PETSc does, thus resulting in the appearance of a sufficiently small residual norm, whereas PETSc may not have calculated something that small. Barry On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena > wrote: Thanks for the hint. I agree that the coarse solve should be much more "accurate". However, for the moment I am just trying to understand what the MG is doing exactly. I am puzzled to see that the fine grid smoother ("lvl 0") does not stop when the residual becomes less than 1e-1. It should converge due to the atol. ________________________________ From: Mark Adams > Sent: 29 September 2025 14:20:56 To: Moral Sanchez, Elena Cc: Barry Smith; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Oh I see the coarse grid solver in your full solver output now. You still want an accurate coarse grid solve. Usually (the default in GAMG) you use a direct solver on one process, and cousin until the coarse grid is small enough to make that cheap. On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena > wrote: Hi, I doubled the system size and changed the tolerances just to show a better example of the problem. This is the output of the callbacks in the first iteration: CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 4.07e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s MG 
lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 4 CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 9.02e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 CG ConvergedReason: -3 For completeness, I add here the -ksp_view of the whole solver: KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=1, nonzero initial guess tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: mg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level 0 ------------------------------- KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=15, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=15, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=884, cols=884 Python: Solver_petsc.LeastSquaresOperator Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=884, cols=884 Python: Solver_petsc.LeastSquaresOperator Regarding Mark's Email: What do you mean with "the whole solver doesn't have a coarse grid"? I am using my own Restriction and Interpolation operators. Thanks for the help, Elena ________________________________ From: Mark Adams > Sent: 28 September 2025 20:13:54 To: Barry Smith Cc: Moral Sanchez, Elena; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Not sure why your "whole"solver does not have a coarse grid but this is wrong: KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 The coarse grid has to be accurate. The defaults are a good place to start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) On Fri, Sep 26, 2025 at 3:21?PM Barry Smith > wrote: Looks reasonable. 
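As an aside on Mark's point about the coarse solve: the level-0 KSP reads options with the mg_coarse_ prefix shown in the -ksp_view output above, so it can be tightened from the options database. Below is a minimal petsc4py sketch with example values only; since the coarse operator here is a python shell matrix, a direct coarse solver is not readily available, so a tight Krylov solve stands in for it.

    from petsc4py import PETSc

    opts = PETSc.Options()
    # Example values only: make the coarse (level-0) solve much more accurate
    opts["mg_coarse_ksp_rtol"]   = 1e-8
    opts["mg_coarse_ksp_atol"]   = 1e-30
    opts["mg_coarse_ksp_max_it"] = 1000
    # These take effect only if setFromOptions() runs on the outer KSP afterwards.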
Send the output running with -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena > wrote: Dear Barry, This is -ksp_view for the smoother at the finest level: KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=10, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator And at the coarsest level: KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=344, cols=344 Python: Solver_petsc.LeastSquaresOperator And for the whole solver: KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=100, nonzero initial guess tolerances: relative=1e-08, absolute=1e-09, divergence=10000. left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: mg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level 0 ------------------------------- KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=344, cols=344 Python: Solver_petsc.LeastSquaresOperator Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=10, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Best, Elena ________________________________ From: Barry Smith > Sent: 26 September 2025 19:05:02 To: Moral Sanchez, Elena Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Send the output using -ksp_view Normally one uses a fixed number of iterations of smoothing on level with multigrid rather than a tolerance, but yes PETSc should respect such a tolerance. Barry On Sep 26, 2025, at 12:49?PM, Moral Sanchez, Elena > wrote: Hi, I am using multigrid (multiplicative) as a preconditioner with a V-cycle of two levels. At each level, I am setting CG as the smoother with certain tolerance. 
What I observe is that in the finest level the CG continues iterating after the residual norm reaches the tolerance (atol) and it only stops when reaching the maximum number of iterations at that level. At the coarsest level this does not occur and the CG stops when the tolerance is reached. I double-checked that the smoother at the finest level has the right tolerance. And I am using a Monitor function to track the residual. Do you know how to make the smoother at the finest level stop when reaching the tolerance? Cheers, Elena. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Oct 14 10:33:42 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 14 Oct 2025 11:33:42 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> Message-ID: <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> Sorry, I was not clear, use the options > -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned don't use the KSPSetConvergenceTest(). Barry > On Oct 14, 2025, at 10:03?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > > Sorry for the delay in my answer. > > I am using petsc4py. I tried to find an equivalent KSP method to replicate your line: > > PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); > > I tried to use KSP.setConvergenceTest ( https://urldefense.us/v3/__https://petsc.org/main/petsc4py/reference/petsc4py.PETSc.KSP.html*petsc4py.PETSc.KSP.setConvergenceTest__;Iw!!G_uCfscf7eWS!YOJjSXBY9qJ7RnEShgkmeSYJqQzpFtH59ZGqFNTlyVNh4UIOd3SXpp0hjMoFbmbYsbT4nZSpUdwlTGwORrxpmb4$ ). The first argument must be a callable but I am uncertain at what to pass to replicate KSPConvergedSkip. > > Cheers, > Elena > > From: Barry Smith > > Sent: 10 October 2025 20:16:33 > To: Moral Sanchez, Elena > Cc: Mark Adams; petsc-users > Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level > > > Elana, > > Were you able to try the options below? > > Thanks for reporting the problem, since this is a problem others will face I have attempted to update/fix the PETSc code to make it absolutely clear when no convergence testing is done with https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8777__;!!G_uCfscf7eWS!YOJjSXBY9qJ7RnEShgkmeSYJqQzpFtH59ZGqFNTlyVNh4UIOd3SXpp0hjMoFbmbYsbT4nZSpUdwlTGwO-4MU5sQ$ > > Barry > > >> On Oct 7, 2025, at 10:53?AM, Barry Smith > wrote: >> >> >> I have to apologize again. What you are doing is so out of the ordinary (but there is nothing wrong with you doing it) that I totally lost this line of code >> >> PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); >> >> Please try the following, add the options >> >> -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned >> >> Barry >> >> >> >> >> >> >>> On Oct 7, 2025, at 4:12?AM, Moral Sanchez, Elena > wrote: >>> >>> The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. 
>>> >>> Elena >>> [...] -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Oct 14 11:45:44 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 14 Oct 2025 16:45:44 +0000 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de>, <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> Message-ID: <98106a9a5fbb4129b9d0aa8721805b18@ipp.mpg.de> Unfortunately it did not print anything. Maybe something else is missing? Elena ________________________________ From: Barry Smith Sent: 14 October 2025 17:33:42 To: Moral Sanchez, Elena Cc: petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Sorry, I was not clear, use the options -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned don't use the KSPSetConvergenceTest(). Barry On Oct 14, 2025, at 10:03?AM, Moral Sanchez, Elena wrote: Dear Barry, Sorry for the delay in my answer. I am using petsc4py. I tried to find an equivalent KSP method to replicate your line: PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); I tried to use KSP.setConvergenceTest ( https://urldefense.us/v3/__https://petsc.org/main/petsc4py/reference/petsc4py.PETSc.KSP.html*petsc4py.PETSc.KSP.setConvergenceTest__;Iw!!G_uCfscf7eWS!fGYp2C8AXd8R-o5y0VxH8O9NWrITZ01gW2020m422eI83rx2d-KAHjPnL_mOxjp3w6a7NTAAPW_romt6eQsC35r0T-_np009KNXy$ ). The first argument must be a callable but I am uncertain at what to pass to replicate KSPConvergedSkip. Cheers, Elena ________________________________ From: Barry Smith > Sent: 10 October 2025 20:16:33 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Elana, Were you able to try the options below? Thanks for reporting the problem, since this is a problem others will face I have attempted to update/fix the PETSc code to make it absolutely clear when no convergence testing is done with https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8777__;!!G_uCfscf7eWS!fGYp2C8AXd8R-o5y0VxH8O9NWrITZ01gW2020m422eI83rx2d-KAHjPnL_mOxjp3w6a7NTAAPW_romt6eQsC35r0T-_npzdcQwIT$ Barry On Oct 7, 2025, at 10:53?AM, Barry Smith > wrote: I have to apologize again. What you are doing is so out of the ordinary (but there is nothing wrong with you doing it) that I totally lost this line of code PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); Please try the following, add the options -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned Barry On Oct 7, 2025, at 4:12?AM, Moral Sanchez, Elena > wrote: The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. Elena ________________________________ From: Mark Adams > Sent: 01 October 2025 13:25:14 To: Barry Smith Cc: Moral Sanchez, Elena; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. 
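For reference on the petsc4py question above: the C line Barry quotes is the one PCMG applies to the level smoothers, and KSPConvergedSkip is a test that ignores rtol/atol and only reports convergence once the iteration limit is hit, which matches the "converged due to CONVERGED_ITS" lines seen for mg_levels_1_. If one did want to reproduce that test through KSP.setConvergenceTest, a minimal sketch, assuming the petsc4py callable receives (ksp, its, rnorm) and returns a ConvergedReason value, could look like this (level_ksp is a placeholder for the smoother KSP, not a name from this thread):

    from petsc4py import PETSc

    def skip_test(ksp, its, rnorm):
        # Mimic KSPConvergedSkip: ignore the residual norm entirely and
        # only report convergence once the iteration limit is reached.
        rtol, atol, dtol, max_it = ksp.getTolerances()
        if its >= max_it:
            return PETSc.KSP.ConvergedReason.CONVERGED_ITS  # the "4" seen in the logs
        return 0  # KSP_CONVERGED_ITERATING: keep iterating

    # level_ksp.setConvergenceTest(skip_test)   # illustrative only

As Barry suggests above, though, the recommended route for the original problem is the opposite: do not call setConvergenceTest at all and instead pass -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned so the level smoothers get the normal tolerance-based test back.
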
On Tue, Sep 30, 2025 at 9:27?AM Barry Smith > wrote: Would you be able to share your code? I'm at a loss as to why we are seeing this behavior and can much more quickly figure it out by running the code in a debugger. Barry You can send the code petsc-maint at mcs.anl.gov if you don't want to share the code with everyone, On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena > wrote: This is what I get: Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.249726733143e+00 Residual norms for mg_levels_1_ solve. 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.433120400946e+00 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 2 KSP Residual norm 1.169262560123e+00 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 3 KSP Residual norm 1.323528716607e+00 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 4 KSP Residual norm 5.006323254234e-01 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 5 KSP Residual norm 3.569836784785e-01 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 6 KSP Residual norm 2.493182937513e-01 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 7 KSP Residual norm 3.038202502298e-01 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 8 KSP Residual norm 2.780214194402e-01 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 9 KSP Residual norm 1.676826341491e-01 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 10 KSP Residual norm 1.209985378713e-01 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 11 KSP Residual norm 9.445076689969e-02 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 12 KSP Residual norm 8.308555284580e-02 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 13 KSP Residual norm 5.472865592585e-02 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 14 KSP Residual norm 4.357870564398e-02 14 KSP unpreconditioned resid norm 4.357870564398e-02 true resid norm 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 15 KSP Residual norm 5.079681292439e-02 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 5.079681292439e-02 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 1 KSP Residual norm 2.934938644003e-02 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 2 KSP Residual norm 3.257065831294e-02 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 3 KSP Residual norm 4.143063876867e-02 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 4 KSP Residual norm 4.822471409489e-02 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 5 KSP Residual norm 3.197538246153e-02 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 6 KSP Residual norm 3.461217019835e-02 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 7 KSP Residual norm 3.410193775327e-02 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 8 KSP Residual norm 4.690424294464e-02 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 9 KSP Residual norm 3.366148892800e-02 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 10 KSP Residual norm 4.068015727689e-02 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 11 KSP Residual norm 2.658836123104e-02 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 12 KSP Residual norm 2.826244186003e-02 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 13 KSP Residual norm 2.981793619508e-02 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 14 KSP Residual norm 3.525455091450e-02 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 15 KSP Residual norm 2.331539121838e-02 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm 2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.421498365806e-02 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.761072112362e-02 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 2 KSP Residual norm 1.400842489042e-02 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 3 KSP Residual norm 1.419665483348e-02 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 4 KSP Residual norm 1.617590701667e-02 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 5 KSP Residual norm 1.354824081005e-02 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 6 KSP Residual norm 1.387252917475e-02 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 7 KSP Residual norm 1.514043102087e-02 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 8 KSP Residual norm 1.275811124745e-02 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 9 KSP Residual norm 1.241039155981e-02 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 10 KSP Residual norm 9.585207801652e-03 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 11 KSP Residual norm 9.022641230732e-03 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 12 KSP Residual norm 1.187709152046e-02 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 13 KSP Residual norm 1.084880112494e-02 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 14 KSP Residual norm 8.194750346781e-03 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 15 KSP Residual norm 7.614246199165e-03 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 7.614246199165e-03 Residual norms for mg_levels_1_ solve. 
0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 1 KSP Residual norm 5.620014684145e-03 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 2 KSP Residual norm 6.643368363907e-03 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 3 KSP Residual norm 8.708642393659e-03 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 4 KSP Residual norm 6.401852907459e-03 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 5 KSP Residual norm 7.230576215262e-03 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 6 KSP Residual norm 6.204081601285e-03 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 7 KSP Residual norm 7.038656665944e-03 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 8 KSP Residual norm 7.194079694050e-03 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 9 KSP Residual norm 6.353576889135e-03 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 10 KSP Residual norm 7.313589502731e-03 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 11 KSP Residual norm 6.643320423193e-03 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 12 KSP Residual norm 7.235443182108e-03 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 13 KSP Residual norm 4.971292307201e-03 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 14 KSP Residual norm 5.357933842147e-03 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 15 KSP Residual norm 5.841682994497e-03 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Cheers, Elena ________________________________ From: Barry Smith > Sent: 29 September 2025 20:31:26 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Thanks. I missed something earlier in the KSPView using UNPRECONDITIONED norm type for convergence test Please add the options -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual It is using the unpreconditioned residual norms for convergence testing but we are printing the preconditioned norms. Barry On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena > wrote: This is the output: Residual norms for mg_levels_1_ solve. 
0 KSP Residual norm 2.249726733143e+00 1 KSP Residual norm 1.433120400946e+00 2 KSP Residual norm 1.169262560123e+00 3 KSP Residual norm 1.323528716607e+00 4 KSP Residual norm 5.006323254234e-01 5 KSP Residual norm 3.569836784785e-01 6 KSP Residual norm 2.493182937513e-01 7 KSP Residual norm 3.038202502298e-01 8 KSP Residual norm 2.780214194402e-01 9 KSP Residual norm 1.676826341491e-01 10 KSP Residual norm 1.209985378713e-01 11 KSP Residual norm 9.445076689969e-02 12 KSP Residual norm 8.308555284580e-02 13 KSP Residual norm 5.472865592585e-02 14 KSP Residual norm 4.357870564398e-02 15 KSP Residual norm 5.079681292439e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 5.079681292439e-02 1 KSP Residual norm 2.934938644003e-02 2 KSP Residual norm 3.257065831294e-02 3 KSP Residual norm 4.143063876867e-02 4 KSP Residual norm 4.822471409489e-02 5 KSP Residual norm 3.197538246153e-02 6 KSP Residual norm 3.461217019835e-02 7 KSP Residual norm 3.410193775327e-02 8 KSP Residual norm 4.690424294464e-02 9 KSP Residual norm 3.366148892800e-02 10 KSP Residual norm 4.068015727689e-02 11 KSP Residual norm 2.658836123104e-02 12 KSP Residual norm 2.826244186003e-02 13 KSP Residual norm 2.981793619508e-02 14 KSP Residual norm 3.525455091450e-02 15 KSP Residual norm 2.331539121838e-02 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 2.421498365806e-02 1 KSP Residual norm 1.761072112362e-02 2 KSP Residual norm 1.400842489042e-02 3 KSP Residual norm 1.419665483348e-02 4 KSP Residual norm 1.617590701667e-02 5 KSP Residual norm 1.354824081005e-02 6 KSP Residual norm 1.387252917475e-02 7 KSP Residual norm 1.514043102087e-02 8 KSP Residual norm 1.275811124745e-02 9 KSP Residual norm 1.241039155981e-02 10 KSP Residual norm 9.585207801652e-03 11 KSP Residual norm 9.022641230732e-03 12 KSP Residual norm 1.187709152046e-02 13 KSP Residual norm 1.084880112494e-02 14 KSP Residual norm 8.194750346781e-03 15 KSP Residual norm 7.614246199165e-03 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 Residual norms for mg_levels_1_ solve. 0 KSP Residual norm 7.614246199165e-03 1 KSP Residual norm 5.620014684145e-03 2 KSP Residual norm 6.643368363907e-03 3 KSP Residual norm 8.708642393659e-03 4 KSP Residual norm 6.401852907459e-03 5 KSP Residual norm 7.230576215262e-03 6 KSP Residual norm 6.204081601285e-03 7 KSP Residual norm 7.038656665944e-03 8 KSP Residual norm 7.194079694050e-03 9 KSP Residual norm 6.353576889135e-03 10 KSP Residual norm 7.313589502731e-03 11 KSP Residual norm 6.643320423193e-03 12 KSP Residual norm 7.235443182108e-03 13 KSP Residual norm 4.971292307201e-03 14 KSP Residual norm 5.357933842147e-03 15 KSP Residual norm 5.841682994497e-03 Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 ________________________________ From: Barry Smith > Sent: 29 September 2025 15:56:33 To: Moral Sanchez, Elena Cc: Mark Adams; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level I asked you to run with -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason you chose not to, delaying the process of understanding what is happening. Please run with those options and send the output. 
My guess is that you are computing the "residual norms" in your own monitor code, and it is doing so differently than what PETSc does, thus resulting in the appearance of a sufficiently small residual norm, whereas PETSc may not have calculated something that small. Barry On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena > wrote: Thanks for the hint. I agree that the coarse solve should be much more "accurate". However, for the moment I am just trying to understand what the MG is doing exactly. I am puzzled to see that the fine grid smoother ("lvl 0") does not stop when the residual becomes less than 1e-1. It should converge due to the atol. ________________________________ From: Mark Adams > Sent: 29 September 2025 14:20:56 To: Moral Sanchez, Elena Cc: Barry Smith; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Oh I see the coarse grid solver in your full solver output now. You still want an accurate coarse grid solve. Usually (the default in GAMG) you use a direct solver on one process, and cousin until the coarse grid is small enough to make that cheap. On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena > wrote: Hi, I doubled the system size and changed the tolerances just to show a better example of the problem. This is the output of the callbacks in the first iteration: CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 4.07e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s MG 
lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 4 CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 9.02e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s ConvergedReason MG lvl 0: 4 CG ConvergedReason: -3 For completeness, I add here the -ksp_view of the whole solver: KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=1, nonzero initial guess tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: mg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level 0 ------------------------------- KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=15, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=15, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=884, cols=884 Python: Solver_petsc.LeastSquaresOperator Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=884, cols=884 Python: Solver_petsc.LeastSquaresOperator Regarding Mark's Email: What do you mean with "the whole solver doesn't have a coarse grid"? I am using my own Restriction and Interpolation operators. Thanks for the help, Elena ________________________________ From: Mark Adams > Sent: 28 September 2025 20:13:54 To: Barry Smith Cc: Moral Sanchez, Elena; petsc-users Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Not sure why your "whole"solver does not have a coarse grid but this is wrong: KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 The coarse grid has to be accurate. The defaults are a good place to start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) On Fri, Sep 26, 2025 at 3:21?PM Barry Smith > wrote: Looks reasonable. 
Send the output running with -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena > wrote: Dear Barry, This is -ksp_view for the smoother at the finest level: KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=10, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator And at the coarsest level: KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=344, cols=344 Python: Solver_petsc.LeastSquaresOperator And for the whole solver: KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=100, nonzero initial guess tolerances: relative=1e-08, absolute=1e-09, divergence=10000. left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: mg type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level 0 ------------------------------- KSP Object: (mg_coarse_) 1 MPI process type: cg variant HERMITIAN maximum iterations=100, initial guess is zero tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_coarse_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=344, cols=344 Python: Solver_petsc.LeastSquaresOperator Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI process type: cg variant HERMITIAN maximum iterations=10, nonzero initial guess tolerances: relative=0.1, absolute=0.1, divergence=1e+30 left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (mg_levels_1_) 1 MPI process type: none linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI process type: python rows=524, cols=524 Python: Solver_petsc.LeastSquaresOperator Best, Elena ________________________________ From: Barry Smith > Sent: 26 September 2025 19:05:02 To: Moral Sanchez, Elena Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level Send the output using -ksp_view Normally one uses a fixed number of iterations of smoothing on level with multigrid rather than a tolerance, but yes PETSc should respect such a tolerance. Barry On Sep 26, 2025, at 12:49?PM, Moral Sanchez, Elena > wrote: Hi, I am using multigrid (multiplicative) as a preconditioner with a V-cycle of two levels. At each level, I am setting CG as the smoother with certain tolerance. 
What I observe is that in the finest level the CG continues iterating after the residual norm reaches the tolerance (atol) and it only stops when reaching the maximum number of iterations at that level. At the coarsest level this does not occur and the CG stops when the tolerance is reached. I double-checked that the smoother at the finest level has the right tolerance. And I am using a Monitor function to track the residual. Do you know how to make the smoother at the finest level stop when reaching the tolerance? Cheers, Elena. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Oct 14 12:19:25 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 14 Oct 2025 13:19:25 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <98106a9a5fbb4129b9d0aa8721805b18@ipp.mpg.de> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> <98106a9a5fbb4129b9d0aa8721805b18@ipp.mpg.de> Message-ID: <5687B985-6FB4-4DE0-9151-AEE034171185@petsc.dev> It shouldn't print anything but it should change the behavior of the convergence test so that it behaves as you expected (that is the convergence test is used on the levels). Make sure to call KSPSetFromOptions() just before KSPSolve() and after KSPSetUp(). > On Oct 14, 2025, at 12:45?PM, Moral Sanchez, Elena wrote: > > Unfortunately it did not print anything. Maybe something else is missing? > > Elena > From: Barry Smith > > Sent: 14 October 2025 17:33:42 > To: Moral Sanchez, Elena > Cc: petsc-users > Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level > > > Sorry, I was not clear, use the options > >> -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned > > don't use the KSPSetConvergenceTest(). > > Barry > > >> On Oct 14, 2025, at 10:03?AM, Moral Sanchez, Elena > wrote: >> >> Dear Barry, >> >> Sorry for the delay in my answer. >> >> I am using petsc4py. I tried to find an equivalent KSP method to replicate your line: >> >> PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); >> >> I tried to use KSP.setConvergenceTest ( https://urldefense.us/v3/__https://petsc.org/main/petsc4py/reference/petsc4py.PETSc.KSP.html*petsc4py.PETSc.KSP.setConvergenceTest__;Iw!!G_uCfscf7eWS!e0QWzuvCZMiCBVF4XRgmz92m8x9MX08mhe6x3UX7secgkU3um8_FtasngEunQz_GsBUvyr5ultwDOeOvHYkZJA4$ ). The first argument must be a callable but I am uncertain at what to pass to replicate KSPConvergedSkip. >> >> Cheers, >> Elena >> >> From: Barry Smith > >> Sent: 10 October 2025 20:16:33 >> To: Moral Sanchez, Elena >> Cc: Mark Adams; petsc-users >> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >> >> >> Elana, >> >> Were you able to try the options below? 
>> >> Thanks for reporting the problem, since this is a problem others will face I have attempted to update/fix the PETSc code to make it absolutely clear when no convergence testing is done with https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8777__;!!G_uCfscf7eWS!e0QWzuvCZMiCBVF4XRgmz92m8x9MX08mhe6x3UX7secgkU3um8_FtasngEunQz_GsBUvyr5ultwDOeOvYlzVOGY$ >> >> Barry >> >> >>> On Oct 7, 2025, at 10:53?AM, Barry Smith > wrote: >>> >>> >>> I have to apologize again. What you are doing is so out of the ordinary (but there is nothing wrong with you doing it) that I totally lost this line of code >>> >>> PetscCall(KSPSetConvergenceTest(mglevels[i]->smoothd, KSPConvergedSkip, NULL, NULL)); >>> >>> Please try the following, add the options >>> >>> -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned >>> >>> Barry >>> >>> >>> >>> >>> >>> >>>> On Oct 7, 2025, at 4:12?AM, Moral Sanchez, Elena > wrote: >>>> >>>> The problem is that the fine grid solver is iterating past the prescribed tolerance. It iterates until the maximum number of iterations has been achieved. >>>> >>>> Elena >>>> >>>> >>>> From: Mark Adams > >>>> Sent: 01 October 2025 13:25:14 >>>> To: Barry Smith >>>> Cc: Moral Sanchez, Elena; petsc-users >>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>> >>>> Sorry to jump in, but what is the problem here? This looks fine to me, other than the coarse grid solver that I mentioned. >>>> >>>> On Tue, Sep 30, 2025 at 9:27?AM Barry Smith > wrote: >>>>> >>>>> Would you be able to share your code? I'm at a loss as to why we are seeing this behavior and can much more quickly figure it out by running the code in a debugger. >>>>> >>>>> Barry >>>>> >>>>> You can send the code petsc-maint at mcs.anl.gov if you don't want to share the code with everyone, >>>>> >>>>>> On Sep 30, 2025, at 5:05?AM, Moral Sanchez, Elena > wrote: >>>>>> >>>>>> This is what I get: >>>>>> Residual norms for mg_levels_1_ solve. >>>>>> 0 KSP Residual norm 2.249726733143e+00 >>>>>> Residual norms for mg_levels_1_ solve. 
>>>>>> 0 KSP unpreconditioned resid norm 2.249726733143e+00 true resid norm 2.249726733143e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP Residual norm 1.433120400946e+00 >>>>>> 1 KSP unpreconditioned resid norm 1.433120400946e+00 true resid norm 1.433120400946e+00 ||r(i)||/||b|| 6.370197677051e-01 >>>>>> 2 KSP Residual norm 1.169262560123e+00 >>>>>> 2 KSP unpreconditioned resid norm 1.169262560123e+00 true resid norm 1.169262560123e+00 ||r(i)||/||b|| 5.197353718108e-01 >>>>>> 3 KSP Residual norm 1.323528716607e+00 >>>>>> 3 KSP unpreconditioned resid norm 1.323528716607e+00 true resid norm 1.323528716607e+00 ||r(i)||/||b|| 5.883064361148e-01 >>>>>> 4 KSP Residual norm 5.006323254234e-01 >>>>>> 4 KSP unpreconditioned resid norm 5.006323254234e-01 true resid norm 5.006323254234e-01 ||r(i)||/||b|| 2.225302824775e-01 >>>>>> 5 KSP Residual norm 3.569836784785e-01 >>>>>> 5 KSP unpreconditioned resid norm 3.569836784785e-01 true resid norm 3.569836784785e-01 ||r(i)||/||b|| 1.586786844906e-01 >>>>>> 6 KSP Residual norm 2.493182937513e-01 >>>>>> 6 KSP unpreconditioned resid norm 2.493182937513e-01 true resid norm 2.493182937513e-01 ||r(i)||/||b|| 1.108215900529e-01 >>>>>> 7 KSP Residual norm 3.038202502298e-01 >>>>>> 7 KSP unpreconditioned resid norm 3.038202502298e-01 true resid norm 3.038202502298e-01 ||r(i)||/||b|| 1.350476241198e-01 >>>>>> 8 KSP Residual norm 2.780214194402e-01 >>>>>> 8 KSP unpreconditioned resid norm 2.780214194402e-01 true resid norm 2.780214194402e-01 ||r(i)||/||b|| 1.235800843473e-01 >>>>>> 9 KSP Residual norm 1.676826341491e-01 >>>>>> 9 KSP unpreconditioned resid norm 1.676826341491e-01 true resid norm 1.676826341491e-01 ||r(i)||/||b|| 7.453466755710e-02 >>>>>> 10 KSP Residual norm 1.209985378713e-01 >>>>>> 10 KSP unpreconditioned resid norm 1.209985378713e-01 true resid norm 1.209985378713e-01 ||r(i)||/||b|| 5.378366007245e-02 >>>>>> 11 KSP Residual norm 9.445076689969e-02 >>>>>> 11 KSP unpreconditioned resid norm 9.445076689969e-02 true resid norm 9.445076689969e-02 ||r(i)||/||b|| 4.198321756516e-02 >>>>>> 12 KSP Residual norm 8.308555284580e-02 >>>>>> 12 KSP unpreconditioned resid norm 8.308555284580e-02 true resid norm 8.308555284580e-02 ||r(i)||/||b|| 3.693139776569e-02 >>>>>> 13 KSP Residual norm 5.472865592585e-02 >>>>>> 13 KSP unpreconditioned resid norm 5.472865592585e-02 true resid norm 5.472865592585e-02 ||r(i)||/||b|| 2.432680161532e-02 >>>>>> 14 KSP Residual norm 4.357870564398e-02 >>>>>> 14 KSP unpreconditioned resid norm 4.357870564398e-02 true resid norm 4.357870564398e-02 ||r(i)||/||b|| 1.937066622447e-02 >>>>>> 15 KSP Residual norm 5.079681292439e-02 >>>>>> 15 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357558e-02 >>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>> Residual norms for mg_levels_1_ solve. >>>>>> 0 KSP Residual norm 5.079681292439e-02 >>>>>> Residual norms for mg_levels_1_ solve. 
>>>>>> 0 KSP unpreconditioned resid norm 5.079681292439e-02 true resid norm 5.079681292439e-02 ||r(i)||/||b|| 2.257910357559e-02 >>>>>> 1 KSP Residual norm 2.934938644003e-02 >>>>>> 1 KSP unpreconditioned resid norm 2.934938644003e-02 true resid norm 2.934938644003e-02 ||r(i)||/||b|| 1.304575618348e-02 >>>>>> 2 KSP Residual norm 3.257065831294e-02 >>>>>> 2 KSP unpreconditioned resid norm 3.257065831294e-02 true resid norm 3.257065831294e-02 ||r(i)||/||b|| 1.447760647243e-02 >>>>>> 3 KSP Residual norm 4.143063876867e-02 >>>>>> 3 KSP unpreconditioned resid norm 4.143063876867e-02 true resid norm 4.143063876867e-02 ||r(i)||/||b|| 1.841585387164e-02 >>>>>> 4 KSP Residual norm 4.822471409489e-02 >>>>>> 4 KSP unpreconditioned resid norm 4.822471409489e-02 true resid norm 4.822471409489e-02 ||r(i)||/||b|| 2.143580968499e-02 >>>>>> 5 KSP Residual norm 3.197538246153e-02 >>>>>> 5 KSP unpreconditioned resid norm 3.197538246153e-02 true resid norm 3.197538246153e-02 ||r(i)||/||b|| 1.421300729127e-02 >>>>>> 6 KSP Residual norm 3.461217019835e-02 >>>>>> 6 KSP unpreconditioned resid norm 3.461217019835e-02 true resid norm 3.461217019835e-02 ||r(i)||/||b|| 1.538505529958e-02 >>>>>> 7 KSP Residual norm 3.410193775327e-02 >>>>>> 7 KSP unpreconditioned resid norm 3.410193775327e-02 true resid norm 3.410193775327e-02 ||r(i)||/||b|| 1.515825777899e-02 >>>>>> 8 KSP Residual norm 4.690424294464e-02 >>>>>> 8 KSP unpreconditioned resid norm 4.690424294464e-02 true resid norm 4.690424294464e-02 ||r(i)||/||b|| 2.084886233233e-02 >>>>>> 9 KSP Residual norm 3.366148892800e-02 >>>>>> 9 KSP unpreconditioned resid norm 3.366148892800e-02 true resid norm 3.366148892800e-02 ||r(i)||/||b|| 1.496247896783e-02 >>>>>> 10 KSP Residual norm 4.068015727689e-02 >>>>>> 10 KSP unpreconditioned resid norm 4.068015727689e-02 true resid norm 4.068015727689e-02 ||r(i)||/||b|| 1.808226602707e-02 >>>>>> 11 KSP Residual norm 2.658836123104e-02 >>>>>> 11 KSP unpreconditioned resid norm 2.658836123104e-02 true resid norm 2.658836123104e-02 ||r(i)||/||b|| 1.181848481389e-02 >>>>>> 12 KSP Residual norm 2.826244186003e-02 >>>>>> 12 KSP unpreconditioned resid norm 2.826244186003e-02 true resid norm 2.826244186003e-02 ||r(i)||/||b|| 1.256261102456e-02 >>>>>> 13 KSP Residual norm 2.981793619508e-02 >>>>>> 13 KSP unpreconditioned resid norm 2.981793619508e-02 true resid norm 2.981793619508e-02 ||r(i)||/||b|| 1.325402581380e-02 >>>>>> 14 KSP Residual norm 3.525455091450e-02 >>>>>> 14 KSP unpreconditioned resid norm 3.525455091450e-02 true resid norm 3.525455091450e-02 ||r(i)||/||b|| 1.567059251914e-02 >>>>>> 15 KSP Residual norm 2.331539121838e-02 >>>>>> 15 KSP unpreconditioned resid norm 2.331539121838e-02 true resid norm 2.331539121838e-02 ||r(i)||/||b|| 1.036365478300e-02 >>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>> Residual norms for mg_levels_1_ solve. >>>>>> 0 KSP Residual norm 2.421498365806e-02 >>>>>> Residual norms for mg_levels_1_ solve. 
>>>>>> 0 KSP unpreconditioned resid norm 2.421498365806e-02 true resid norm 2.421498365806e-02 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP Residual norm 1.761072112362e-02 >>>>>> 1 KSP unpreconditioned resid norm 1.761072112362e-02 true resid norm 1.761072112362e-02 ||r(i)||/||b|| 7.272654556492e-01 >>>>>> 2 KSP Residual norm 1.400842489042e-02 >>>>>> 2 KSP unpreconditioned resid norm 1.400842489042e-02 true resid norm 1.400842489042e-02 ||r(i)||/||b|| 5.785023474818e-01 >>>>>> 3 KSP Residual norm 1.419665483348e-02 >>>>>> 3 KSP unpreconditioned resid norm 1.419665483348e-02 true resid norm 1.419665483348e-02 ||r(i)||/||b|| 5.862756314004e-01 >>>>>> 4 KSP Residual norm 1.617590701667e-02 >>>>>> 4 KSP unpreconditioned resid norm 1.617590701667e-02 true resid norm 1.617590701667e-02 ||r(i)||/||b|| 6.680123036665e-01 >>>>>> 5 KSP Residual norm 1.354824081005e-02 >>>>>> 5 KSP unpreconditioned resid norm 1.354824081005e-02 true resid norm 1.354824081005e-02 ||r(i)||/||b|| 5.594982429624e-01 >>>>>> 6 KSP Residual norm 1.387252917475e-02 >>>>>> 6 KSP unpreconditioned resid norm 1.387252917475e-02 true resid norm 1.387252917475e-02 ||r(i)||/||b|| 5.728902967950e-01 >>>>>> 7 KSP Residual norm 1.514043102087e-02 >>>>>> 7 KSP unpreconditioned resid norm 1.514043102087e-02 true resid norm 1.514043102087e-02 ||r(i)||/||b|| 6.252505157414e-01 >>>>>> 8 KSP Residual norm 1.275811124745e-02 >>>>>> 8 KSP unpreconditioned resid norm 1.275811124745e-02 true resid norm 1.275811124745e-02 ||r(i)||/||b|| 5.268684640721e-01 >>>>>> 9 KSP Residual norm 1.241039155981e-02 >>>>>> 9 KSP unpreconditioned resid norm 1.241039155981e-02 true resid norm 1.241039155981e-02 ||r(i)||/||b|| 5.125087728764e-01 >>>>>> 10 KSP Residual norm 9.585207801652e-03 >>>>>> 10 KSP unpreconditioned resid norm 9.585207801652e-03 true resid norm 9.585207801652e-03 ||r(i)||/||b|| 3.958378802565e-01 >>>>>> 11 KSP Residual norm 9.022641230732e-03 >>>>>> 11 KSP unpreconditioned resid norm 9.022641230732e-03 true resid norm 9.022641230732e-03 ||r(i)||/||b|| 3.726057121550e-01 >>>>>> 12 KSP Residual norm 1.187709152046e-02 >>>>>> 12 KSP unpreconditioned resid norm 1.187709152046e-02 true resid norm 1.187709152046e-02 ||r(i)||/||b|| 4.904852172597e-01 >>>>>> 13 KSP Residual norm 1.084880112494e-02 >>>>>> 13 KSP unpreconditioned resid norm 1.084880112494e-02 true resid norm 1.084880112494e-02 ||r(i)||/||b|| 4.480201712351e-01 >>>>>> 14 KSP Residual norm 8.194750346781e-03 >>>>>> 14 KSP unpreconditioned resid norm 8.194750346781e-03 true resid norm 8.194750346781e-03 ||r(i)||/||b|| 3.384165136140e-01 >>>>>> 15 KSP Residual norm 7.614246199165e-03 >>>>>> 15 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 >>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>> Residual norms for mg_levels_1_ solve. >>>>>> 0 KSP Residual norm 7.614246199165e-03 >>>>>> Residual norms for mg_levels_1_ solve. 
>>>>>> 0 KSP unpreconditioned resid norm 7.614246199165e-03 true resid norm 7.614246199165e-03 ||r(i)||/||b|| 3.144435819857e-01 >>>>>> 1 KSP Residual norm 5.620014684145e-03 >>>>>> 1 KSP unpreconditioned resid norm 5.620014684145e-03 true resid norm 5.620014684145e-03 ||r(i)||/||b|| 2.320883120759e-01 >>>>>> 2 KSP Residual norm 6.643368363907e-03 >>>>>> 2 KSP unpreconditioned resid norm 6.643368363907e-03 true resid norm 6.643368363907e-03 ||r(i)||/||b|| 2.743494878096e-01 >>>>>> 3 KSP Residual norm 8.708642393659e-03 >>>>>> 3 KSP unpreconditioned resid norm 8.708642393659e-03 true resid norm 8.708642393659e-03 ||r(i)||/||b|| 3.596385823189e-01 >>>>>> 4 KSP Residual norm 6.401852907459e-03 >>>>>> 4 KSP unpreconditioned resid norm 6.401852907459e-03 true resid norm 6.401852907459e-03 ||r(i)||/||b|| 2.643756856440e-01 >>>>>> 5 KSP Residual norm 7.230576215262e-03 >>>>>> 5 KSP unpreconditioned resid norm 7.230576215262e-03 true resid norm 7.230576215262e-03 ||r(i)||/||b|| 2.985992605803e-01 >>>>>> 6 KSP Residual norm 6.204081601285e-03 >>>>>> 6 KSP unpreconditioned resid norm 6.204081601285e-03 true resid norm 6.204081601285e-03 ||r(i)||/||b|| 2.562083744880e-01 >>>>>> 7 KSP Residual norm 7.038656665944e-03 >>>>>> 7 KSP unpreconditioned resid norm 7.038656665944e-03 true resid norm 7.038656665944e-03 ||r(i)||/||b|| 2.906736079337e-01 >>>>>> 8 KSP Residual norm 7.194079694050e-03 >>>>>> 8 KSP unpreconditioned resid norm 7.194079694050e-03 true resid norm 7.194079694050e-03 ||r(i)||/||b|| 2.970920730585e-01 >>>>>> 9 KSP Residual norm 6.353576889135e-03 >>>>>> 9 KSP unpreconditioned resid norm 6.353576889135e-03 true resid norm 6.353576889135e-03 ||r(i)||/||b|| 2.623820432363e-01 >>>>>> 10 KSP Residual norm 7.313589502731e-03 >>>>>> 10 KSP unpreconditioned resid norm 7.313589502731e-03 true resid norm 7.313589502731e-03 ||r(i)||/||b|| 3.020274391264e-01 >>>>>> 11 KSP Residual norm 6.643320423193e-03 >>>>>> 11 KSP unpreconditioned resid norm 6.643320423193e-03 true resid norm 6.643320423193e-03 ||r(i)||/||b|| 2.743475080142e-01 >>>>>> 12 KSP Residual norm 7.235443182108e-03 >>>>>> 12 KSP unpreconditioned resid norm 7.235443182108e-03 true resid norm 7.235443182108e-03 ||r(i)||/||b|| 2.988002504681e-01 >>>>>> 13 KSP Residual norm 4.971292307201e-03 >>>>>> 13 KSP unpreconditioned resid norm 4.971292307201e-03 true resid norm 4.971292307201e-03 ||r(i)||/||b|| 2.052981896416e-01 >>>>>> 14 KSP Residual norm 5.357933842147e-03 >>>>>> 14 KSP unpreconditioned resid norm 5.357933842147e-03 true resid norm 5.357933842147e-03 ||r(i)||/||b|| 2.212652264320e-01 >>>>>> 15 KSP Residual norm 5.841682994497e-03 >>>>>> 15 KSP unpreconditioned resid norm 5.841682994497e-03 true resid norm 5.841682994497e-03 ||r(i)||/||b|| 2.412424917146e-01 >>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>> Cheers, >>>>>> Elena >>>>>> >>>>>> From: Barry Smith > >>>>>> Sent: 29 September 2025 20:31:26 >>>>>> To: Moral Sanchez, Elena >>>>>> Cc: Mark Adams; petsc-users >>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>> >>>>>> >>>>>> Thanks. I missed something earlier in the KSPView >>>>>> >>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>> >>>>>> Please add the options >>>>>> >>>>>>>>>> -ksp_monitor_true_residual -mg_levels_ksp_monitor_true_residual >>>>>> >>>>>> It is using the unpreconditioned residual norms for convergence testing but we are printing the preconditioned norms. 
>>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>>> On Sep 29, 2025, at 11:12?AM, Moral Sanchez, Elena > wrote: >>>>>>> >>>>>>> This is the output: >>>>>>> Residual norms for mg_levels_1_ solve. >>>>>>> 0 KSP Residual norm 2.249726733143e+00 >>>>>>> 1 KSP Residual norm 1.433120400946e+00 >>>>>>> 2 KSP Residual norm 1.169262560123e+00 >>>>>>> 3 KSP Residual norm 1.323528716607e+00 >>>>>>> 4 KSP Residual norm 5.006323254234e-01 >>>>>>> 5 KSP Residual norm 3.569836784785e-01 >>>>>>> 6 KSP Residual norm 2.493182937513e-01 >>>>>>> 7 KSP Residual norm 3.038202502298e-01 >>>>>>> 8 KSP Residual norm 2.780214194402e-01 >>>>>>> 9 KSP Residual norm 1.676826341491e-01 >>>>>>> 10 KSP Residual norm 1.209985378713e-01 >>>>>>> 11 KSP Residual norm 9.445076689969e-02 >>>>>>> 12 KSP Residual norm 8.308555284580e-02 >>>>>>> 13 KSP Residual norm 5.472865592585e-02 >>>>>>> 14 KSP Residual norm 4.357870564398e-02 >>>>>>> 15 KSP Residual norm 5.079681292439e-02 >>>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>>> Residual norms for mg_levels_1_ solve. >>>>>>> 0 KSP Residual norm 5.079681292439e-02 >>>>>>> 1 KSP Residual norm 2.934938644003e-02 >>>>>>> 2 KSP Residual norm 3.257065831294e-02 >>>>>>> 3 KSP Residual norm 4.143063876867e-02 >>>>>>> 4 KSP Residual norm 4.822471409489e-02 >>>>>>> 5 KSP Residual norm 3.197538246153e-02 >>>>>>> 6 KSP Residual norm 3.461217019835e-02 >>>>>>> 7 KSP Residual norm 3.410193775327e-02 >>>>>>> 8 KSP Residual norm 4.690424294464e-02 >>>>>>> 9 KSP Residual norm 3.366148892800e-02 >>>>>>> 10 KSP Residual norm 4.068015727689e-02 >>>>>>> 11 KSP Residual norm 2.658836123104e-02 >>>>>>> 12 KSP Residual norm 2.826244186003e-02 >>>>>>> 13 KSP Residual norm 2.981793619508e-02 >>>>>>> 14 KSP Residual norm 3.525455091450e-02 >>>>>>> 15 KSP Residual norm 2.331539121838e-02 >>>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>>> Residual norms for mg_levels_1_ solve. >>>>>>> 0 KSP Residual norm 2.421498365806e-02 >>>>>>> 1 KSP Residual norm 1.761072112362e-02 >>>>>>> 2 KSP Residual norm 1.400842489042e-02 >>>>>>> 3 KSP Residual norm 1.419665483348e-02 >>>>>>> 4 KSP Residual norm 1.617590701667e-02 >>>>>>> 5 KSP Residual norm 1.354824081005e-02 >>>>>>> 6 KSP Residual norm 1.387252917475e-02 >>>>>>> 7 KSP Residual norm 1.514043102087e-02 >>>>>>> 8 KSP Residual norm 1.275811124745e-02 >>>>>>> 9 KSP Residual norm 1.241039155981e-02 >>>>>>> 10 KSP Residual norm 9.585207801652e-03 >>>>>>> 11 KSP Residual norm 9.022641230732e-03 >>>>>>> 12 KSP Residual norm 1.187709152046e-02 >>>>>>> 13 KSP Residual norm 1.084880112494e-02 >>>>>>> 14 KSP Residual norm 8.194750346781e-03 >>>>>>> 15 KSP Residual norm 7.614246199165e-03 >>>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>>> Residual norms for mg_levels_1_ solve. 
>>>>>>> 0 KSP Residual norm 7.614246199165e-03 >>>>>>> 1 KSP Residual norm 5.620014684145e-03 >>>>>>> 2 KSP Residual norm 6.643368363907e-03 >>>>>>> 3 KSP Residual norm 8.708642393659e-03 >>>>>>> 4 KSP Residual norm 6.401852907459e-03 >>>>>>> 5 KSP Residual norm 7.230576215262e-03 >>>>>>> 6 KSP Residual norm 6.204081601285e-03 >>>>>>> 7 KSP Residual norm 7.038656665944e-03 >>>>>>> 8 KSP Residual norm 7.194079694050e-03 >>>>>>> 9 KSP Residual norm 6.353576889135e-03 >>>>>>> 10 KSP Residual norm 7.313589502731e-03 >>>>>>> 11 KSP Residual norm 6.643320423193e-03 >>>>>>> 12 KSP Residual norm 7.235443182108e-03 >>>>>>> 13 KSP Residual norm 4.971292307201e-03 >>>>>>> 14 KSP Residual norm 5.357933842147e-03 >>>>>>> 15 KSP Residual norm 5.841682994497e-03 >>>>>>> Linear mg_levels_1_ solve converged due to CONVERGED_ITS iterations 15 >>>>>>> >>>>>>> >>>>>>> From: Barry Smith > >>>>>>> Sent: 29 September 2025 15:56:33 >>>>>>> To: Moral Sanchez, Elena >>>>>>> Cc: Mark Adams; petsc-users >>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>>> >>>>>>> >>>>>>> I asked you to run with >>>>>>> >>>>>>>>>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason >>>>>>> >>>>>>> you chose not to, delaying the process of understanding what is happening. >>>>>>> >>>>>>> Please run with those options and send the output. My guess is that you are computing the "residual norms" in your own monitor code, and it is doing so differently than what PETSc does, thus resulting in the appearance of a sufficiently small residual norm, whereas PETSc may not have calculated something that small. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>>> On Sep 29, 2025, at 8:39?AM, Moral Sanchez, Elena > wrote: >>>>>>>> >>>>>>>> Thanks for the hint. I agree that the coarse solve should be much more "accurate". However, for the moment I am just trying to understand what the MG is doing exactly. >>>>>>>> >>>>>>>> I am puzzled to see that the fine grid smoother ("lvl 0") does not stop when the residual becomes less than 1e-1. It should converge due to the atol. >>>>>>>> >>>>>>>> From: Mark Adams > >>>>>>>> Sent: 29 September 2025 14:20:56 >>>>>>>> To: Moral Sanchez, Elena >>>>>>>> Cc: Barry Smith; petsc-users >>>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>>>> >>>>>>>> Oh I see the coarse grid solver in your full solver output now. >>>>>>>> You still want an accurate coarse grid solve. Usually (the default in GAMG) you use a direct solver on one process, and cousin until the coarse grid is small enough to make that cheap. >>>>>>>> >>>>>>>> On Mon, Sep 29, 2025 at 8:07?AM Moral Sanchez, Elena > wrote: >>>>>>>>> Hi, I doubled the system size and changed the tolerances just to show a better example of the problem. 
This is the output of the callbacks in the first iteration: >>>>>>>>> CG Iter 0/1 | res = 2.25e+00/1.00e-09 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.25e+00/1.00e-01 | 0.3 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.43e+00/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.17e+00/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.32e+00/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 5.01e-01/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.57e-01/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 2.49e-01/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.04e-01/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.68e-01/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 1.21e-01/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.45e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 8.31e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 5.47e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 4.36e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.08e-02/1.00e-01 | 0.1 s >>>>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>>>> MG lvl -1 (s=524): CG Iter 0/15 | res = 8.15e-02/1.00e-01 | 3.0 s >>>>>>>>> ConvergedReason MG lvl -1: 3 >>>>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.08e-02/1.00e-01 | 0.3 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 2.93e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 3.26e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 4.14e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 4.82e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 3.20e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 3.46e-02/1.00e-01 | 0.3 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.41e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 4.69e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 3.37e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 4.07e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 2.66e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 2.83e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 2.98e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 3.53e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 2.33e-02/1.00e-01 | 0.2 s >>>>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>>>> CG Iter 1/1 | res = 2.42e-02/1.00e-09 | 5.6 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 2.42e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.40e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.42e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 1.62e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 1.35e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 1.39e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 1.51e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 1.28e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.24e-02/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): 
CG Iter 10/15 | res = 9.59e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 9.02e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 1.19e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 1.08e-02/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 8.19e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 7.61e-03/1.00e-01 | 0.1 s >>>>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>>>> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-02/1.00e-01 | 5.2 s >>>>>>>>> ConvergedReason MG lvl -1: 3 >>>>>>>>> MG lvl 0 (s=884): CG Iter 0/15 | res = 7.61e-03/1.00e-01 | 0.2 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 1/15 | res = 5.62e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 2/15 | res = 6.64e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 3/15 | res = 8.71e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 4/15 | res = 6.40e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 5/15 | res = 7.23e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 6/15 | res = 6.20e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 7/15 | res = 7.04e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 8/15 | res = 7.19e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 9/15 | res = 6.35e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 10/15 | res = 7.31e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 11/15 | res = 6.64e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 12/15 | res = 7.24e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 13/15 | res = 4.97e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 14/15 | res = 5.36e-03/1.00e-01 | 0.1 s >>>>>>>>> MG lvl 0 (s=884): CG Iter 15/15 | res = 5.84e-03/1.00e-01 | 0.1 s >>>>>>>>> ConvergedReason MG lvl 0: 4 >>>>>>>>> CG ConvergedReason: -3 >>>>>>>>> >>>>>>>>> For completeness, I add here the -ksp_view of the whole solver: >>>>>>>>> KSP Object: 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=1, nonzero initial guess >>>>>>>>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
>>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI process >>>>>>>>> type: mg >>>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>>>> Cycles per PCApply=1 >>>>>>>>> Not using Galerkin computed coarse grid matrices >>>>>>>>> Coarse grid solver -- level 0 ------------------------------- >>>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=15, nonzero initial guess >>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=524, cols=524 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>>> type: cg >>>>>>>>> variant HERMITIAN >>>>>>>>> maximum iterations=15, nonzero initial guess >>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>> left preconditioning >>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>>> type: none >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=884, cols=884 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI process >>>>>>>>> type: python >>>>>>>>> rows=884, cols=884 >>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>> >>>>>>>>> Regarding Mark's Email: What do you mean with "the whole solver doesn't have a coarse grid"? I am using my own Restriction and Interpolation operators. >>>>>>>>> Thanks for the help, >>>>>>>>> Elena >>>>>>>>> >>>>>>>>> From: Mark Adams > >>>>>>>>> Sent: 28 September 2025 20:13:54 >>>>>>>>> To: Barry Smith >>>>>>>>> Cc: Moral Sanchez, Elena; petsc-users >>>>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>>>>> >>>>>>>>> Not sure why your "whole"solver does not have a coarse grid but this is wrong: >>>>>>>>> >>>>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>>>> type: cg >>>>>>>>>> variant HERMITIAN >>>>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>>> >>>>>>>>>> The coarse grid has to be accurate. The defaults are a good place to start: max_it=10.000, rtol=1e-5, atol=1e-30 (ish) >>>>>>>>> >>>>>>>>> On Fri, Sep 26, 2025 at 3:21?PM Barry Smith > wrote: >>>>>>>>>> Looks reasonable. 
Send the output running with >>>>>>>>>> >>>>>>>>>> -ksp_monitor -mg_levels_ksp_monitor -ksp_converged_reason -mg_levels_ksp_converged_reason >>>>>>>>>> >>>>>>>>>>> On Sep 26, 2025, at 1:19?PM, Moral Sanchez, Elena > wrote: >>>>>>>>>>> >>>>>>>>>>> Dear Barry, >>>>>>>>>>> >>>>>>>>>>> This is -ksp_view for the smoother at the finest level: >>>>>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>>>>> type: cg >>>>>>>>>>> variant HERMITIAN >>>>>>>>>>> maximum iterations=10, nonzero initial guess >>>>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>>>> left preconditioning >>>>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>>>>> type: none >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: python >>>>>>>>>>> rows=524, cols=524 >>>>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>>>> And at the coarsest level: >>>>>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>>>>> type: cg >>>>>>>>>>> variant HERMITIAN >>>>>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>>>> left preconditioning >>>>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>>>>> type: none >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: python >>>>>>>>>>> rows=344, cols=344 >>>>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>>>> And for the whole solver: >>>>>>>>>>> KSP Object: 1 MPI process >>>>>>>>>>> type: cg >>>>>>>>>>> variant HERMITIAN >>>>>>>>>>> maximum iterations=100, nonzero initial guess >>>>>>>>>>> tolerances: relative=1e-08, absolute=1e-09, divergence=10000. 
>>>>>>>>>>> left preconditioning >>>>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: 1 MPI process >>>>>>>>>>> type: mg >>>>>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v >>>>>>>>>>> Cycles per PCApply=1 >>>>>>>>>>> Not using Galerkin computed coarse grid matrices >>>>>>>>>>> Coarse grid solver -- level 0 ------------------------------- >>>>>>>>>>> KSP Object: (mg_coarse_) 1 MPI process >>>>>>>>>>> type: cg >>>>>>>>>>> variant HERMITIAN >>>>>>>>>>> maximum iterations=100, initial guess is zero >>>>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>>>> left preconditioning >>>>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: (mg_coarse_) 1 MPI process >>>>>>>>>>> type: none >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: python >>>>>>>>>>> rows=344, cols=344 >>>>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>>>> Down solver (pre-smoother) on level 1 ------------------------------- >>>>>>>>>>> KSP Object: (mg_levels_1_) 1 MPI process >>>>>>>>>>> type: cg >>>>>>>>>>> variant HERMITIAN >>>>>>>>>>> maximum iterations=10, nonzero initial guess >>>>>>>>>>> tolerances: relative=0.1, absolute=0.1, divergence=1e+30 >>>>>>>>>>> left preconditioning >>>>>>>>>>> using UNPRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: (mg_levels_1_) 1 MPI process >>>>>>>>>>> type: none >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: python >>>>>>>>>>> rows=524, cols=524 >>>>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>>>> Up solver (post-smoother) same as down solver (pre-smoother) >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: python >>>>>>>>>>> rows=524, cols=524 >>>>>>>>>>> Python: Solver_petsc.LeastSquaresOperator >>>>>>>>>>> Best, >>>>>>>>>>> Elena >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> From: Barry Smith > >>>>>>>>>>> Sent: 26 September 2025 19:05:02 >>>>>>>>>>> To: Moral Sanchez, Elena >>>>>>>>>>> Cc: petsc-users at mcs.anl.gov >>>>>>>>>>> Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Send the output using -ksp_view >>>>>>>>>>> >>>>>>>>>>> Normally one uses a fixed number of iterations of smoothing on level with multigrid rather than a tolerance, but yes PETSc should respect such a tolerance. >>>>>>>>>>> >>>>>>>>>>> Barry >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On Sep 26, 2025, at 12:49?PM, Moral Sanchez, Elena > wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> I am using multigrid (multiplicative) as a preconditioner with a V-cycle of two levels. At each level, I am setting CG as the smoother with certain tolerance. >>>>>>>>>>>> >>>>>>>>>>>> What I observe is that in the finest level the CG continues iterating after the residual norm reaches the tolerance (atol) and it only stops when reaching the maximum number of iterations at that level. At the coarsest level this does not occur and the CG stops when the tolerance is reached. >>>>>>>>>>>> >>>>>>>>>>>> I double-checked that the smoother at the finest level has the right tolerance. And I am using a Monitor function to track the residual. >>>>>>>>>>>> >>>>>>>>>>>> Do you know how to make the smoother at the finest level stop when reaching the tolerance? >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> Elena. 
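For reference, the conventional setup described above by Barry (a fixed number of smoothing iterations per level) and Mark (an accurate coarse-level solve) can be expressed with standard PCMG options; below is a minimal petsc4py sketch, with placeholder values that are not taken from this thread:

from petsc4py import PETSc

opts = PETSc.Options()
opts["mg_levels_ksp_type"] = "cg"     # level smoother
opts["mg_levels_ksp_max_it"] = 3      # fixed number of smoothing steps (placeholder value)
opts["mg_coarse_ksp_type"] = "cg"
opts["mg_coarse_ksp_rtol"] = 1e-5     # accurate coarse solve (placeholder tolerances)
opts["mg_coarse_ksp_max_it"] = 1000

ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setType(PETSc.KSP.Type.CG)
ksp.getPC().setType(PETSc.PC.Type.MG)
# the number of levels, the per-level operators, and the user-supplied
# restriction/interpolation still have to be set as in the original code
ksp.setFromOptions()

Mark's default of a direct coarse solver would need an assembled coarse matrix; with the Python shell operator used in this thread, an iterative coarse solve with tight tolerances, as sketched here, is the closer analogue.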
-------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Oct 15 07:23:11 2025 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 15 Oct 2025 08:23:11 -0400 Subject: [petsc-users] CALL FOR PAPERS -- PASC26 Message-ID: This is an applications-oriented conference, with several application tracks as well as CSE: ============================================ CALL FOR PAPERS Platform for Advanced Scientific Computing PASC26 Universität Bern & PHBern, Switzerland June 29 - July 1, 2026 https://pasc26.pasc-conference.org Deadline: 12 Dec 2025 (no extensions) ============================================= The PASC Conference series is an international and interdisciplinary platform for the exchange of knowledge in scientific computing and computational science with a strong focus on methods, tools, algorithms, workflows, application challenges, and novel techniques in the context of scientific usage of high performance computing. The Conference is co-sponsored by the Swiss National Supercomputing Centre (CSCS - a unit of ETH Zurich) and the Association for Computing Machinery (ACM). The conference is managed by CSCS. The event will be hosted at Universität Bern & PHBern, in Bern, Switzerland. The guidelines for submissions are published at https://pasc26.pasc-conference.org/submission/guidelines-for-papers/. The technical program of PASC26 is organized around the following scientific domains: Chemistry and Materials; Climate, Weather, and Earth Sciences; Applied Social Sciences and Humanities; Engineering; Life Sciences; Physics; Computational Methods and Applied Mathematics. PASC26 solicits high-quality contributions of original research related to scientific computing in all of these domains. Proposals that emphasize the theme of PASC26 - Building Trust in Science through HPC Co-Design - are particularly welcome. Papers accepted for PASC26 will be presented as talks, and published in the Proceedings of the PASC Conference, accessible via the ACM Digital Library. A selection of the highest quality papers may be given the opportunity of a plenary presentation. The PASC26 Papers Program Committee is responsible for the paper evaluation process. The committee is chaired by Sally Ellingson (University of Kentucky, US) and (Xiaoye) Sherry Li (Lawrence Berkeley National Laboratory, US), and comprises domain co-chairs and committee members who are specialists in their scientific fields. Papers will be evaluated on their significance, technical soundness, originality, and quality of communication by reviewers listed as part of the PASC26 papers committee. Contributions must be submitted through the PASC Conference online submission portal (https://submissions.pasc-conference.org). SUBMISSION DEADLINES The deadline for submissions for Papers is December 12, 2025 at 11:59 pm anywhere on earth ("AoE" or "UTC-12"). * 12 December 2025: Deadline for full paper submissions (NO EXTENSIONS!)
* 04 February 2026: Review notifications * 20 March 2026: Deadline for paper revisions * 24 April 2026: Acceptance notifications -------------- next part -------------- An HTML attachment was scrubbed... URL: From herbert.owen at bsc.es Thu Oct 16 05:07:08 2025 From: herbert.owen at bsc.es (howen) Date: Thu, 16 Oct 2025 12:07:08 +0200 Subject: [petsc-users] Petsc + nvhpc Message-ID: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> Dear All, I am interfacing our CFD code (Fortran + OpenACC) to Petsc. Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. Best, Herbert Owen Senior Researcher, Dpt. Computer Applications in Science and Engineering Barcelona Supercomputing Center (BSC-CNS) Tel: +34 93 413 4038 Skype: herbert.owen https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!abuM7ozzUs7eISYBumHNxpvO2Tuy74KRM4-WWcunXHZVjQf1V032xQrCzTfC5vA_NM-35xMEZ9yJ8XK-3QFqjWBSWuUi$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Oct 16 11:30:43 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 16 Oct 2025 11:30:43 -0500 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> Message-ID: Hi, Herbert, I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. Also, could you provide us with a test and instructions to reproduce the problem? Thanks! --Junchao Zhang On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear All, > > I am interfacing our CFD code (Fortran + OpenACC) to Petsc. > Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc > compiler. The Gnu compiler does not work well and we do not have access to > the Cray compiler. > > I already know that the latest version of Petsc does not compile with > nvhpc, I am therefore using version 3.21. > I get good results on the CPU both in serial and parallel (MPI). However, > the GPU implementation, that is what we are interested in, only work > correctly for the serial version. In parallel, the results are different. > Even for a CG solve. > > I would like to know, if you have experience with the Nvidia compiler. I > am particularly interested if you have already observed issues with it. > Your opinion on whether to put further effort into trying to find a bug I > may have introduced during the interfacing is highly appreciated. > > Best, > > Herbert Owen > Senior Researcher, Dpt. 
Computer Applications in Science and Engineering > Barcelona Supercomputing Center (BSC-CNS) > Tel: +34 93 413 4038 > Skype: herbert.owen > > https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!aTIjTK-Fcr9yL0t3yY0n7IDAF_kAHNA6X3j7omFcZJel4Laq7RWEgItjDi9CSvwBIXaih6jOEqyr6gaPlp-TJZZrsWBw$ > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Oct 16 12:06:56 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 16 Oct 2025 13:06:56 -0400 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> Message-ID: <9055DAEB-E928-4834-8767-4621B49E98D0@petsc.dev> We recommend using PETSc 3.23.7 if you need to use the Nvidia Fortran compiler. Please try this version. If problems persist, could you send a reproducer as Junchao asked, and we will see if we can resolve the issue? Barry > On Oct 16, 2025, at 12:30?PM, Junchao Zhang wrote: > > Hi, Herbert, > I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. > Also, could you provide us with a test and instructions to reproduce the problem? > > Thanks! > --Junchao Zhang > > > On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users > wrote: >> Dear All, >> >> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. >> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. >> >> I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. >> I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. >> >> I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. >> >> Best, >> >> Herbert Owen >> Senior Researcher, Dpt. Computer Applications in Science and Engineering >> Barcelona Supercomputing Center (BSC-CNS) >> Tel: +34 93 413 4038 >> Skype: herbert.owen >> >> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!d2WQKdHf7ZKmQE1-LcWY1bMjyOeWcReCH2MhT18ms2AwaQcqn_NDoozxqOtBhu843jkLZI_l4XEAm8KXBAIb3AY$ >> >> >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Thu Oct 16 12:39:28 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 16 Oct 2025 12:39:28 -0500 (CDT) Subject: [petsc-users] Petsc + nvhpc In-Reply-To: <9055DAEB-E928-4834-8767-4621B49E98D0@petsc.dev> References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> <9055DAEB-E928-4834-8767-4621B49E98D0@petsc.dev> Message-ID: <962c210f-fd56-167f-ab6c-2937a2bccd08@fastmail.org> Actually petsc-3.22 (i.e. 3.22.5) is the last version that works with nvfortran Satish On Thu, 16 Oct 2025, Barry Smith wrote: > > We recommend using PETSc 3.23.7 if you need to use the Nvidia Fortran compiler. Please try this version. 
If problems persist, could you send a reproducer as Junchao asked, and we will see if we can resolve the issue? > > Barry > > > > On Oct 16, 2025, at 12:30?PM, Junchao Zhang wrote: > > > > Hi, Herbert, > > I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. > > Also, could you provide us with a test and instructions to reproduce the problem? > > > > Thanks! > > --Junchao Zhang > > > > > > On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users > wrote: > >> Dear All, > >> > >> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. > >> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. > >> > >> I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. > >> I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. > >> > >> I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. > >> > >> Best, > >> > >> Herbert Owen > >> Senior Researcher, Dpt. Computer Applications in Science and Engineering > >> Barcelona Supercomputing Center (BSC-CNS) > >> Tel: +34 93 413 4038 > >> Skype: herbert.owen > >> > >> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!d2WQKdHf7ZKmQE1-LcWY1bMjyOeWcReCH2MhT18ms2AwaQcqn_NDoozxqOtBhu843jkLZI_l4XEAm8KXBAIb3AY$ > >> > >> > >> > >> > >> > >> > >> > >> > > From bsmith at petsc.dev Thu Oct 16 18:17:17 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 16 Oct 2025 19:17:17 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <739e8580-286c-4aa6-accf-05b8e2ff8cbb@ipp.mpg.de> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> <98106a9a5fbb4129b9d0aa8721805b18@ipp.mpg.de> <5687B985-6FB4-4DE0-9151-AEE034171185@petsc.dev> <739e8580-286c-4aa6-accf-05b8e2ff8cbb@ipp.mpg.de> Message-ID: I'm glad to hear that we finally got to the bottom of the problem you reported. Here is a quick explanation. When PCMG constructs the KSP solver for each level of multigrid, it explicitly turns off the convergence test (meaning that it will also iterate the number of iterations you requested, irrespective of how much the residuals drop). We do this because this is conventionally done in multigrid (for example, always apply three iterations of smoothing). Passing in the options I sent you turns the convergence test BACK ON for each level. If you wish to use a convergence test on each level, you need to provide those options. 
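A minimal petsc4py sketch of one way to supply those options from code rather than the command line (the option names are the ones given elsewhere in this thread; everything else about the solver setup is assumed to stay as before):

from petsc4py import PETSc

opts = PETSc.Options()
opts["mg_levels_ksp_convergence_test"] = "default"   # re-enable the per-level convergence test that PCMG switches off
opts["mg_levels_ksp_norm_type"] = "unpreconditioned"
# these options are read when PCMG configures its level smoothers during
# preconditioner setup, after setFromOptions() on the outer KSP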
Is there a particular reason you want to use a convergence test on the smoothers instead of the conventional fixed number of iterations? Because of my confusion as to why the convergence criteria seemed to "ignored," I modified the KSP code so that in KSPView it now explicitly lists when no convergence test is being used (the previous output implied the convergence test was used if monitoring was turned on even if no convergence test was used). Thus, in the future, there should be less confusion since the KSPView output will now be clear on when convergence testing is not being used. Barry > On Oct 16, 2025, at 5:32?AM, Moral Sanchez, Elena wrote: > > Now the fine solver behaves as expected. This is what my callback function prints: > MG lvl 0 (s=884): CG Iter 0/15 | res = 7.95e-01/1.00e-01 | 0.3 s > MG lvl 0 (s=884): CG Iter 1/15 | res = 9.09e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 2/15 | res = 8.89e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 3/15 | res = 3.05e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 4/15 | res = 3.25e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 5/15 | res = 2.30e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 6/15 | res = 3.22e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 7/15 | res = 1.22e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 8/15 | res = 1.04e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 9/15 | res = 1.11e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 10/15 | res = 7.50e-02/1.00e-01 | 0.2 s > ConvergedReason MG lvl 0: 3 > MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-01/1.00e-01 | 2.3 s > MG lvl -1 (s=524): CG Iter 1/15 | res = 1.97e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 2/15 | res = 2.30e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 3/15 | res = 1.88e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 4/15 | res = 1.84e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 5/15 | res = 1.99e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 6/15 | res = 2.12e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 7/15 | res = 1.51e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 8/15 | res = 1.75e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 9/15 | res = 1.86e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 10/15 | res = 1.82e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 11/15 | res = 1.78e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 12/15 | res = 1.84e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 13/15 | res = 1.65e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 14/15 | res = 1.77e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 15/15 | res = 1.20e-01/1.00e-01 | 0.1 s > ConvergedReason MG lvl -1: -3 > MG lvl 0 (s=884): CG Iter 0/15 | res = 5.18e+00/1.00e-01 | 1.3 s > MG lvl 0 (s=884): CG Iter 1/15 | res = 2.07e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 2/15 | res = 1.43e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 3/15 | res = 9.19e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 4/15 | res = 6.64e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 5/15 | res = 6.15e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 6/15 | res = 2.98e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 7/15 | res = 3.38e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 8/15 | res = 2.21e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 9/15 | res = 1.57e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 10/15 | res = 1.37e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 11/15 | res = 1.14e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 12/15 | res = 8.83e-02/1.00e-01 | 0.2 s > ConvergedReason MG lvl -1: -3 > MG lvl 0 
(s=884): CG Iter 0/15 | res = 8.66e-02/1.00e-01 | 0.2 s > ConvergedReason MG lvl 0: 3 > MG lvl -1 (s=524): CG Iter 0/15 | res = 1.82e-01/1.00e-01 | 2.7 s > MG lvl -1 (s=524): CG Iter 1/15 | res = 3.60e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 2/15 | res = 4.27e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 3/15 | res = 3.57e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 4/15 | res = 4.22e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 5/15 | res = 4.43e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 6/15 | res = 3.81e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 7/15 | res = 2.96e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 9/15 | res = 3.23e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 10/15 | res = 2.80e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 11/15 | res = 4.66e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 12/15 | res = 3.31e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 13/15 | res = 2.83e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 14/15 | res = 3.29e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 15/15 | res = 2.13e-01/1.00e-01 | 0.1 s > ConvergedReason MG lvl -1: -3 > MG lvl 0 (s=884): CG Iter 0/15 | res = 8.91e+00/1.00e-01 | 1.7 s > MG lvl 0 (s=884): CG Iter 1/15 | res = 3.61e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 2/15 | res = 3.19e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 3/15 | res = 1.74e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 4/15 | res = 9.98e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 5/15 | res = 9.44e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 6/15 | res = 7.69e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 7/15 | res = 4.48e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 8/15 | res = 4.97e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 9/15 | res = 3.94e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 10/15 | res = 2.40e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 11/15 | res = 2.67e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 12/15 | res = 2.04e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 13/15 | res = 1.67e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 14/15 | res = 1.78e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 15/15 | res = 1.43e-01/1.00e-01 | 0.2 s > ConvergedReason MG lvl -1: -3 > MG lvl 0 (s=884): CG Iter 0/15 | res = 7.26e-02/1.00e-01 | 0.2 s > ConvergedReason MG lvl 0: 3 > MG lvl -1 (s=524): CG Iter 0/15 | res = 1.54e-01/1.00e-01 | 3.5 s > MG lvl -1 (s=524): CG Iter 1/15 | res = 2.91e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 2/15 | res = 3.09e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 3/15 | res = 2.81e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 4/15 | res = 2.43e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 5/15 | res = 2.15e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 6/15 | res = 2.02e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 7/15 | res = 1.50e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 8/15 | res = 1.68e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 9/15 | res = 2.02e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 10/15 | res = 1.60e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 11/15 | res = 2.73e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 12/15 | res = 1.93e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 13/15 | res = 1.35e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 14/15 | res = 2.39e-01/1.00e-01 | 0.1 s > MG lvl -1 (s=524): CG Iter 15/15 | res = 1.64e-01/1.00e-01 | 0.1 s > ConvergedReason MG lvl -1: -3 > MG lvl 
0 (s=884): CG Iter 0/15 | res = 3.86e+00/1.00e-01 | 1.4 s > MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 2/15 | res = 1.71e+00/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 3/15 | res = 9.45e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 4/15 | res = 6.56e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 5/15 | res = 5.91e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 6/15 | res = 3.64e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 7/15 | res = 3.24e-01/1.00e-01 | 0.3 s > MG lvl 0 (s=884): CG Iter 8/15 | res = 3.45e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 9/15 | res = 2.20e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 10/15 | res = 1.51e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 11/15 | res = 1.56e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 12/15 | res = 1.17e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 13/15 | res = 1.01e-01/1.00e-01 | 0.2 s > MG lvl 0 (s=884): CG Iter 14/15 | res = 8.68e-02/1.00e-01 | 0.2 s > ConvergedReason MG lvl -1: -3 > MG lvl 0 (s=884): CG Iter 0/15 | res = 5.32e-02/1.00e-01 | 0.2 s > ConvergedReason MG lvl 0: 3 > MG lvl -1 (s=524): CG Iter 0/15 | res = 9.22e-02/1.00e-01 | 3.2 s > ConvergedReason MG lvl -1: 3 > MG lvl 0 (s=884): CG Iter 0/15 | res = 5.32e-02/1.00e-01 | 0.2 s > However, when I run the file without the flags > -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned > then it behaves as before. > I am setting the unpreconditioned norm for the smoothers. The corresponding residual norms in the callback appear to be the same. So it seems like the residual norm is computed correctly but the convergence criterion is different. > > Elena > > On 10/14/25 19:19, Barry Smith wrote: >> -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Oct 16 18:21:38 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 16 Oct 2025 19:21:38 -0400 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: <962c210f-fd56-167f-ab6c-2937a2bccd08@fastmail.org> References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> <9055DAEB-E928-4834-8767-4621B49E98D0@petsc.dev> <962c210f-fd56-167f-ab6c-2937a2bccd08@fastmail.org> Message-ID: <586885F1-AF68-4AAE-9226-295532B365C2@petsc.dev> > On Oct 16, 2025, at 1:39?PM, Satish Balay wrote: > > Actually petsc-3.22 (i.e. 3.22.5) is the last version that works with nvfortran Oh, yikes. Sorry for providing the wrong information, definitely work with 3.22.5 to use nvfortran. Barry > > Satish > > On Thu, 16 Oct 2025, Barry Smith wrote: > >> >> We recommend using PETSc 3.23.7 if you need to use the Nvidia Fortran compiler. Please try this version. If problems persist, could you send a reproducer as Junchao asked, and we will see if we can resolve the issue? >> >> Barry >> >> >>> On Oct 16, 2025, at 12:30?PM, Junchao Zhang wrote: >>> >>> Hi, Herbert, >>> I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. >>> Also, could you provide us with a test and instructions to reproduce the problem? >>> >>> Thanks! >>> --Junchao Zhang >>> >>> >>> On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users > wrote: >>>> Dear All, >>>> >>>> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. 
>>>> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. >>>> >>>> I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. >>>> I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. >>>> >>>> I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. >>>> >>>> Best, >>>> >>>> Herbert Owen >>>> Senior Researcher, Dpt. Computer Applications in Science and Engineering >>>> Barcelona Supercomputing Center (BSC-CNS) >>>> Tel: +34 93 413 4038 >>>> Skype: herbert.owen >>>> >>>> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!d2WQKdHf7ZKmQE1-LcWY1bMjyOeWcReCH2MhT18ms2AwaQcqn_NDoozxqOtBhu843jkLZI_l4XEAm8KXBAIb3AY$ >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >> From C.Klaij at marin.nl Fri Oct 17 03:37:21 2025 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 17 Oct 2025 08:37:21 +0000 Subject: [petsc-users] interpreting petsc streams result Message-ID: Attached is a petsc streams result kindly provided by a hardware vendor for a single compute node, dual socket, with two AMD epyc 9355 processors. Each processor has 32 cores, 12 DDR5 memory channels and mem BW around 600 GB/s. * It is not immediately clear which line corresponds to which y-axis. Could future versions of petsc please color the axis label with the matching line color? * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = 900 GB/s and not closer to 1200 GB/s? * The speed-up seems to be 12 out of 64, provided multiples of 8 cores are used. As expected given 12 memory channels? * Does the zig-zag pattern indicate a pinning problem, or is it unavoidable given the 8 core building block of these type of processors? Chris dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsmmm_T4I$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image918583.png Type: image/png Size: 5004 bytes Desc: image918583.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image489453.png Type: image/png Size: 487 bytes Desc: image489453.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image542102.png Type: image/png Size: 504 bytes Desc: image542102.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image611951.png Type: image/png Size: 482 bytes Desc: image611951.png URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PETSc_streams_9355_scaling.png Type: image/png Size: 55732 bytes Desc: PETSc_streams_9355_scaling.png URL: From junchao.zhang at gmail.com Fri Oct 17 10:01:56 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 17 Oct 2025 10:01:56 -0500 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: Hi, Chris, I did have an MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!fWNsAOkuZRiMn1TuiZ0HasNdskk5heIHlt3O4unVNFd3mnPlFFPISeieHQ_DFsrasG1dwtpASUuFiR6eUOugJNvoDVDy$ to improve mpistream. I should rework it after Barry's !6903. See my inlined comments to your questions On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users < petsc-users at mcs.anl.gov> wrote: > Attached is a petsc streams result kindly provided by a hardware > vendor for a single compute node, dual socket, with two AMD epyc > 9355 processors. Each processor has 32 cores, 12 DDR5 memory > channels and mem BW around 600 GB/s. > > * It is not immediately clear which line corresponds to which > y-axis. Could future versions of petsc please color the axis > label with the matching line color? > definitely > > > * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = > 900 GB/s and not closer to 1200 GB/s? > I recall it is actually not simple to get the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags and streaming stores etc. > > > * The speed-up seems to be 12 out of 64, provided multiples of 8 > cores are used. As expected given 12 memory channels? > Maybe not, otherwise the speedup should be 24 as you have 24 channels. > > * Does the zig-zag pattern indicate a pinning problem, or is it > unavoidable given the 8 core building block of these type of > processors? > I checked and found "make mpistream" uses --map-by core. I think we should use --map-by socket or --map-by l3cache. > > Chris > dr. ir. Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 <+31%20317%2049%2033%2044> | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!fWNsAOkuZRiMn1TuiZ0HasNdskk5heIHlt3O4unVNFd3mnPlFFPISeieHQ_DFsrasG1dwtpASUuFiR6eUOugJD65Q0mk$ > > [image: Facebook] > > [image: LinkedIn] > > [image: YouTube] > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image918583.png Type: image/png Size: 5004 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image489453.png Type: image/png Size: 487 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image542102.png Type: image/png Size: 504 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image611951.png Type: image/png Size: 482 bytes Desc: not available URL: From bsmith at petsc.dev Fri Oct 17 16:27:19 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Oct 2025 17:27:19 -0400 Subject: Re: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr> <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> Message-ID: <561C2914-04D1-48D0-8BC4-E5F40FEB1C05@petsc.dev> I have updated the MR with what I think is now correct code for computing the diagonal on the GPU, could you please try it again and let me know if it works and how much time it saves (I think it should be significant). Thanks for your patience, Barry > On Oct 2, 2025, at 1:16 AM, LEDAC Pierre wrote: > > Yes, probably the reason I also saw a crash in my test case after a quick fix of the integer conversion. > > Pierre LEDAC > Commissariat à l'énergie atomique et aux énergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > Bâtiment 451 - point courrier n°41 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > De : Barry Smith > > Envoyé : jeudi 2 octobre 2025 02:16:40 > À : LEDAC Pierre > Cc : Junchao Zhang; petsc-users at mcs.anl.gov > Objet : Re: [petsc-users] [GPU] Jacobi preconditioner > > Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. > > Barry > > >> On Oct 1, 2025, at 3:47 PM, LEDAC Pierre > wrote: >> >> Sorry the correct error is: >> >> /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" >> GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); >> >> >> Pierre LEDAC >> Commissariat à l'énergie atomique et aux énergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> Bâtiment 451 - point courrier n°41 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> De : LEDAC Pierre >> Envoyé : mercredi 1 octobre 2025 21:46:00 >> À : Barry Smith >> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >> Objet : RE: [petsc-users] [GPU] Jacobi preconditioner >> >> Hi all, >> >> Thanks for the MR, there is a build issue because we use --with-64-bit-indices: >> >> /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" >> GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); >> >> Thanks, >> >> Pierre LEDAC >> Commissariat à l'énergie atomique et aux énergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> Bâtiment 451 - point courrier n°41 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> De : Barry Smith > >> Envoyé : mercredi 1 octobre 2025 18:48:37 >> À
: LEDAC Pierre >> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >> >> >> I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!dKYXeFfmzkN4Tc2oZZQPw_GL8AwkmzFXhYH0TzM4Dtrf72oWmFpj8bmmF-L_Fl784As9NbsIqk1pM9g-xSi5XZo$ >> >> Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. >> >> Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. >> >> Barry >> >>> On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: >>> >>> Thanks Barry, I agree but didn't dare asking for that. >>> >>> Pierre LEDAC >>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>> Centre de SACLAY >>> DES/ISAS/DM2S/SGLS/LCAN >>> B?timent 451 ? point courrier n?41 >>> F-91191 Gif-sur-Yvette >>> +33 1 69 08 04 03 >>> +33 6 83 42 05 79 >>> >>> De : Barry Smith > >>> Envoy? : mercredi 30 juillet 2025 20:34:26 >>> ? : Junchao Zhang >>> Cc : LEDAC Pierre; petsc-users at mcs.anl.gov >>> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >>> >>> >>> We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. >>> >>> I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!dKYXeFfmzkN4Tc2oZZQPw_GL8AwkmzFXhYH0TzM4Dtrf72oWmFpj8bmmF-L_Fl784As9NbsIqk1pM9g-gN1IJTI$ >>> >>> Barry >>> >>> >>> >>> >>>> On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: >>>> >>>> Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. >>>> If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: >>>>> Hello all, >>>>> >>>>> We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). >>>>> >>>>> The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. >>>>> >>>>> It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). >>>>> >>>>> Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? >>>>> NB: Gmres is running well on device. >>>>> >>>>> I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. >>>>> >>>>> Thanks, >>>>> >>>>> >>>>> >>>>> >>>>> Pierre LEDAC >>>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>>> Centre de SACLAY >>>>> DES/ISAS/DM2S/SGLS/LCAN >>>>> B?timent 451 ? point courrier n?41 >>>>> F-91191 Gif-sur-Yvette >>>>> +33 1 69 08 04 03 >>>>> +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Elena.Moral.Sanchez at ipp.mpg.de Mon Oct 20 04:19:23 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Mon, 20 Oct 2025 09:19:23 +0000 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> <98106a9a5fbb4129b9d0aa8721805b18@ipp.mpg.de> <5687B985-6FB4-4DE0-9151-AEE034171185@petsc.dev> <739e8580-286c-4aa6-accf-05b8e2ff8cbb@ipp.mpg.de>, Message-ID: <19ef9d3a9b5947c3bf1fcfa2af67f17c@ipp.mpg.de> Dear Barry, thank you for the clear explanation and the fix. Indeed, the fact that the convergence test is turned off for the KSP smoothers should be indicated somewhere. The reason why I wanted to run the convergence test on the smoothers was because I wanted to study the effect of not inverting the smoother matrix completely, so I could choose an optimal maximum number of iterations. Now that I understand how it works, I will use the preconditioner accordingly. Thanks for the help. Best regards, Elena ________________________________ From: Barry Smith Sent: 17 October 2025 01:17:17 To: Moral Sanchez, Elena Cc: PETSc Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level I'm glad to hear that we finally got to the bottom of the problem you reported. Here is a quick explanation. When PCMG constructs the KSP solver for each level of multigrid, it explicitly turns off the convergence test (meaning that it will also iterate the number of iterations you requested, irrespective of how much the residuals drop). We do this because this is conventionally done in multigrid (for example, always apply three iterations of smoothing). Passing in the options I sent you turns the convergence test BACK ON for each level. If you wish to use a convergence test on each level, you need to provide those options. Is there a particular reason you want to use a convergence test on the smoothers instead of the conventional fixed number of iterations? Because of my confusion as to why the convergence criteria seemed to "ignored," I modified the KSP code so that in KSPView it now explicitly lists when no convergence test is being used (the previous output implied the convergence test was used if monitoring was turned on even if no convergence test was used). Thus, in the future, there should be less confusion since the KSPView output will now be clear on when convergence testing is not being used. Barry On Oct 16, 2025, at 5:32?AM, Moral Sanchez, Elena wrote: Now the fine solver behaves as expected. 
This is what my callback function prints: MG lvl 0 (s=884): CG Iter 0/15 | res = 7.95e-01/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 9.09e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 8.89e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 3.05e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 3.25e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 2.30e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 3.22e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 1.22e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 1.04e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.11e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 7.50e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 3 MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-01/1.00e-01 | 2.3 s MG lvl -1 (s=524): CG Iter 1/15 | res = 1.97e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 2/15 | res = 2.30e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 3/15 | res = 1.88e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 4/15 | res = 1.84e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 5/15 | res = 1.99e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 6/15 | res = 2.12e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 7/15 | res = 1.51e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 8/15 | res = 1.75e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 9/15 | res = 1.86e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 10/15 | res = 1.82e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 11/15 | res = 1.78e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 12/15 | res = 1.84e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 13/15 | res = 1.65e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 14/15 | res = 1.77e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 15/15 | res = 1.20e-01/1.00e-01 | 0.1 s ConvergedReason MG lvl -1: -3 MG lvl 0 (s=884): CG Iter 0/15 | res = 5.18e+00/1.00e-01 | 1.3 s MG lvl 0 (s=884): CG Iter 1/15 | res = 2.07e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.43e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 9.19e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 6.64e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 6.15e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 2.98e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.38e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 2.21e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 1.57e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 1.37e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 11/15 | res = 1.14e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 12/15 | res = 8.83e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl -1: -3 MG lvl 0 (s=884): CG Iter 0/15 | res = 8.66e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 3 MG lvl -1 (s=524): CG Iter 0/15 | res = 1.82e-01/1.00e-01 | 2.7 s MG lvl -1 (s=524): CG Iter 1/15 | res = 3.60e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 2/15 | res = 4.27e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 3/15 | res = 3.57e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 4/15 | res = 4.22e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 5/15 | res = 4.43e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 6/15 | res = 3.81e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 7/15 | res = 2.96e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 9/15 | res = 3.23e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 10/15 | res = 
2.80e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 11/15 | res = 4.66e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 12/15 | res = 3.31e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 13/15 | res = 2.83e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 14/15 | res = 3.29e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 15/15 | res = 2.13e-01/1.00e-01 | 0.1 s ConvergedReason MG lvl -1: -3 MG lvl 0 (s=884): CG Iter 0/15 | res = 8.91e+00/1.00e-01 | 1.7 s MG lvl 0 (s=884): CG Iter 1/15 | res = 3.61e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 3.19e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 1.74e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 9.98e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 9.44e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 7.69e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 4.48e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 8/15 | res = 4.97e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 3.94e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 2.40e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 11/15 | res = 2.67e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 12/15 | res = 2.04e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 13/15 | res = 1.67e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 1.78e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 15/15 | res = 1.43e-01/1.00e-01 | 0.2 s ConvergedReason MG lvl -1: -3 MG lvl 0 (s=884): CG Iter 0/15 | res = 7.26e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 3 MG lvl -1 (s=524): CG Iter 0/15 | res = 1.54e-01/1.00e-01 | 3.5 s MG lvl -1 (s=524): CG Iter 1/15 | res = 2.91e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 2/15 | res = 3.09e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 3/15 | res = 2.81e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 4/15 | res = 2.43e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 5/15 | res = 2.15e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 6/15 | res = 2.02e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 7/15 | res = 1.50e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 8/15 | res = 1.68e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 9/15 | res = 2.02e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 10/15 | res = 1.60e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 11/15 | res = 2.73e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 12/15 | res = 1.93e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 13/15 | res = 1.35e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 14/15 | res = 2.39e-01/1.00e-01 | 0.1 s MG lvl -1 (s=524): CG Iter 15/15 | res = 1.64e-01/1.00e-01 | 0.1 s ConvergedReason MG lvl -1: -3 MG lvl 0 (s=884): CG Iter 0/15 | res = 3.86e+00/1.00e-01 | 1.4 s MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 2/15 | res = 1.71e+00/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 3/15 | res = 9.45e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 4/15 | res = 6.56e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 5/15 | res = 5.91e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 6/15 | res = 3.64e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 7/15 | res = 3.24e-01/1.00e-01 | 0.3 s MG lvl 0 (s=884): CG Iter 8/15 | res = 3.45e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 9/15 | res = 2.20e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 10/15 | res = 1.51e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 11/15 | res = 1.56e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 12/15 | res = 1.17e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 13/15 | res = 
1.01e-01/1.00e-01 | 0.2 s MG lvl 0 (s=884): CG Iter 14/15 | res = 8.68e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl -1: -3 MG lvl 0 (s=884): CG Iter 0/15 | res = 5.32e-02/1.00e-01 | 0.2 s ConvergedReason MG lvl 0: 3 MG lvl -1 (s=524): CG Iter 0/15 | res = 9.22e-02/1.00e-01 | 3.2 s ConvergedReason MG lvl -1: 3 MG lvl 0 (s=884): CG Iter 0/15 | res = 5.32e-02/1.00e-01 | 0.2 s However, when I run the file without the flags -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned then it behaves as before. I am setting the unpreconditioned norm for the smoothers. The corresponding residual norms in the callback appear to be the same. So it seems like the residual norm is computed correctly but the convergence criterion is different. Elena On 10/14/25 19:19, Barry Smith wrote: -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned -------------- next part -------------- An HTML attachment was scrubbed... URL:
From bsmith at petsc.dev Mon Oct 20 08:48:57 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 20 Oct 2025 09:48:57 -0400 Subject: [petsc-users] setting correct tolerances for MG smoother CG at the finest level In-Reply-To: <19ef9d3a9b5947c3bf1fcfa2af67f17c@ipp.mpg.de> References: <421fd9ac0ed0437f88e921d063a6f45f@ipp.mpg.de> <2622f5910bef400f983345df49977fa8@ipp.mpg.de> <67889c32cacf4cf3ac7e7b643297863b@ipp.mpg.de> <608352C7-1016-4E35-A099-33D81BC24739@petsc.dev> <7a2e2fbfa156446dbf2ff01ea0585bf2@ipp.mpg.de> <074e8f6971ae4abb89aeaeba57a76542@ipp.mpg.de> <3ABEF5FA-79BC-4DB3-B61D-04CE7A352D70@petsc.dev> <98106a9a5fbb4129b9d0aa8721805b18@ipp.mpg.de> <5687B985-6FB4-4DE0-9151-AEE034171185@petsc.dev> <739e8580-286c-4aa6-accf-05b8e2ff8cbb@ipp.mpg.de> <19ef9d3a9b5947c3bf1fcfa2af67f17c@ipp.mpg.de> Message-ID: <6F446B65-6BA7-43EF-9223-C547865CD2BB@petsc.dev> No problem, what you requested was reasonable to expect. Usually multigrid users make some test runs with different number of smoothing steps and then select the particular number of smoothing steps that minimizes the time. Barry > On Oct 20, 2025, at 5:19?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > thank you for the clear explanation and the fix. Indeed, the fact that the convergence test is turned off for the KSP smoothers should be indicated somewhere.
> > The reason why I wanted to run the convergence test on the smoothers was because I wanted to study the effect of not inverting the smoother matrix completely, so I could choose an optimal maximum number of iterations. > Now that I understand how it works, I will use the preconditioner accordingly. > > Thanks for the help. > Best regards, > Elena > > From: Barry Smith > > Sent: 17 October 2025 01:17:17 > To: Moral Sanchez, Elena > Cc: PETSc > Subject: Re: [petsc-users] setting correct tolerances for MG smoother CG at the finest level > > > I'm glad to hear that we finally got to the bottom of the problem you reported. Here is a quick explanation. > > When PCMG constructs the KSP solver for each level of multigrid, it explicitly turns off the convergence test (meaning that it will also iterate the number of iterations you requested, irrespective of how much the residuals drop). We do this because this is conventionally done in multigrid (for example, always apply three iterations of smoothing). Passing in the options I sent you turns the convergence test BACK ON for each level. If you wish to use a convergence test on each level, you need to provide those options. Is there a particular reason you want to use a convergence test on the smoothers instead of the conventional fixed number of iterations? > > Because of my confusion as to why the convergence criteria seemed to "ignored," I modified the KSP code so that in KSPView it now explicitly lists when no convergence test is being used (the previous output implied the convergence test was used if monitoring was turned on even if no convergence test was used). Thus, in the future, there should be less confusion since the KSPView output will now be clear on when convergence testing is not being used. > > Barry > > > >> On Oct 16, 2025, at 5:32?AM, Moral Sanchez, Elena > wrote: >> >> Now the fine solver behaves as expected. 
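For reference, a minimal sketch (not from this thread) of baking the same two options into a code before KSPSetFromOptions() is called; only the option names come from the discussion above, the surrounding boilerplate is illustrative:

```c
#include <petscksp.h>

/* Minimal sketch, not from the thread: re-enable the per-level convergence
   test that PCMG turns off by default.  The two "-mg_levels_ksp_*" option
   names are the ones discussed above; everything else is illustrative. */
int main(int argc, char **argv)
{
  KSP ksp;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Equivalent to passing the flags on the command line */
  PetscCall(PetscOptionsSetValue(NULL, "-pc_type", "mg"));
  PetscCall(PetscOptionsSetValue(NULL, "-mg_levels_ksp_convergence_test", "default"));
  PetscCall(PetscOptionsSetValue(NULL, "-mg_levels_ksp_norm_type", "unpreconditioned"));
  PetscCall(PetscOptionsSetValue(NULL, "-mg_levels_ksp_max_it", "15"));

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  /* ... KSPSetOperators(ksp, A, A) with the assembled matrix ... */
  PetscCall(KSPSetFromOptions(ksp)); /* picks up the options set above */
  /* ... KSPSolve(ksp, b, x) ... */
  PetscCall(KSPDestroy(&ksp));
  PetscCall(PetscFinalize());
  return 0;
}
```

Passing the flags on the command line, as described above, has exactly the same effect.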
This is what my callback function prints: >> MG lvl 0 (s=884): CG Iter 0/15 | res = 7.95e-01/1.00e-01 | 0.3 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 9.09e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 8.89e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 3.05e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 3.25e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 2.30e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 3.22e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 1.22e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 1.04e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.11e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 7.50e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl 0: 3 >> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.38e-01/1.00e-01 | 2.3 s >> MG lvl -1 (s=524): CG Iter 1/15 | res = 1.97e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 2/15 | res = 2.30e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 3/15 | res = 1.88e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 4/15 | res = 1.84e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 5/15 | res = 1.99e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 6/15 | res = 2.12e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 7/15 | res = 1.51e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 8/15 | res = 1.75e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 9/15 | res = 1.86e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 10/15 | res = 1.82e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 11/15 | res = 1.78e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 12/15 | res = 1.84e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 13/15 | res = 1.65e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 14/15 | res = 1.77e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 15/15 | res = 1.20e-01/1.00e-01 | 0.1 s >> ConvergedReason MG lvl -1: -3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.18e+00/1.00e-01 | 1.3 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 2.07e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.43e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 9.19e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 6.64e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 6.15e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 2.98e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.38e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 2.21e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 1.57e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 1.37e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 1.14e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 8.83e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl -1: -3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 8.66e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl 0: 3 >> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.82e-01/1.00e-01 | 2.7 s >> MG lvl -1 (s=524): CG Iter 1/15 | res = 3.60e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 2/15 | res = 4.27e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 3/15 | res = 3.57e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 4/15 | res = 4.22e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 5/15 | res = 4.43e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 6/15 | res = 3.81e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 7/15 | res = 2.96e-01/1.00e-01 | 0.1 s >> MG lvl -1 
(s=524): CG Iter 8/15 | res = 2.78e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 9/15 | res = 3.23e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 10/15 | res = 2.80e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 11/15 | res = 4.66e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 12/15 | res = 3.31e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 13/15 | res = 2.83e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 14/15 | res = 3.29e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 15/15 | res = 2.13e-01/1.00e-01 | 0.1 s >> ConvergedReason MG lvl -1: -3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 8.91e+00/1.00e-01 | 1.7 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 3.61e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 3.19e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 1.74e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 9.98e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 9.44e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 7.69e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 4.48e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 4.97e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 3.94e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 2.40e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 2.67e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 2.04e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 13/15 | res = 1.67e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 14/15 | res = 1.78e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 15/15 | res = 1.43e-01/1.00e-01 | 0.2 s >> ConvergedReason MG lvl -1: -3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 7.26e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl 0: 3 >> MG lvl -1 (s=524): CG Iter 0/15 | res = 1.54e-01/1.00e-01 | 3.5 s >> MG lvl -1 (s=524): CG Iter 1/15 | res = 2.91e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 2/15 | res = 3.09e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 3/15 | res = 2.81e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 4/15 | res = 2.43e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 5/15 | res = 2.15e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 6/15 | res = 2.02e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 7/15 | res = 1.50e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 8/15 | res = 1.68e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 9/15 | res = 2.02e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 10/15 | res = 1.60e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 11/15 | res = 2.73e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 12/15 | res = 1.93e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 13/15 | res = 1.35e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 14/15 | res = 2.39e-01/1.00e-01 | 0.1 s >> MG lvl -1 (s=524): CG Iter 15/15 | res = 1.64e-01/1.00e-01 | 0.1 s >> ConvergedReason MG lvl -1: -3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 3.86e+00/1.00e-01 | 1.4 s >> MG lvl 0 (s=884): CG Iter 1/15 | res = 1.76e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 2/15 | res = 1.71e+00/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 3/15 | res = 9.45e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 4/15 | res = 6.56e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 5/15 | res = 5.91e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 6/15 | res = 3.64e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 7/15 | res = 3.24e-01/1.00e-01 | 0.3 s >> MG lvl 0 (s=884): CG Iter 8/15 | res = 3.45e-01/1.00e-01 | 
0.2 s >> MG lvl 0 (s=884): CG Iter 9/15 | res = 2.20e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 10/15 | res = 1.51e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 11/15 | res = 1.56e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 12/15 | res = 1.17e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 13/15 | res = 1.01e-01/1.00e-01 | 0.2 s >> MG lvl 0 (s=884): CG Iter 14/15 | res = 8.68e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl -1: -3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.32e-02/1.00e-01 | 0.2 s >> ConvergedReason MG lvl 0: 3 >> MG lvl -1 (s=524): CG Iter 0/15 | res = 9.22e-02/1.00e-01 | 3.2 s >> ConvergedReason MG lvl -1: 3 >> MG lvl 0 (s=884): CG Iter 0/15 | res = 5.32e-02/1.00e-01 | 0.2 s >> However, when I run the file without the flags >> -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned >> then it behaves as before. >> I am setting the unpreconditioned norm for the smoothers. The corresponding residual norms in the callback appear to be the same. So it seems like the residual norm is computed correctly but the convergence criterion is different. >> >> Elena >> >> On 10/14/25 19:19, Barry Smith wrote: >>> -mg_levels_ksp_convergence_test default -mg_levels_ksp_norm_type unpreconditioned -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Mon Oct 20 09:45:29 2025 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 20 Oct 2025 14:45:29 +0000 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: Hi Junchao, Thanks for you answer. Regarding the speed-up what would you expect if not 24 out of 64, and why? Chris _____ dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!dkRjTO35gYlTQIMwhRteR45CyztokJhS-tZqqfhmLbel4doCt4smq3sAWssIeXAtdh9w2ffm5zooSLNT-LxhuP4$ ___________________________________ From: Junchao Zhang Sent: Friday, October 17, 2025 5:01 PM To: Klaij, Christiaan Cc: PETSc users list Subject: Re: [petsc-users] interpreting petsc streams result Hi, Chris, I did have an MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!dkRjTO35gYlTQIMwhRteR45CyztokJhS-tZqqfhmLbel4doCt4smq3sAWssIeXAtdh9w2ffm5zooSLNTuKitErU$ to improve mpistream. I should rework it after Barry's !6903. See my inlined comments to your questions On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users > wrote: Attached is a petsc streams result kindly provided by a hardware vendor for a single compute node, dual socket, with two AMD epyc 9355 processors. Each processor has 32 cores, 12 DDR5 memory channels and mem BW around 600 GB/s. * It is not immediately clear which line corresponds to which y-axis. Could future versions of petsc please color the axis label with the matching line color? definitely * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = 900 GB/s and not closer to 1200 GB/s? I recall it is actually not simple to get the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags and streaming stores etc. * The speed-up seems to be 12 out of 64, provided multiples of 8 cores are used. As expected given 12 memory channels? Maybe not, otherwise the speedup should be 24 as you have 24 channels. * Does the zig-zag pattern indicate a pinning problem, or is it unavoidable given the 8 core building block of these type of processors? 
I checked and found "make mpistream" uses --map-by core. I think we should use --map-by socket or --map-by l3cache. Chris [cid:ii_199f2a38566119b24a61] dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!dkRjTO35gYlTQIMwhRteR45CyztokJhS-tZqqfhmLbel4doCt4smq3sAWssIeXAtdh9w2ffm5zooSLNT-LxhuP4$ [Facebook] [LinkedIn] [YouTube] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image014564.png Type: image/png Size: 5004 bytes Desc: image014564.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image076541.png Type: image/png Size: 487 bytes Desc: image076541.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image600589.png Type: image/png Size: 504 bytes Desc: image600589.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image758659.png Type: image/png Size: 482 bytes Desc: image758659.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image918583.png Type: image/png Size: 5004 bytes Desc: image918583.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image489453.png Type: image/png Size: 487 bytes Desc: image489453.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image542102.png Type: image/png Size: 504 bytes Desc: image542102.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image611951.png Type: image/png Size: 482 bytes Desc: image611951.png URL: From junchao.zhang at gmail.com Mon Oct 20 11:36:11 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 20 Oct 2025 11:36:11 -0500 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: Hi, Chris, Since we compute the speed up off the bandwidth achieved by a single MPI process, and a process can drive all memory channels, the maximum speed up can only come from experiments (vs. not by # of memory channels). --Junchao Zhang On Mon, Oct 20, 2025 at 9:45?AM Klaij, Christiaan wrote: > Hi Junchao, > > Thanks for you answer. Regarding the speed-up what would you expect if not > 24 out of 64, and why? > > Chris > > ________________________________________ > ???? > dr. ir. Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 <+31%20317%2049%2033%2044> | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERUomu-nz$ > [image: Facebook] > [image: LinkedIn] > [image: YouTube] > > From: Junchao Zhang > Sent: Friday, October 17, 2025 5:01 PM > To: Klaij, Christiaan > Cc: PETSc users list > Subject: Re: [petsc-users] interpreting petsc streams result > > Hi, Chris, > I did have an MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERYIUu7jT$ to > improve mpistream. I should rework it after Barry's !6903. 
See my inlined > comments to your questions > > On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > Attached is a petsc streams result kindly provided by a hardware > vendor for a single compute node, dual socket, with two AMD epyc > 9355 processors. Each processor has 32 cores, 12 DDR5 memory > channels and mem BW around 600 GB/s. > > * It is not immediately clear which line corresponds to which > y-axis. Could future versions of petsc please color the axis > label with the matching line color? > definitely > > > * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = > 900 GB/s and not closer to 1200 GB/s? > I recall it is actually not simple to get the theoretical max bandwidth. > One has to use special SIMD instructions, compiler flags and streaming > stores etc. > > > * The speed-up seems to be 12 out of 64, provided multiples of 8 > cores are used. As expected given 12 memory channels? > Maybe not, otherwise the speedup should be 24 as you have 24 channels. > > > * Does the zig-zag pattern indicate a pinning problem, or is it > unavoidable given the 8 core building block of these type of > processors? > I checked and found "make mpistream" uses --map-by core. I think we should > use --map-by socket or --map-by l3cache. > > > Chris > [cid:ii_199f2a38566119b24a61] > dr. ir. Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERUomu-nz$ < > https://urldefense.us/v3/__https://www.marin.nl/__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsBfp_z4A$ > > > [Facebook]< > https://urldefense.us/v3/__https://www.facebook.com/marin.wageningen__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsCH7BGfA$ > > > [LinkedIn]< > https://urldefense.us/v3/__https://www.linkedin.com/company/marin__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsDAV2fAI$ > > > [YouTube]< > https://urldefense.us/v3/__https://www.youtube.com/marinmultimedia__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsEyu_yEs$ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image014564.png Type: image/png Size: 5004 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image076541.png Type: image/png Size: 487 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image600589.png Type: image/png Size: 504 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image758659.png Type: image/png Size: 482 bytes Desc: not available URL: From Pierre.LEDAC at cea.fr Tue Oct 21 04:55:43 2025 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Tue, 21 Oct 2025 09:55:43 +0000 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: <561C2914-04D1-48D0-8BC4-E5F40FEB1C05@petsc.dev> References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr> <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> , <561C2914-04D1-48D0-8BC4-E5F40FEB1C05@petsc.dev> Message-ID: <2591b65a8cce45969fd80150db47c01f@cea.fr> Hello, Thanks for the work ! It is ok now, i check with Nsight system, the diagonal is indeed computed on the device. How much time it saves ? I guess it depends of the number of iterations for Gmres, the lower, the more it is significant. In my case, with 5 158 400 rows for the matrix, 45 iterations of GMRES, time to solve decrease 1.160s from to 0.671s on a RTX A6000. So thanks again, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith Envoy? : vendredi 17 octobre 2025 23:27:19 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have updated the MR with what I think is now correct code for computing the diagonal on the GPU, could you please try it again and let me know if it works and how much time it saves (I think it is should be significant). Thankts for your patients, Barry On Oct 2, 2025, at 1:16?AM, LEDAC Pierre wrote: Yes, probably the reason I saw also a crash in my test case after a quick fix of the integer conversion. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : jeudi 2 octobre 2025 02:16:40 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. Barry On Oct 1, 2025, at 3:47?PM, LEDAC Pierre > wrote: Sorry the correct error is: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : LEDAC Pierre Envoy? : mercredi 1 octobre 2025 21:46:00 ? 
: Barry Smith Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : RE: [petsc-users] [GPU] Jacobi preconditioner Hi all, Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 1 octobre 2025 18:48:37 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!Ye5L6Vm4r3ziHELznfrHwXU-3hoIoIvb2jQJZIvCmrn6qFOZy3QWPIK8p_liz8QA0VhQ8dFa4wTYbtfX8bdq9vmoQYUK$ Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. Barry On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: Thanks Barry, I agree but didn't dare asking for that. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 30 juillet 2025 20:34:26 ? : Junchao Zhang Cc : LEDAC Pierre; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!Ye5L6Vm4r3ziHELznfrHwXU-3hoIoIvb2jQJZIvCmrn6qFOZy3QWPIK8p_liz8QA0VhQ8dFa4wTYbtfX8bdq9qtBuYkt$ Barry On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. --Junchao Zhang On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: Hello all, We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. 
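A minimal sketch of the COO assembly pattern described above (nonzero pattern fixed once, values refreshed every time step); the tiny 3x3 example and the variable names are illustrative, not taken from the application:

```c
#include <petscmat.h>

/* Sketch of the COO assembly pattern described above: the (i,j) pattern is
   handed over once, only the coefficient values change each time step.
   The 3x3 diagonal example is illustrative only. */
int main(int argc, char **argv)
{
  Mat         A;
  PetscInt    coo_i[] = {0, 1, 2};
  PetscInt    coo_j[] = {0, 1, 2};
  PetscScalar vals[]  = {1.0, 2.0, 3.0};

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreateSeqAIJ(PETSC_COMM_SELF, 3, 3, 1, NULL, &A));
  /* done once: fixed nonzero pattern */
  PetscCall(MatSetPreallocationCOO(A, 3, coo_i, coo_j));
  /* done every time step: only the values change */
  PetscCall(MatSetValuesCOO(A, vals, INSERT_VALUES));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

With a GPU matrix type such as MATAIJCUSPARSE, the values array handed to MatSetValuesCOO() may be a device pointer, which is what keeps the assembly itself on the device.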
Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? NB: Gmres is running well on device. I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Tue Oct 21 06:17:34 2025 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 21 Oct 2025 11:17:34 +0000 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: OK, experiments will have to wait till we get the hardware. Can you give me a sign when you are done with the merge request? I would like to try with the increased array size, other vendors already warned me that "the array in stream is quiet small". Chris ________________________________________ From: Junchao Zhang Sent: Monday, October 20, 2025 6:36 PM To: Klaij, Christiaan Cc: PETSc users list Subject: Re: [petsc-users] interpreting petsc streams result Hi, Chris, Since we compute the speed up off the bandwidth achieved by a single MPI process, and a process can drive all memory channels, the maximum speed up can only come from experiments (vs. not by # of memory channels). --Junchao Zhang On Mon, Oct 20, 2025 at 9:45?AM Klaij, Christiaan > wrote: Hi Junchao, Thanks for you answer. Regarding the speed-up what would you expect if not 24 out of 64, and why? Chris ________________________________________ [cid:ii_19a027041d3d825dd561] ???? dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!bqDJffrr9mQhcYcNK2URWAz71Tks15aiJt7CAtN6yyLdHq8UoaJM_3qbJQnYBLO1ex08X6mU0GgYmJFwaUY3YwY$ [Facebook] [LinkedIn] [YouTube] From: Junchao Zhang > Sent: Friday, October 17, 2025 5:01 PM To: Klaij, Christiaan Cc: PETSc users list Subject: Re: [petsc-users] interpreting petsc streams result Hi, Chris, I did have an MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!bqDJffrr9mQhcYcNK2URWAz71Tks15aiJt7CAtN6yyLdHq8UoaJM_3qbJQnYBLO1ex08X6mU0GgYmJFwjVbU_d4$ to improve mpistream. I should rework it after Barry's !6903. See my inlined comments to your questions On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users >> wrote: Attached is a petsc streams result kindly provided by a hardware vendor for a single compute node, dual socket, with two AMD epyc 9355 processors. Each processor has 32 cores, 12 DDR5 memory channels and mem BW around 600 GB/s. * It is not immediately clear which line corresponds to which y-axis. Could future versions of petsc please color the axis label with the matching line color? definitely * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = 900 GB/s and not closer to 1200 GB/s? I recall it is actually not simple to get the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags and streaming stores etc. * The speed-up seems to be 12 out of 64, provided multiples of 8 cores are used. As expected given 12 memory channels? 
Maybe not, otherwise the speedup should be 24 as you have 24 channels. * Does the zig-zag pattern indicate a pinning problem, or is it unavoidable given the 8 core building block of these type of processors? I checked and found "make mpistream" uses --map-by core. I think we should use --map-by socket or --map-by l3cache. Chris [cid:ii_199f2a38566119b24a61] dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!bqDJffrr9mQhcYcNK2URWAz71Tks15aiJt7CAtN6yyLdHq8UoaJM_3qbJQnYBLO1ex08X6mU0GgYmJFwaUY3YwY$ [Facebook] [LinkedIn] [YouTube] -------------- next part -------------- A non-text attachment was scrubbed... Name: image014564.png Type: image/png Size: 5004 bytes Desc: image014564.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image076541.png Type: image/png Size: 487 bytes Desc: image076541.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image600589.png Type: image/png Size: 504 bytes Desc: image600589.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image758659.png Type: image/png Size: 482 bytes Desc: image758659.png URL: From bsmith at petsc.dev Tue Oct 21 09:35:24 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 21 Oct 2025 10:35:24 -0400 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: <2591b65a8cce45969fd80150db47c01f@cea.fr> References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr> <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> <561C2914-04D1-48D0-8BC4-E5F40FEB1C05@petsc.dev> <2591b65a8cce45969fd80150db47c01f@cea.fr> Message-ID: That is clearly a dramatic amount! Of course, the previous code was absurd, copying all the nonzero entries to the host, finding the diagonal entries, and then copying them back to the GPU. If, through Nsight, you find other similar performance bottlenecks, please let us know, and I can try to resolve them. Barry > On Oct 21, 2025, at 5:55?AM, LEDAC Pierre wrote: > > Hello, > > Thanks for the work ! > It is ok now, i check with Nsight system, the diagonal is indeed computed on the device. > > How much time it saves ? I guess it depends of the number of iterations for Gmres, the lower, the more it is significant. > In my case, with 5 158 400 rows for the matrix, 45 iterations of GMRES, time to solve decrease 1.160s from to 0.671s > on a RTX A6000. > > So thanks again, > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?41 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > De : Barry Smith > Envoy? : vendredi 17 octobre 2025 23:27:19 > ? : LEDAC Pierre > Cc : Junchao Zhang; petsc-users at mcs.anl.gov > Objet : Re: [petsc-users] [GPU] Jacobi preconditioner > > > I have updated the MR with what I think is now correct code for computing the diagonal on the GPU, could you please try it again and let me know if it works and how much time it saves (I think it is should be significant). > > Thankts for your patients, > > Barry > > >> On Oct 2, 2025, at 1:16?AM, LEDAC Pierre wrote: >> >> Yes, probably the reason I saw also a crash in my test case after a quick fix of the integer conversion. 
>> >> Pierre LEDAC >> Commissariat ? l??nergie atomique et aux ?nergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> B?timent 451 ? point courrier n?41 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> De : Barry Smith > >> Envoy? : jeudi 2 octobre 2025 02:16:40 >> ? : LEDAC Pierre >> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >> >> >> Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. >> >> Barry >> >> >>> On Oct 1, 2025, at 3:47?PM, LEDAC Pierre > wrote: >>> >>> Sorry the correct error is: >>> >>> /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" >>> GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); >>> >>> >>> Pierre LEDAC >>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>> Centre de SACLAY >>> DES/ISAS/DM2S/SGLS/LCAN >>> B?timent 451 ? point courrier n?41 >>> F-91191 Gif-sur-Yvette >>> +33 1 69 08 04 03 >>> +33 6 83 42 05 79 >>> De : LEDAC Pierre >>> Envoy? : mercredi 1 octobre 2025 21:46:00 >>> ? : Barry Smith >>> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >>> Objet : RE: [petsc-users] [GPU] Jacobi preconditioner >>> >>> Hi all, >>> >>> Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: >>> >>> /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" >>> GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); >>> >>> Thanks, >>> >>> Pierre LEDAC >>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>> Centre de SACLAY >>> DES/ISAS/DM2S/SGLS/LCAN >>> B?timent 451 ? point courrier n?41 >>> F-91191 Gif-sur-Yvette >>> +33 1 69 08 04 03 >>> +33 6 83 42 05 79 >>> De : Barry Smith > >>> Envoy? : mercredi 1 octobre 2025 18:48:37 >>> ? : LEDAC Pierre >>> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >>> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >>> >>> >>> I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!a4GaRrrHRLjv08cxI4Uh5a74H5DcewdUJ8zWt0i8SiUDMqKoYa1z38JoxkRS12NV5Jrj_3N9wZbLSTA2N55nv1o$ >>> >>> Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. >>> >>> Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. >>> >>> Barry >>> >>>> On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: >>>> >>>> Thanks Barry, I agree but didn't dare asking for that. >>>> >>>> Pierre LEDAC >>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>> Centre de SACLAY >>>> DES/ISAS/DM2S/SGLS/LCAN >>>> B?timent 451 ? 
point courrier n?41 >>>> F-91191 Gif-sur-Yvette >>>> +33 1 69 08 04 03 >>>> +33 6 83 42 05 79 >>>> >>>> De : Barry Smith > >>>> Envoy? : mercredi 30 juillet 2025 20:34:26 >>>> ? : Junchao Zhang >>>> Cc : LEDAC Pierre; petsc-users at mcs.anl.gov >>>> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >>>> >>>> >>>> We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. >>>> >>>> I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!a4GaRrrHRLjv08cxI4Uh5a74H5DcewdUJ8zWt0i8SiUDMqKoYa1z38JoxkRS12NV5Jrj_3N9wZbLSTA2vc2y51A$ >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>>> On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: >>>>> >>>>> Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. >>>>> If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: >>>>>> Hello all, >>>>>> >>>>>> We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). >>>>>> >>>>>> The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. >>>>>> >>>>>> It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). >>>>>> >>>>>> Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? >>>>>> NB: Gmres is running well on device. >>>>>> >>>>>> I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Pierre LEDAC >>>>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>>>> Centre de SACLAY >>>>>> DES/ISAS/DM2S/SGLS/LCAN >>>>>> B?timent 451 ? point courrier n?41 >>>>>> F-91191 Gif-sur-Yvette >>>>>> +33 1 69 08 04 03 >>>>>> +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From j.sassmannshausen at imperial.ac.uk Tue Oct 21 09:00:26 2025 From: j.sassmannshausen at imperial.ac.uk (=?UTF-8?B?SsO2cmcgU2HDn21hbm5zaGF1c2Vu?=) Date: Tue, 21 Oct 2025 15:00:26 +0100 Subject: [petsc-users] source download with EasyBuild fails Message-ID: <0fa30f9d-e4d5-41d1-9dfd-d4b7bbd4f92a@imperial.ac.uk> Dear all, we are using EasyBuild for our software installation and as part of the installation process, EasyBuild is downloading the source-package automatically, and checks the checksum of the downloaded package. However, that seems to be broken for some time now, as explained in this issue: https://github.com/easybuilders/easybuild-framework/issues/4925 In short, EasyBuild is using the http-header: headers = {'User-Agent': 'EasyBuild', 'Accept': '*/*'} which seems to be blocked. I was wondering if it is possible to add an exception to that rule so the automated builds are working again. 
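One way to check the reported blocking independently of EasyBuild is to issue the same kind of request by hand. The snippet below is a sketch only (it assumes libcurl and uses a hypothetical download URL); it reproduces just the User-Agent header mentioned above and prints the HTTP status:

```c
#include <stdio.h>
#include <curl/curl.h>

/* Illustrative check, not from the thread: request a URL with the same
   User-Agent that EasyBuild sends and print the HTTP status code, to see
   whether that agent string is rejected.  Build with: cc check_ua.c -lcurl
   The URL below is a hypothetical example, not necessarily the real one. */
int main(void)
{
  CURL    *curl;
  CURLcode rc;
  long     status = 0;

  curl_global_init(CURL_GLOBAL_DEFAULT);
  curl = curl_easy_init();
  if (!curl) return 1;
  curl_easy_setopt(curl, CURLOPT_URL,
                   "https://web.cels.anl.gov/projects/petsc/download/release-snapshots/petsc-3.22.0.tar.gz");
  curl_easy_setopt(curl, CURLOPT_USERAGENT, "EasyBuild");
  curl_easy_setopt(curl, CURLOPT_NOBODY, 1L); /* headers only, do not download the tarball */
  rc = curl_easy_perform(curl);
  if (rc == CURLE_OK) curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);
  printf("curl result: %s, HTTP status: %ld\n", curl_easy_strerror(rc), status);
  curl_easy_cleanup(curl);
  curl_global_cleanup();
  return 0;
}
```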
I personally would prefer this route, simply as it also allows you to get some statistics about how often the software was downloaded and installed via EasyBuild. So for me, that would be a win-win situation. Please let me know. Kind regards J?rg -- Dr. J?rg Sa?mannshausen, MRSC (he/him) Senior Research Computing Analyst Information and Communication Technologies Imperial College London White City Campus Level 1 Mediaworks London, W12 7FP -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_0x53CD8701F5C08E39.asc Type: application/pgp-keys Size: 3610 bytes Desc: OpenPGP public key URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature.asc Type: application/pgp-signature Size: 236 bytes Desc: OpenPGP digital signature URL: From donghui.xu at pnnl.gov Tue Oct 21 09:46:24 2025 From: donghui.xu at pnnl.gov (Xu, Donghui) Date: Tue, 21 Oct 2025 14:46:24 +0000 Subject: [petsc-users] How to map global vector to natural vector Message-ID: Dear PETSc Team, I am working with petsc4py for my model. I had some experience of using PETSc in Fortran. In Fortran, I used the following subroutines: call DMPlexCreateNaturalVector(dm, natural, ierr) call DMPlexNaturalToGlobalBegin(dm,natural,X,ierr) call DMPlexNaturalToGlobalEnd(dm,natural,X,ierr) However, I found there are no such interfaces in petsc4py. Can you advise me on how to get the global vector in natural order with DMPLEX in petsc4py? Thanks, Donghui -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Tue Oct 21 11:10:29 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Tue, 21 Oct 2025 11:10:29 -0500 (CDT) Subject: [petsc-users] source download with EasyBuild fails In-Reply-To: <0fa30f9d-e4d5-41d1-9dfd-d4b7bbd4f92a@imperial.ac.uk> References: <0fa30f9d-e4d5-41d1-9dfd-d4b7bbd4f92a@imperial.ac.uk> Message-ID: follow-up at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/issues/1819__;!!G_uCfscf7eWS!ZWm7ja8TtsyqvaXBnvzw4Ro-dKU1cSBpCFgjdvnh1Q3Rf17oUTB9lOpiFnsrUwpKyuUBVBRhTniTOh9LT0ykXJPBtdo$ Satish On Tue, 21 Oct 2025, J?rg Sa?mannshausen wrote: > Dear all, > > we are using EasyBuild for our software installation and as part of the > installation process, EasyBuild is downloading the source-package > automatically, and checks the checksum of the downloaded package. > > However, that seems to be broken for some time now, as explained in this > issue: > https://urldefense.us/v3/__https://github.com/easybuilders/easybuild-framework/issues/4925__;!!G_uCfscf7eWS!ZWm7ja8TtsyqvaXBnvzw4Ro-dKU1cSBpCFgjdvnh1Q3Rf17oUTB9lOpiFnsrUwpKyuUBVBRhTniTOh9LT0yk3l4pWZU$ > > In short, EasyBuild is using the http-header: > headers = {'User-Agent': 'EasyBuild', 'Accept': '*/*'} > > which seems to be blocked. > > I was wondering if it is possible to add an exception to that rule so the > automated builds are working again. > I personally would prefer this route, simply as it also allows you to get some > statistics about how often the software was downloaded and installed via > EasyBuild. So for me, that would be a win-win situation. > > Please let me know. > > Kind regards > > J?rg > > From knepley at gmail.com Tue Oct 21 15:24:12 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Oct 2025 16:24:12 -0400 Subject: [petsc-users] How to map global vector to natural vector In-Reply-To: References: Message-ID: I will fix it. 
Thanks, Matt On Tue, Oct 21, 2025 at 12:09?PM Xu, Donghui via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc Team, > > I am working with petsc4py for my model. I had some experience of using > PETSc in Fortran. In Fortran, I used the following subroutines: > > call DMPlexCreateNaturalVector(dm, natural, ierr) > call DMPlexNaturalToGlobalBegin(dm,natural,X,ierr) > call DMPlexNaturalToGlobalEnd(dm,natural,X,ierr) > > However, I found there are no such interfaces in petsc4py. Can you advise > me on how to get the global vector in natural order with DMPLEX in petsc4py? > > Thanks, > Donghui > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YPaxLUvh_DJWIDUmqUgzeBLwP3fZLTnd1_Xd0BJ9SjKAjwWllNdUb7sj33kgPwnIUHwjfhmzRnKelSbWjpPT$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Oct 21 16:17:58 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 21 Oct 2025 16:17:58 -0500 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: Hi, Chris, I think I am done with the MR, https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!YrTW8A4OcJU-ZdMQHgpISTnTkIOdDgvE9JdeWugUHYhynmyVAiRsbC2alT9pMknGJMkb559Bgu3olNqbiDcYp3catWNW$ You can look at the sample output there. The array size is now very large, supporting an aggregated L3 cache size of 1,920MB. --Junchao Zhang On Tue, Oct 21, 2025 at 6:17?AM Klaij, Christiaan wrote: > OK, experiments will have to wait till we get the hardware. > > Can you give me a sign when you are done with the merge request? I > would like to try with the increased array size, other vendors > already warned me that "the array in stream is quiet small". > > Chris > > ________________________________________ > From: Junchao Zhang > Sent: Monday, October 20, 2025 6:36 PM > To: Klaij, Christiaan > Cc: PETSc users list > Subject: Re: [petsc-users] interpreting petsc streams result > > Hi, Chris, > Since we compute the speed up off the bandwidth achieved by a single MPI > process, and a process can drive all memory channels, the maximum speed up > can only come from experiments (vs. not by # of memory channels). > > --Junchao Zhang > > > On Mon, Oct 20, 2025 at 9:45?AM Klaij, Christiaan > wrote: > Hi Junchao, > > Thanks for you answer. Regarding the speed-up what would you expect if not > 24 out of 64, and why? > > Chris > > ________________________________________ > [cid:ii_19a027041d3d825dd561] > > dr. ir. Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 | > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!YrTW8A4OcJU-ZdMQHgpISTnTkIOdDgvE9JdeWugUHYhynmyVAiRsbC2alT9pMknGJMkb559Bgu3olNqbiDcYp84_t49H$ > [Facebook] > [LinkedIn] > [YouTube] > > From: Junchao Zhang junchao.zhang at gmail.com>> > Sent: Friday, October 17, 2025 5:01 PM > To: Klaij, Christiaan > Cc: PETSc users list > Subject: Re: [petsc-users] interpreting petsc streams result > > Hi, Chris, > I did have an MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!YrTW8A4OcJU-ZdMQHgpISTnTkIOdDgvE9JdeWugUHYhynmyVAiRsbC2alT9pMknGJMkb559Bgu3olNqbiDcYp3catWNW$ to > improve mpistream. I should rework it after Barry's !6903. 
See my inlined > comments to your questions > > On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users < > petsc-users at mcs.anl.gov petsc-users at mcs.anl.gov>> wrote: > Attached is a petsc streams result kindly provided by a hardware > vendor for a single compute node, dual socket, with two AMD epyc > 9355 processors. Each processor has 32 cores, 12 DDR5 memory > channels and mem BW around 600 GB/s. > > * It is not immediately clear which line corresponds to which > y-axis. Could future versions of petsc please color the axis > label with the matching line color? > definitely > > > * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = > 900 GB/s and not closer to 1200 GB/s? > I recall it is actually not simple to get the theoretical max bandwidth. > One has to use special SIMD instructions, compiler flags and streaming > stores etc. > > > * The speed-up seems to be 12 out of 64, provided multiples of 8 > cores are used. As expected given 12 memory channels? > Maybe not, otherwise the speedup should be 24 as you have 24 channels. > > > * Does the zig-zag pattern indicate a pinning problem, or is it > unavoidable given the 8 core building block of these type of > processors? > I checked and found "make mpistream" uses --map-by core. I think we should > use --map-by socket or --map-by l3cache. > > > Chris > [cid:ii_199f2a38566119b24a61] > dr. ir. Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!YrTW8A4OcJU-ZdMQHgpISTnTkIOdDgvE9JdeWugUHYhynmyVAiRsbC2alT9pMknGJMkb559Bgu3olNqbiDcYp84_t49H$ < > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!YrTW8A4OcJU-ZdMQHgpISTnTkIOdDgvE9JdeWugUHYhynmyVAiRsbC2alT9pMknGJMkb559Bgu3olNqbiDcYp84_t49H$ >< > https://urldefense.us/v3/__https://www.marin.nl/__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsBfp_z4A$ > > > [Facebook]< > https://urldefense.us/v3/__https://www.facebook.com/marin.wageningen__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsCH7BGfA$ > > > [LinkedIn]< > https://urldefense.us/v3/__https://www.linkedin.com/company/marin__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsDAV2fAI$ > > > [YouTube]< > https://urldefense.us/v3/__https://www.youtube.com/marinmultimedia__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsEyu_yEs$ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Pierre.LEDAC at cea.fr Wed Oct 22 01:55:08 2025 From: Pierre.LEDAC at cea.fr (LEDAC Pierre) Date: Wed, 22 Oct 2025 06:55:08 +0000 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr> <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> <561C2914-04D1-48D0-8BC4-E5F40FEB1C05@petsc.dev> <2591b65a8cce45969fd80150db47c01f@cea.fr>, Message-ID: Barry, We are currently using more and more GPU computations and heavily relying on PETSc solvers (through boomeramg, amgx, gamg preconditioners) so yes we will report you any issues or bottlenecks. This leads to my next question: any hope, one day, of a MatGetDiagonal_SeqAIJHIPSPARSE implementation ? 
We know that there is Kokkos backend as a workaround though. Thanks again, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith Envoy? : mardi 21 octobre 2025 16:35:24 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov; BOURGEOIS R?mi Objet : Re: [petsc-users] [GPU] Jacobi preconditioner That is clearly a dramatic amount! Of course, the previous code was absurd, copying all the nonzero entries to the host, finding the diagonal entries, and then copying them back to the GPU. If, through Nsight, you find other similar performance bottlenecks, please let us know, and I can try to resolve them. Barry On Oct 21, 2025, at 5:55?AM, LEDAC Pierre wrote: Hello, Thanks for the work ! It is ok now, i check with Nsight system, the diagonal is indeed computed on the device. How much time it saves ? I guess it depends of the number of iterations for Gmres, the lower, the more it is significant. In my case, with 5 158 400 rows for the matrix, 45 iterations of GMRES, time to solve decrease 1.160s from to 0.671s on a RTX A6000. So thanks again, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith Envoy? : vendredi 17 octobre 2025 23:27:19 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have updated the MR with what I think is now correct code for computing the diagonal on the GPU, could you please try it again and let me know if it works and how much time it saves (I think it is should be significant). Thankts for your patients, Barry On Oct 2, 2025, at 1:16?AM, LEDAC Pierre wrote: Yes, probably the reason I saw also a crash in my test case after a quick fix of the integer conversion. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : jeudi 2 octobre 2025 02:16:40 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. Barry On Oct 1, 2025, at 3:47?PM, LEDAC Pierre > wrote: Sorry the correct error is: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : LEDAC Pierre Envoy? : mercredi 1 octobre 2025 21:46:00 ? 
: Barry Smith Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : RE: [petsc-users] [GPU] Jacobi preconditioner Hi all, Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 1 octobre 2025 18:48:37 ? : LEDAC Pierre Cc : Junchao Zhang; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!Z_bBmSMgZlgbbCnnU95NebK6rq-HthAoBot_aRjN3FrswnW_hUKhHtQEhzaSBDufQF4zdJmuu48OyETmlFE96TeqxaLz$ Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. Barry On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: Thanks Barry, I agree but didn't dare asking for that. Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 ________________________________ De : Barry Smith > Envoy? : mercredi 30 juillet 2025 20:34:26 ? : Junchao Zhang Cc : LEDAC Pierre; petsc-users at mcs.anl.gov Objet : Re: [petsc-users] [GPU] Jacobi preconditioner We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!Z_bBmSMgZlgbbCnnU95NebK6rq-HthAoBot_aRjN3FrswnW_hUKhHtQEhzaSBDufQF4zdJmuu48OyETmlFE96WMfFt-W$ Barry On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. --Junchao Zhang On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: Hello all, We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. 
Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? NB: Gmres is running well on device. I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. Thanks, Pierre LEDAC Commissariat ? l??nergie atomique et aux ?nergies alternatives Centre de SACLAY DES/ISAS/DM2S/SGLS/LCAN B?timent 451 ? point courrier n?41 F-91191 Gif-sur-Yvette +33 1 69 08 04 03 +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Oct 22 08:27:07 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 22 Oct 2025 09:27:07 -0400 Subject: [petsc-users] [GPU] Jacobi preconditioner In-Reply-To: References: <386853b1efae4269919b977b88c7e679@cea.fr> <49396000-D752-4C95-AF1B-524EC68BC5BC@petsc.dev> <99f1b933bd7a40c0ab8b946b99f8c944@cea.fr> <79361faf1a834649a802772418106a78@cea.fr> <48fcd36fec154100b888af547764ef20@cea.fr> <787C8A0B-DD74-4B98-8939-C75CC17B22F0@petsc.dev> <561C2914-04D1-48D0-8BC4-E5F40FEB1C05@petsc.dev> <2591b65a8cce45969fd80150db47c01f@cea.fr> Message-ID: > On Oct 22, 2025, at 2:55?AM, LEDAC Pierre wrote: > > Barry, > > We are currently using more and more GPU computations and heavily relying on PETSc solvers (through > boomeramg, amgx, gamg preconditioners) so yes we will report you any issues or bottlenecks. > > This leads to my next question: any hope, one day, of a MatGetDiagonal_SeqAIJHIPSPARSE implementation ? I'll put that on my list. I should be able to get something working in a few days. Thanks for the reminder. Barry > > We know that there is Kokkos backend as a workaround though. > > Thanks again, > > Pierre LEDAC > Commissariat ? l??nergie atomique et aux ?nergies alternatives > Centre de SACLAY > DES/ISAS/DM2S/SGLS/LCAN > B?timent 451 ? point courrier n?41 > F-91191 Gif-sur-Yvette > +33 1 69 08 04 03 > +33 6 83 42 05 79 > De : Barry Smith > > Envoy? : mardi 21 octobre 2025 16:35:24 > ? : LEDAC Pierre > Cc : Junchao Zhang; petsc-users at mcs.anl.gov ; BOURGEOIS R?mi > Objet : Re: [petsc-users] [GPU] Jacobi preconditioner > > > That is clearly a dramatic amount! Of course, the previous code was absurd, copying all the nonzero entries to the host, finding the diagonal entries, and then copying them back to the GPU. > > If, through Nsight, you find other similar performance bottlenecks, please let us know, and I can try to resolve them. > > Barry > > >> On Oct 21, 2025, at 5:55?AM, LEDAC Pierre > wrote: >> >> Hello, >> >> Thanks for the work ! >> It is ok now, i check with Nsight system, the diagonal is indeed computed on the device. >> >> How much time it saves ? I guess it depends of the number of iterations for Gmres, the lower, the more it is significant. >> In my case, with 5 158 400 rows for the matrix, 45 iterations of GMRES, time to solve decrease 1.160s from to 0.671s >> on a RTX A6000. >> >> So thanks again, >> >> Pierre LEDAC >> Commissariat ? l??nergie atomique et aux ?nergies alternatives >> Centre de SACLAY >> DES/ISAS/DM2S/SGLS/LCAN >> B?timent 451 ? point courrier n?41 >> F-91191 Gif-sur-Yvette >> +33 1 69 08 04 03 >> +33 6 83 42 05 79 >> De : Barry Smith > >> Envoy? : vendredi 17 octobre 2025 23:27:19 >> ? 
: LEDAC Pierre >> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >> >> >> I have updated the MR with what I think is now correct code for computing the diagonal on the GPU, could you please try it again and let me know if it works and how much time it saves (I think it is should be significant). >> >> Thankts for your patients, >> >> Barry >> >> >>> On Oct 2, 2025, at 1:16?AM, LEDAC Pierre > wrote: >>> >>> Yes, probably the reason I saw also a crash in my test case after a quick fix of the integer conversion. >>> >>> Pierre LEDAC >>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>> Centre de SACLAY >>> DES/ISAS/DM2S/SGLS/LCAN >>> B?timent 451 ? point courrier n?41 >>> F-91191 Gif-sur-Yvette >>> +33 1 69 08 04 03 >>> +33 6 83 42 05 79 >>> >>> De : Barry Smith > >>> Envoy? : jeudi 2 octobre 2025 02:16:40 >>> ? : LEDAC Pierre >>> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >>> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >>> >>> >>> Sorry about that. The current code is buggy anyways; I will let you know when I have tested it extensively so you can try again. >>> >>> Barry >>> >>> >>>> On Oct 1, 2025, at 3:47?PM, LEDAC Pierre > wrote: >>>> >>>> Sorry the correct error is: >>>> >>>> /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "int*" is incompatible with parameter of type "const PetscInt *" >>>> GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); >>>> >>>> >>>> Pierre LEDAC >>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>> Centre de SACLAY >>>> DES/ISAS/DM2S/SGLS/LCAN >>>> B?timent 451 ? point courrier n?41 >>>> F-91191 Gif-sur-Yvette >>>> +33 1 69 08 04 03 >>>> +33 6 83 42 05 79 >>>> De : LEDAC Pierre >>>> Envoy? : mercredi 1 octobre 2025 21:46:00 >>>> ? : Barry Smith >>>> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >>>> Objet : RE: [petsc-users] [GPU] Jacobi preconditioner >>>> >>>> Hi all, >>>> >>>> Thanks for the MR, there is a build issue cause we use --with-64-bit-indices: >>>> >>>> /export/home/catA/pl254994/trust/petsc/lib/src/LIBPETSC/build/petsc-barry-2025-09-30-add-matgetdiagonal-cuda/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu(3765): error: argument of type "PetscInt" is incompatible with parameter of type "const PetscInt *" >>>> GetDiagonal_CSR<<<(int)((n + 255) / 256), 256, 0, PetscDefaultCudaStream>>>(cusparsestruct->rowoffsets_gpu->data().get(), matstruct->cprowIndices->data().get(), cusparsestruct->workVector->data().get(), n, darray); >>>> >>>> Thanks, >>>> >>>> Pierre LEDAC >>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>> Centre de SACLAY >>>> DES/ISAS/DM2S/SGLS/LCAN >>>> B?timent 451 ? point courrier n?41 >>>> F-91191 Gif-sur-Yvette >>>> +33 1 69 08 04 03 >>>> +33 6 83 42 05 79 >>>> De : Barry Smith > >>>> Envoy? : mercredi 1 octobre 2025 18:48:37 >>>> ? : LEDAC Pierre >>>> Cc : Junchao Zhang; petsc-users at mcs.anl.gov >>>> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >>>> >>>> >>>> I have finally created an MR that moves the Jacobi accessing of the diagonal to the GPU, which should improve the GPU performance of your code. 
https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8756__;!!G_uCfscf7eWS!Z3vNRk5sR_97xbqL3Cns8okunsbBvctJMySGgbt7k5XRvpmQ2mg2SoVEfyRr96Lw69iLdV1KRBASzeT7a35k-9U$ >>>> >>>> Please give it a try and let us know if it causes any difficulties or, hopefully, improves your code's performance significantly. >>>> >>>> Sorry for the long delay, NVIDIA is hiring too many PETSc developers away from us. >>>> >>>> Barry >>>> >>>>> On Jul 31, 2025, at 6:46?AM, LEDAC Pierre > wrote: >>>>> >>>>> Thanks Barry, I agree but didn't dare asking for that. >>>>> >>>>> Pierre LEDAC >>>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>>> Centre de SACLAY >>>>> DES/ISAS/DM2S/SGLS/LCAN >>>>> B?timent 451 ? point courrier n?41 >>>>> F-91191 Gif-sur-Yvette >>>>> +33 1 69 08 04 03 >>>>> +33 6 83 42 05 79 >>>>> >>>>> De : Barry Smith > >>>>> Envoy? : mercredi 30 juillet 2025 20:34:26 >>>>> ? : Junchao Zhang >>>>> Cc : LEDAC Pierre; petsc-users at mcs.anl.gov >>>>> Objet : Re: [petsc-users] [GPU] Jacobi preconditioner >>>>> >>>>> >>>>> We absolutely should have a MatGetDiagonal_SeqAIJCUSPARSE(). It's somewhat embarrassing that we don't provide this. >>>>> >>>>> I have found some potential code at https://urldefense.us/v3/__https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse__;!!G_uCfscf7eWS!Z3vNRk5sR_97xbqL3Cns8okunsbBvctJMySGgbt7k5XRvpmQ2mg2SoVEfyRr96Lw69iLdV1KRBASzeT7a3YPyyk$ >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>>> On Jul 28, 2025, at 11:43?AM, Junchao Zhang > wrote: >>>>>> >>>>>> Yes, MatGetDiagonal_SeqAIJCUSPARSE hasn't been implemented. petsc/cuda and petsc/kokkos backends are separate code. >>>>>> If petsc/kokkos meet your needs, then just use them. For petsc users, we hope it will be just a difference of extra --download-kokkos --download-kokkos-kernels in configuration. >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Mon, Jul 28, 2025 at 2:51?AM LEDAC Pierre > wrote: >>>>>>> Hello all, >>>>>>> >>>>>>> We are solving with PETSc a linear system updated every time step (constant stencil but coefficients changing). >>>>>>> >>>>>>> The matrix is preallocated once with MatSetPreallocationCOO() then filled each time step with MatSetValuesCOO() and we use device pointers for coo_i, coo_j, and coefficients values. >>>>>>> >>>>>>> It is working fine with a GMRES Ksp solver and PC Jacobi but we are surprised to see that every time step, during PCSetUp, MatGetDiagonal_SeqAIJ is called whereas the matrix is on the device. Looking at the API, it seems there is no MatGetDiagonal_SeqAIJCUSPARSE() but a MatGetDiagonal_SeqAIJKOKKOS(). >>>>>>> >>>>>>> Does it mean we should use Kokkos backend in PETSc to have Jacobi preconditioner built directly on device ? Or I am doing something wrong ? >>>>>>> NB: Gmres is running well on device. >>>>>>> >>>>>>> I could use -ksp_reuse_preconditioner to avoid Jacobi being recreated each solve on host but it increases significantly the number of iterations. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Pierre LEDAC >>>>>>> Commissariat ? l??nergie atomique et aux ?nergies alternatives >>>>>>> Centre de SACLAY >>>>>>> DES/ISAS/DM2S/SGLS/LCAN >>>>>>> B?timent 451 ? point courrier n?41 >>>>>>> F-91191 Gif-sur-Yvette >>>>>>> +33 1 69 08 04 03 >>>>>>> +33 6 83 42 05 79 -------------- next part -------------- An HTML attachment was scrubbed... 
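For readers following the thread above: the assembly pattern being discussed (set the COO nonzero pattern once with MatSetPreallocationCOO(), refill only the values each time step with MatSetValuesCOO(), then solve with GMRES and Jacobi) looks roughly like the self-contained sketch below. It is an illustration only, not the poster's code: the matrix is a toy diagonal one and the value array lives on the host, whereas the thread uses device pointers and an aijcusparse matrix (-mat_type aijcusparse).

    #include <petscksp.h>

    int main(int argc, char **argv)
    {
      Mat          A;
      Vec          x, b;
      KSP          ksp;
      PetscInt     n = 8;
      PetscMPIInt  rank;
      PetscCount   ncoo;
      PetscInt    *coo_i, *coo_j;
      PetscScalar *v;

      PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
      PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
      PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
      PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
      PetscCall(MatSetFromOptions(A));                   /* -mat_type aijcusparse to assemble on the GPU */

      /* rank 0 provides all triplets; COO assembly ships them to the owning ranks */
      ncoo = (rank == 0) ? (PetscCount)n : 0;
      PetscCall(PetscMalloc3(ncoo, &coo_i, ncoo, &coo_j, ncoo, &v));
      for (PetscCount k = 0; k < ncoo; k++) { coo_i[k] = (PetscInt)k; coo_j[k] = (PetscInt)k; }
      PetscCall(MatSetPreallocationCOO(A, ncoo, coo_i, coo_j));   /* pattern set once */

      PetscCall(MatCreateVecs(A, &x, &b));
      PetscCall(VecSet(b, 1.0));
      PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
      PetscCall(KSPSetOperators(ksp, A, A));
      PetscCall(KSPSetFromOptions(ksp));                 /* e.g. -ksp_type gmres -pc_type jacobi */

      for (PetscInt step = 0; step < 3; step++) {        /* "time steps": same stencil, new coefficients */
        for (PetscCount k = 0; k < ncoo; k++) v[k] = 2.0 + step;
        PetscCall(MatSetValuesCOO(A, v, INSERT_VALUES)); /* v may be a device pointer in the GPU case */
        PetscCall(KSPSolve(ksp, b, x));
      }

      PetscCall(PetscFree3(coo_i, coo_j, v));
      PetscCall(KSPDestroy(&ksp));
      PetscCall(MatDestroy(&A));
      PetscCall(VecDestroy(&x));
      PetscCall(VecDestroy(&b));
      PetscCall(PetscFinalize());
      return 0;
    }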
URL: From Ross.Williams at glasgow.ac.uk Wed Oct 22 07:54:08 2025 From: Ross.Williams at glasgow.ac.uk (Ross Williams) Date: Wed, 22 Oct 2025 12:54:08 +0000 Subject: [petsc-users] Recommended workflow for continuing a TSSolve after making a change to mesh topology Message-ID: Hello, I am asking for guidance and/or clarification on how one should best handle changes to the mesh topology within TSSetPreStep of a TSSolve. Out current methodology after making the change is: 1. TSReset(ts); 2. DMSetUp(dm); 3. TSSetSolution(ts, sol); 4. TSSetDM(ts, dm); 5. TSSetUp(ts); I have not found any relevant examples in the documentation so if there is one or an example in another code base, please feel free to direct me towards it. Kind regards, Ross. ________________________________ Ross Williams PhD Research Associate Glasgow Computational Engineering Centre (GCEC) E: ross.williams at glasgow.ac.uk W: https://urldefense.us/v3/__https://www.gla.ac.uk/schools/engineering/staff/rosswilliams/__;!!G_uCfscf7eWS!dIt7_WjFSNIz80F0_FJan8iSDDhDRHEYaqhuBa0vDHLsWHctjh82s3y-z2TLy8E7XoIghMWJ4RMxZDJzoSn2XhKxkoKYazsSh50$ Pearce Lodge, James Watt School of Engineering, University of Glasgow, G12 8QQ, Glasgow, United Kingdom. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed Oct 22 10:09:05 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 22 Oct 2025 17:09:05 +0200 Subject: [petsc-users] Recommended workflow for continuing a TSSolve after making a change to mesh topology In-Reply-To: References: Message-ID: You can use TSSetResize, see https://urldefense.us/v3/__https://petsc.org/release/manualpages/TS/TSSetResize/__;!!G_uCfscf7eWS!bKPHXAQZ-u_toOCSRqHnaw4ViCwJhsm3ERuCVrQrcwQ8FVPgv9MoRun-H4K1_bCnB23anEzC8X35rfKJ0STCq-3Dgd_YIAM$ see https://urldefense.us/v3/__https://petsc.org/release/src/ts/tutorials/ex11.c.html__;!!G_uCfscf7eWS!bKPHXAQZ-u_toOCSRqHnaw4ViCwJhsm3ERuCVrQrcwQ8FVPgv9MoRun-H4K1_bCnB23anEzC8X35rfKJ0STCq-3DyPOLKps$ or https://urldefense.us/v3/__https://petsc.org/release/src/ts/tutorials/ex45.c.html__;!!G_uCfscf7eWS!bKPHXAQZ-u_toOCSRqHnaw4ViCwJhsm3ERuCVrQrcwQ8FVPgv9MoRun-H4K1_bCnB23anEzC8X35rfKJ0STCq-3DuPkWW0o$ for examples that use a DM Il giorno mer 22 ott 2025 alle ore 15:40 Ross Williams < Ross.Williams at glasgow.ac.uk> ha scritto: > Hello, > > I am asking for guidance and/or clarification on how one should best > handle changes to the mesh topology within TSSetPreStep of a TSSolve. > > Out current methodology after making the change is: > > 1. TSReset(ts); > 2. DMSetUp(dm); > 3. TSSetSolution(ts, sol); > 4. TSSetDM(ts, dm); > 5. TSSetUp(ts); > > > I have not found any relevant examples in the documentation so if there is > one or an example in another code base, please feel free to direct me > towards it. > > Kind regards, > Ross. > > ------------------------------ > Ross Williams PhD > > Research Associate > > Glasgow Computational Engineering Centre (GCEC) > > E: ross.williams at glasgow.ac.uk > > W: https://urldefense.us/v3/__https://www.gla.ac.uk/schools/engineering/staff/rosswilliams/__;!!G_uCfscf7eWS!bKPHXAQZ-u_toOCSRqHnaw4ViCwJhsm3ERuCVrQrcwQ8FVPgv9MoRun-H4K1_bCnB23anEzC8X35rfKJ0STCq-3DZn5ycuo$ > > > Pearce Lodge, > > James Watt School of Engineering, > > University of Glasgow, > > G12 8QQ, > > Glasgow, > > United Kingdom. > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... 
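As a side note for anyone landing on this thread: the manual reset sequence listed in the question can be wrapped up as in the sketch below. It is only an outline of that sequence (the adapted DM and the transferred solution vector are assumed to be built by the application); TSSetResize() and the ex11/ex45 tutorials pointed to above are the recommended, built-in route.

    #include <petscts.h>

    /* Sketch of the manual reset sequence from the question: after the mesh topology
       has changed (newdm) and the solution has been transferred to a vector matching
       the new layout (newsol), rebuild the TS and continue time stepping.            */
    static PetscErrorCode ContinueAfterTopologyChange(TS ts, DM newdm, Vec newsol)
    {
      PetscFunctionBeginUser;
      PetscCall(TSReset(ts));               /* drop work vectors sized for the old mesh */
      PetscCall(DMSetUp(newdm));
      PetscCall(TSSetSolution(ts, newsol));
      PetscCall(TSSetDM(ts, newdm));
      PetscCall(TSSetUp(ts));
      PetscFunctionReturn(PETSC_SUCCESS);
    }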
URL: From C.Klaij at marin.nl Thu Oct 23 02:42:39 2025 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 23 Oct 2025 07:42:39 +0000 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: Thanks Junchao, that looks good. The zig zag pattern is visible but not as wild. The speed up is similar. Chris ________________________________________ From: Junchao Zhang Sent: Tuesday, October 21, 2025 11:17 PM To: Klaij, Christiaan Cc: PETSc users list Subject: Re: [petsc-users] interpreting petsc streams result Hi, Chris, I think I am done with the MR, https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7m4wDWgDU$ You can look at the sample output there. The array size is now very large, supporting an aggregated L3 cache size of 1,920MB. --Junchao Zhang On Tue, Oct 21, 2025 at 6:17?AM Klaij, Christiaan > wrote: OK, experiments will have to wait till we get the hardware. Can you give me a sign when you are done with the merge request? I would like to try with the increased array size, other vendors already warned me that "the array in stream is quiet small". Chris ________________________________________ From: Junchao Zhang > Sent: Monday, October 20, 2025 6:36 PM To: Klaij, Christiaan Cc: PETSc users list Subject: Re: [petsc-users] interpreting petsc streams result Hi, Chris, Since we compute the speed up off the bandwidth achieved by a single MPI process, and a process can drive all memory channels, the maximum speed up can only come from experiments (vs. not by # of memory channels). --Junchao Zhang On Mon, Oct 20, 2025 at 9:45?AM Klaij, Christiaan >> wrote: Hi Junchao, Thanks for you answer. Regarding the speed-up what would you expect if not 24 out of 64, and why? Chris ________________________________________ [cid:ii_19a027041d3d825dd561] dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ [Facebook] [LinkedIn] [YouTube] From: Junchao Zhang >> Sent: Friday, October 17, 2025 5:01 PM To: Klaij, Christiaan Cc: PETSc users list Subject: Re: [petsc-users] interpreting petsc streams result Hi, Chris, I did have an MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7m4wDWgDU$ to improve mpistream. I should rework it after Barry's !6903. See my inlined comments to your questions On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users >>>> wrote: Attached is a petsc streams result kindly provided by a hardware vendor for a single compute node, dual socket, with two AMD epyc 9355 processors. Each processor has 32 cores, 12 DDR5 memory channels and mem BW around 600 GB/s. * It is not immediately clear which line corresponds to which y-axis. Could future versions of petsc please color the axis label with the matching line color? definitely * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = 900 GB/s and not closer to 1200 GB/s? I recall it is actually not simple to get the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags and streaming stores etc. * The speed-up seems to be 12 out of 64, provided multiples of 8 cores are used. 
As expected given 12 memory channels? Maybe not, otherwise the speedup should be 24 as you have 24 channels. * Does the zig-zag pattern indicate a pinning problem, or is it unavoidable given the 8 core building block of these type of processors? I checked and found "make mpistream" uses --map-by core. I think we should use --map-by socket or --map-by l3cache. Chris [cid:ii_199f2a38566119b24a61] dr. ir. Christiaan Klaij | senior researcher Research & Development | CFD Development T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ [Facebook] [LinkedIn] [YouTube] From knepley at gmail.com Thu Oct 23 10:17:02 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Oct 2025 11:17:02 -0400 Subject: [petsc-users] interpreting petsc streams result In-Reply-To: References: Message-ID: On Thu, Oct 23, 2025 at 3:42?AM Klaij, Christiaan via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thanks Junchao, that looks good. The zig zag pattern is visible but not as > wild. The speed up is similar. > We got a similar zigzag on some of the LLNL machines. I think it is a feature of modern memory architectures. Thanks, Matt > Chris > > ________________________________________ > From: Junchao Zhang > Sent: Tuesday, October 21, 2025 11:17 PM > To: Klaij, Christiaan > Cc: PETSc users list > Subject: Re: [petsc-users] interpreting petsc streams result > > > Hi, Chris, > I think I am done with the MR, > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7m4wDWgDU$ > You can look at the sample output there. The array size is now very > large, supporting an aggregated L3 cache size of 1,920MB. > > --Junchao Zhang > > > On Tue, Oct 21, 2025 at 6:17?AM Klaij, Christiaan > wrote: > OK, experiments will have to wait till we get the hardware. > > Can you give me a sign when you are done with the merge request? I > would like to try with the increased array size, other vendors > already warned me that "the array in stream is quiet small". > > Chris > > ________________________________________ > From: Junchao Zhang junchao.zhang at gmail.com>> > Sent: Monday, October 20, 2025 6:36 PM > To: Klaij, Christiaan > Cc: PETSc users list > Subject: Re: [petsc-users] interpreting petsc streams result > > Hi, Chris, > Since we compute the speed up off the bandwidth achieved by a single MPI > process, and a process can drive all memory channels, the maximum speed up > can only come from experiments (vs. not by # of memory channels). > > --Junchao Zhang > > > On Mon, Oct 20, 2025 at 9:45?AM Klaij, Christiaan >> > wrote: > Hi Junchao, > > Thanks for you answer. Regarding the speed-up what would you expect if not > 24 out of 64, and why? > > Chris > > ________________________________________ > [cid:ii_19a027041d3d825dd561] > > dr. ir. 
Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 | > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ > < > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ > >< > https://urldefense.us/v3/__https://www.marin.nl/__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7muq8j-ds$ > > > [Facebook]< > https://urldefense.us/v3/__https://www.facebook.com/marin.wageningen__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mLqrnypA$ > > > [LinkedIn]< > https://urldefense.us/v3/__https://www.linkedin.com/company/marin__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mzmmkNA0$ > > > [YouTube]< > https://urldefense.us/v3/__https://www.youtube.com/marinmultimedia__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mZCDzHLA$ > > > > From: Junchao Zhang junchao.zhang at gmail.com> junchao.zhang at gmail.com>>> > Sent: Friday, October 17, 2025 5:01 PM > To: Klaij, Christiaan > Cc: PETSc users list > Subject: Re: [petsc-users] interpreting petsc streams result > > Hi, Chris, > I did have an MR > https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7m4wDWgDU$ > to improve mpistream. I should rework it after Barry's !6903. See my > inlined comments to your questions > > On Fri, Oct 17, 2025 at 3:37?AM Klaij, Christiaan via petsc-users < > petsc-users at mcs.anl.gov petsc-users at mcs.anl.gov> petsc-users at mcs.anl.gov petsc-users at mcs.anl.gov>>> wrote: > Attached is a petsc streams result kindly provided by a hardware > vendor for a single compute node, dual socket, with two AMD epyc > 9355 processors. Each processor has 32 cores, 12 DDR5 memory > channels and mem BW around 600 GB/s. > > * It is not immediately clear which line corresponds to which > y-axis. Could future versions of petsc please color the axis > label with the matching line color? > definitely > > > * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = > 900 GB/s and not closer to 1200 GB/s? > I recall it is actually not simple to get the theoretical max bandwidth. > One has to use special SIMD instructions, compiler flags and streaming > stores etc. > > > * The speed-up seems to be 12 out of 64, provided multiples of 8 > cores are used. As expected given 12 memory channels? > Maybe not, otherwise the speedup should be 24 as you have 24 channels. > > > * Does the zig-zag pattern indicate a pinning problem, or is it > unavoidable given the 8 core building block of these type of > processors? > I checked and found "make mpistream" uses --map-by core. I think we should > use --map-by socket or --map-by l3cache. > > > Chris > [cid:ii_199f2a38566119b24a61] > dr. ir. 
Christiaan Klaij | senior researcher > Research & Development | CFD Development > T +31 317 49 33 44 | > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ > < > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ > >< > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!f0ym-ubSw0zZ557es-25JfsDmPjk4EAhMPYF65uYFefhx7maXr2_xgDoONAvCV6uAJ-WpEtJKblk9W7mdXphyZs$ > >< > https://urldefense.us/v3/__https://www.marin.nl/__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsBfp_z4A$ > > > [Facebook]< > https://urldefense.us/v3/__https://www.facebook.com/marin.wageningen__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsCH7BGfA$ > > > [LinkedIn]< > https://urldefense.us/v3/__https://www.linkedin.com/company/marin__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsDAV2fAI$ > > > [YouTube]< > https://urldefense.us/v3/__https://www.youtube.com/marinmultimedia__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsEyu_yEs$ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!boS7CONGoGsO-g_0kexht90PXVFByjM2hxfLoPcZujw3MfTTD2rgVTfHepnuDOTfFNns0nI00eSQF6cNS32E$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.schween at mpi-hd.mpg.de Fri Oct 24 06:52:10 2025 From: nils.schween at mpi-hd.mpg.de (Nils Schween) Date: Fri, 24 Oct 2025 13:52:10 +0200 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]GMRES plus BlockJacobi behave differently for seemlingy identical matrices Message-ID: <877bwkms9h.fsf@mpi-hd.mpg.de> Dear PETSc users, Dear PETSc developers, in our software we are solving a linear system with PETSc using GMRES in conjunction with a BlockJacobi preconditioner, i.e. the default of the KSP object. We have two versions of the system matrix, say A and B. The difference between them is the non-zero pattern. The non-zero pattern of matrix B is a subset of the one of matrix A. Their values should be identical. We solve the linear system, using A yields a solution after some iterations, whereas using B does not converge. I created binary files of the two matrices, the right-hand side, and wrote a small PETSc programm, which loads them and demonstrates the issue. I attach the files to this email. We would like to understand why the solver-preconditioner combination works in case A and not in case B. Can you help us finding this out? To test if the two matrices are identical, I substracted them and computed the Frobenius norm of the result. It is zero. To give you more context, we solve a system of partial differential equations that models astrophysical plasmas. It is essentially a system of advection-reaction equations. We use a discontinuous Galerkin (dG) method. Our code relies on the finite element library library deal.ii and its PETSc interface. The system matrices A and B are the result of the (dG) discretisation. We GMRES with a BlockJaboci preconditioner, because we do not know any better. 
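(A driver of the kind described above, load two binary matrices and a right-hand side, check their difference in the Frobenius norm, and solve with the default KSP, can be sketched as follows. This is an illustration only, not the attached reproducer; the file names A.dat/B.dat/rhs.dat and the A_ option prefix are made up.)

    #include <petscksp.h>

    int main(int argc, char **argv)
    {
      Mat         A, B;
      Vec         b, x;
      KSP         ksp;
      PetscViewer vwr;
      PetscReal   nrm;

      PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
      PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "A.dat", FILE_MODE_READ, &vwr));
      PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
      PetscCall(MatLoad(A, vwr));
      PetscCall(PetscViewerDestroy(&vwr));
      PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "B.dat", FILE_MODE_READ, &vwr));
      PetscCall(MatCreate(PETSC_COMM_WORLD, &B));
      PetscCall(MatLoad(B, vwr));
      PetscCall(PetscViewerDestroy(&vwr));
      PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "rhs.dat", FILE_MODE_READ, &vwr));
      PetscCall(MatCreateVecs(A, &x, &b));
      PetscCall(VecLoad(b, vwr));
      PetscCall(PetscViewerDestroy(&vwr));

      /* "identical values" check: ||B - A||_F (this overwrites B, which is fine here) */
      PetscCall(MatAXPY(B, -1.0, A, DIFFERENT_NONZERO_PATTERN));
      PetscCall(MatNorm(B, NORM_FROBENIUS, &nrm));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "||B - A||_F = %g\n", (double)nrm));

      /* defaults in parallel: GMRES with block Jacobi and ILU(0) on each block */
      PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
      PetscCall(KSPSetOptionsPrefix(ksp, "A_"));   /* so options like -A_ksp_view apply */
      PetscCall(KSPSetOperators(ksp, A, A));
      PetscCall(KSPSetFromOptions(ksp));
      PetscCall(KSPSolve(ksp, b, x));

      PetscCall(KSPDestroy(&ksp));
      PetscCall(MatDestroy(&A));
      PetscCall(MatDestroy(&B));
      PetscCall(VecDestroy(&b));
      PetscCall(VecDestroy(&x));
      PetscCall(PetscFinalize());
      return 0;
    }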
I tested the code I sent with PETSc 3.24.0 and 3.19.1 on my workstation, i.e. Linux home-desktop 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux I use OpenMPI 5.0.8 and I compiled with mpicc, which in my cases use gcc. In case you need more information. Please let me know. Any help is appreciated. Thank you, Nils -------------- next part -------------- A non-text attachment was scrubbed... Name: example.tar.gz Type: application/gzip Size: 189019 bytes Desc: petsc-example URL: -------------- next part -------------- -- Nils Schween Phone: +49 6221 516 557 Mail: nils.schween at mpi-hd.mpg.de PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 Max Planck Institute for Nuclear Physics Astrophysical Plasma Theory (APT) Saupfercheckweg 1, D-69117 Heidelberg https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5989 bytes Desc: not available URL: From pierre at joliv.et Fri Oct 24 07:51:42 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 24 Oct 2025 14:51:42 +0200 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]GMRES plus BlockJacobi behave differently for seemlingy identical matrices In-Reply-To: <877bwkms9h.fsf@mpi-hd.mpg.de> References: <877bwkms9h.fsf@mpi-hd.mpg.de> Message-ID: > On 24 Oct 2025, at 1:52?PM, Nils Schween wrote: > > Dear PETSc users, Dear PETSc developers, > > in our software we are solving a linear system with PETSc using GMRES > in conjunction with a BlockJacobi preconditioner, i.e. the default of > the KSP object. > > We have two versions of the system matrix, say A and B. The difference > between them is the non-zero pattern. The non-zero pattern of matrix B > is a subset of the one of matrix A. Their values should be identical. > > We solve the linear system, using A yields a solution after some > iterations, whereas using B does not converge. > > I created binary files of the two matrices, the right-hand side, and > wrote a small PETSc programm, which loads them and demonstrates the > issue. I attach the files to this email. > > We would like to understand why the solver-preconditioner combination > works in case A and not in case B. Can you help us finding this out? > > To test if the two matrices are identical, I substracted them and > computed the Frobenius norm of the result. It is zero. The default subdomain solver is ILU(0). By definition, this won?t allow fill-in. So when you are not storing the zeros in B, the quality of your PC is much worse. You can check this yourself with -A_ksp_view -B_ksp_view: [?] 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: (A_) 1 MPI process type: seqaij rows=1664, cols=1664 package used to perform factorization: petsc total: nonzeros=117760, allocated nonzeros=117760 using I-node routines: found 416 nodes, limit used is 5 [?] 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: (B_) 1 MPI process type: seqaij rows=1664, cols=1664 package used to perform factorization: petsc total: nonzeros=49408, allocated nonzeros=49408 not using I-node routines Check the number of nonzeros of both factored Mat. 
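For reference, the extra fill can be requested from code as well as from the command line; a rough sketch, similar in spirit to the block-Jacobi tutorial src/ksp/ksp/tutorials/ex7.c, assuming the B_ prefix of the example program:

    #include <petscksp.h>

    /* Raise the level of fill of the ILU used inside each block-Jacobi block.
       Equivalent to the -B_pc_factor_levels <levels> option mentioned just below. */
    static PetscErrorCode UseILUWithFill(KSP ksp, PetscInt levels)
    {
      PC       pc, subpc;
      KSP     *subksp;
      PetscInt nlocal, first;

      PetscFunctionBeginUser;
      PetscCall(KSPGetPC(ksp, &pc));
      PetscCall(PCSetType(pc, PCBJACOBI));
      PetscCall(KSPSetUp(ksp));                       /* sub-KSPs exist only after setup */
      PetscCall(PCBJacobiGetSubKSP(pc, &nlocal, &first, &subksp));
      for (PetscInt i = 0; i < nlocal; i++) {
        PetscCall(KSPGetPC(subksp[i], &subpc));
        PetscCall(PCFactorSetLevels(subpc, levels));  /* ILU(levels) instead of ILU(0) */
      }
      PetscFunctionReturn(PETSC_SUCCESS);
    }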
With -B_pc_factor_levels 3, you?ll get roughly similar convergence speed (and density in the factored Mat of both PC). Thanks, Pierre > > To give you more context, we solve a system of partial differential > equations that models astrophysical plasmas. It is essentially a system > of advection-reaction equations. We use a discontinuous Galerkin (dG) > method. Our code relies on the finite element library library deal.ii > and its PETSc interface. The system matrices A and B are the result of > the (dG) discretisation. We GMRES with a BlockJaboci preconditioner, > because we do not know any better. > > I tested the code I sent with PETSc 3.24.0 and 3.19.1 on my workstation, i.e. > Linux home-desktop 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux > I use OpenMPI 5.0.8 and I compiled with mpicc, which in my cases use > gcc. > > In case you need more information. Please let me know. > Any help is appreciated. > > Thank you, > Nils > > > -- > Nils Schween > > Phone: +49 6221 516 557 > Mail: nils.schween at mpi-hd.mpg.de > PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 > > Max Planck Institute for Nuclear Physics > Astrophysical Plasma Theory (APT) > Saupfercheckweg 1, D-69117 Heidelberg > https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!cyqFGbgowHr6gm-QDawC0b1a6AhpawtUN-FpKiTM6tdOAlamcWYCbhBLhCmS0uCfuDsumQLI95B77FQhrygieQ$ From nils.schween at mpi-hd.mpg.de Fri Oct 24 09:38:13 2025 From: nils.schween at mpi-hd.mpg.de (Nils Schween) Date: Fri, 24 Oct 2025 16:38:13 +0200 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]GMRES plus BlockJacobi behave differently for seemlingy identical matrices In-Reply-To: (Pierre Jolivet's message of "Fri, 24 Oct 2025 14:51:42 +0200") References: <877bwkms9h.fsf@mpi-hd.mpg.de> Message-ID: <87o6pwpdpm.fsf@mpi-hd.mpg.de> Thank you very much Pierre! I was not aware of the fact that the fill-in in the ILU decides about its quality. But its clear now. I will just test what level of fill we need for our application. Once more thanks, Nils Pierre Jolivet writes: >> On 24 Oct 2025, at 1:52?PM, Nils Schween wrote: >> >> Dear PETSc users, Dear PETSc developers, >> >> in our software we are solving a linear system with PETSc using GMRES >> in conjunction with a BlockJacobi preconditioner, i.e. the default of >> the KSP object. >> >> We have two versions of the system matrix, say A and B. The difference >> between them is the non-zero pattern. The non-zero pattern of matrix B >> is a subset of the one of matrix A. Their values should be identical. >> >> We solve the linear system, using A yields a solution after some >> iterations, whereas using B does not converge. >> >> I created binary files of the two matrices, the right-hand side, and >> wrote a small PETSc programm, which loads them and demonstrates the >> issue. I attach the files to this email. >> >> We would like to understand why the solver-preconditioner combination >> works in case A and not in case B. Can you help us finding this out? >> >> To test if the two matrices are identical, I substracted them and >> computed the Frobenius norm of the result. It is zero. > > The default subdomain solver is ILU(0). > By definition, this won?t allow fill-in. > So when you are not storing the zeros in B, the quality of your PC is much worse. > You can check this yourself with -A_ksp_view -B_ksp_view: > [?] 
> 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: (A_) 1 MPI process > type: seqaij > rows=1664, cols=1664 > package used to perform factorization: petsc > total: nonzeros=117760, allocated nonzeros=117760 > using I-node routines: found 416 nodes, limit used is 5 > [?] > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: (B_) 1 MPI process > type: seqaij > rows=1664, cols=1664 > package used to perform factorization: petsc > total: nonzeros=49408, allocated nonzeros=49408 > not using I-node routines > > Check the number of nonzeros of both factored Mat. > With -B_pc_factor_levels 3, you?ll get roughly similar convergence speed (and density in the factored Mat of both PC). > > Thanks, > Pierre > >> >> To give you more context, we solve a system of partial differential >> equations that models astrophysical plasmas. It is essentially a system >> of advection-reaction equations. We use a discontinuous Galerkin (dG) >> method. Our code relies on the finite element library library deal.ii >> and its PETSc interface. The system matrices A and B are the result of >> the (dG) discretisation. We GMRES with a BlockJaboci preconditioner, >> because we do not know any better. >> >> I tested the code I sent with PETSc 3.24.0 and 3.19.1 on my workstation, i.e. >> Linux home-desktop 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux >> I use OpenMPI 5.0.8 and I compiled with mpicc, which in my cases use >> gcc. >> >> In case you need more information. Please let me know. >> Any help is appreciated. >> >> Thank you, >> Nils >> >> >> -- >> Nils Schween >> >> Phone: +49 6221 516 557 >> Mail: nils.schween at mpi-hd.mpg.de >> PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 >> >> Max Planck Institute for Nuclear Physics >> Astrophysical Plasma Theory (APT) >> Saupfercheckweg 1, D-69117 Heidelberg >> https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt -- Nils Schween PhD Student Phone: +49 6221 516 557 Mail: nils.schween at mpi-hd.mpg.de PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 Max Planck Institute for Nuclear Physics Astrophysical Plasma Theory (APT) Saupfercheckweg 1, D-69117 Heidelberg https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5989 bytes Desc: not available URL: From pierre at joliv.et Fri Oct 24 10:14:45 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 24 Oct 2025 17:14:45 +0200 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]GMRES plus BlockJacobi behave differently for seemlingy identical matrices In-Reply-To: <87o6pwpdpm.fsf@mpi-hd.mpg.de> References: <877bwkms9h.fsf@mpi-hd.mpg.de> <87o6pwpdpm.fsf@mpi-hd.mpg.de> Message-ID: > On 24 Oct 2025, at 4:38?PM, Nils Schween wrote: > > Thank you very much Pierre! > > I was not aware of the fact that the fill-in in the ILU decides about > its quality. But its clear now. I will just test what level of fill we > need for our application. I?ll note that block Jacobi and ILU are not very efficient solvers in most instances. 
I tried much fancier algebraic preconditioners such as BoomerAMG and GAMG on your problem, and they are failing hard out-of-the-box. Without knowing much more on the problem, it?s difficult to setup. We also have other more robust preconditioners in PETSc by means of domain decomposition methods. deal.II is interfaced with PCBDDC (which is also somewhat difficult to tune in a fully algebraic mode) and you could also use PCHPDDM (in fully algebraic mode). On this toy problem, PCHPDDM performs much better in terms of iteration than the simpler PCBJACOBI + (sub) PCILU. Of course, as we always advise our users, it?s best to do a little bit of literature survey to find the best method for your application, I doubt it?s PCBJACOBI. If the solver part is not a problem in your application, just carry on with what?s easiest for you. If you want some precise help on either PCBDDC or PCHPDDM, feel free to get in touch with me in private. Thanks, Pierre PCGAMG Linear A_ solve did not converge due to DIVERGED_ITS iterations 1000 Linear B_ solve did not converge due to DIVERGED_ITS iterations 1000 PCHYPRE Linear A_ solve did not converge due to DIVERGED_NANORINF iterations 0 Linear B_ solve did not converge due to DIVERGED_NANORINF iterations 0 PCHPDDM Linear A_ solve converged due to CONVERGED_RTOL iterations 4 Linear B_ solve converged due to CONVERGED_RTOL iterations 38 PCBJACOBI Linear A_ solve converged due to CONVERGED_RTOL iterations 134 Linear B_ solve did not converge due to DIVERGED_ITS iterations 1000 > Once more thanks, > Nils > > > Pierre Jolivet writes: > >>> On 24 Oct 2025, at 1:52?PM, Nils Schween wrote: >>> >>> Dear PETSc users, Dear PETSc developers, >>> >>> in our software we are solving a linear system with PETSc using GMRES >>> in conjunction with a BlockJacobi preconditioner, i.e. the default of >>> the KSP object. >>> >>> We have two versions of the system matrix, say A and B. The difference >>> between them is the non-zero pattern. The non-zero pattern of matrix B >>> is a subset of the one of matrix A. Their values should be identical. >>> >>> We solve the linear system, using A yields a solution after some >>> iterations, whereas using B does not converge. >>> >>> I created binary files of the two matrices, the right-hand side, and >>> wrote a small PETSc programm, which loads them and demonstrates the >>> issue. I attach the files to this email. >>> >>> We would like to understand why the solver-preconditioner combination >>> works in case A and not in case B. Can you help us finding this out? >>> >>> To test if the two matrices are identical, I substracted them and >>> computed the Frobenius norm of the result. It is zero. >> >> The default subdomain solver is ILU(0). >> By definition, this won?t allow fill-in. >> So when you are not storing the zeros in B, the quality of your PC is much worse. >> You can check this yourself with -A_ksp_view -B_ksp_view: >> [?] >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: (A_) 1 MPI process >> type: seqaij >> rows=1664, cols=1664 >> package used to perform factorization: petsc >> total: nonzeros=117760, allocated nonzeros=117760 >> using I-node routines: found 416 nodes, limit used is 5 >> [?] >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. 
>> Factored matrix follows: >> Mat Object: (B_) 1 MPI process >> type: seqaij >> rows=1664, cols=1664 >> package used to perform factorization: petsc >> total: nonzeros=49408, allocated nonzeros=49408 >> not using I-node routines >> >> Check the number of nonzeros of both factored Mat. >> With -B_pc_factor_levels 3, you?ll get roughly similar convergence speed (and density in the factored Mat of both PC). >> >> Thanks, >> Pierre >> >>> >>> To give you more context, we solve a system of partial differential >>> equations that models astrophysical plasmas. It is essentially a system >>> of advection-reaction equations. We use a discontinuous Galerkin (dG) >>> method. Our code relies on the finite element library library deal.ii >>> and its PETSc interface. The system matrices A and B are the result of >>> the (dG) discretisation. We GMRES with a BlockJaboci preconditioner, >>> because we do not know any better. >>> >>> I tested the code I sent with PETSc 3.24.0 and 3.19.1 on my workstation, i.e. >>> Linux home-desktop 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux >>> I use OpenMPI 5.0.8 and I compiled with mpicc, which in my cases use >>> gcc. >>> >>> In case you need more information. Please let me know. >>> Any help is appreciated. >>> >>> Thank you, >>> Nils >>> >>> >>> -- >>> Nils Schween >>> >>> Phone: +49 6221 516 557 >>> Mail: nils.schween at mpi-hd.mpg.de >>> PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 >>> >>> Max Planck Institute for Nuclear Physics >>> Astrophysical Plasma Theory (APT) >>> Saupfercheckweg 1, D-69117 Heidelberg >>> https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!YoOlZjX4v-hbz0Oaawvh2Yy3nCbpHcafn1VjON06Or7f-WVrzGzD9SMcky5YJAyzVu62BIfzC5cpshSkkkpKpA$ > > -- > Nils Schween > PhD Student > > Phone: +49 6221 516 557 > Mail: nils.schween at mpi-hd.mpg.de > PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 > > Max Planck Institute for Nuclear Physics > Astrophysical Plasma Theory (APT) > Saupfercheckweg 1, D-69117 Heidelberg > https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!YoOlZjX4v-hbz0Oaawvh2Yy3nCbpHcafn1VjON06Or7f-WVrzGzD9SMcky5YJAyzVu62BIfzC5cpshSkkkpKpA$ From alexandre.scotto at irt-saintexupery.com Mon Oct 27 04:23:50 2025 From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre) Date: Mon, 27 Oct 2025 09:23:50 +0000 Subject: [petsc-users] Options database in petsc4py Message-ID: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com> Dear PETSc Community, In my developments, I am managing possibly several KSP solvers with options handled by the Options database. During my tests, I encountered the following behavior: Code: options = PETSc.Options("ksp_") options.setValue("atol", 7e-8) options.view() options.clear() options.view() Output: #PETSc Option Table entries: -ksp_atol 7e-08 # (source: code) #End of PETSc Option Table entries #PETSc Option Table entries: -ksp_atol 7e-08 # (source: code) #End of PETSc Option Table entries It seems that the clear() method does not really clear the Option database. To ensure that the several KSP I deal with are set with their own options (without getting options from a KSP previously set), the only way I found was to explicitly call the delValue() method for all the option keys passed: 1. Iterate over a dictionary of options and use setValue(name, value) 2. 
From alexandre.scotto at irt-saintexupery.com Mon Oct 27 04:23:50 2025
From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre)
Date: Mon, 27 Oct 2025 09:23:50 +0000
Subject: [petsc-users] Options database in petsc4py
Message-ID: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>

Dear PETSc Community,

In my developments, I am managing possibly several KSP solvers whose options are handled by the options database. During my tests, I encountered the following behavior:

Code:
options = PETSc.Options("ksp_")
options.setValue("atol", 7e-8)
options.view()

options.clear()
options.view()

Output:
#PETSc Option Table entries:
-ksp_atol 7e-08 # (source: code)
#End of PETSc Option Table entries

#PETSc Option Table entries:
-ksp_atol 7e-08 # (source: code)
#End of PETSc Option Table entries

It seems that the clear() method does not really clear the options database. To ensure that the several KSPs I deal with are each set with their own options (without inheriting options from a previously configured KSP), the only way I found was to explicitly call the delValue() method for all the option keys that were passed:

1. Iterate over a dictionary of options and use setValue(name, value).
2. Set the KSP from the options database: KSP.setFromOptions().
3. Iterate over the keys of the dictionary and use delValue(name) to effectively clear the options database.

Does this seem normal to you, or is there something I am missing?

Regards,
Alexandre Scotto.
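The three-step workaround described above, written out as a small petsc4py sketch (an illustration based on the snippet in the message, not code from the thread; the rtol entry and the missing operators are placeholders):

    from petsc4py import PETSc

    opts = PETSc.Options("ksp_")
    settings = {"atol": 7e-8, "rtol": 1e-6}   # hypothetical per-solver options

    # 1. push the options into the database
    for name, value in settings.items():
        opts.setValue(name, value)

    # 2. configure this solver from the database
    #    (a real solver would call ksp.setOperators(...) before solving)
    ksp = PETSc.KSP().create()
    ksp.setFromOptions()

    # 3. remove the keys again so the next solver does not inherit them,
    #    since Options.clear() is a no-op before the fix discussed below
    for name in settings:
        opts.delValue(name)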
From pierre at joliv.et Mon Oct 27 04:35:49 2025
From: pierre at joliv.et (Pierre Jolivet)
Date: Mon, 27 Oct 2025 10:35:49 +0100
Subject: [petsc-users] Options database in petsc4py
In-Reply-To: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>
References: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>
Message-ID: <7FA72015-5C0B-4C2C-8EA4-4B4A6A05EC89@joliv.et>

I would say this is a bug in petsc4py; the following diff should fix it.
I'm not sure why this if is there, as it's perfectly valid to call PetscOptionsClear(NULL).

Thanks,
Pierre

diff --git a/src/binding/petsc4py/src/petsc4py/PETSc/Options.pyx b/src/binding/petsc4py/src/petsc4py/PETSc/Options.pyx
index 4db3c52f022..8a923a6dd8e 100644
--- a/src/binding/petsc4py/src/petsc4py/PETSc/Options.pyx
+++ b/src/binding/petsc4py/src/petsc4py/PETSc/Options.pyx
@@ -90,7 +90,6 @@ cdef class Options:

     def clear(self) -> Self:
         """Clear an options database."""
-        if self.opt == NULL: return
         CHKERR(PetscOptionsClear(self.opt))
         return self

> On 27 Oct 2025, at 10:23 AM, SCOTTO Alexandre via petsc-users wrote:
>
> [...]

From alexandre.scotto at irt-saintexupery.com Mon Oct 27 04:49:30 2025
From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre)
Date: Mon, 27 Oct 2025 09:49:30 +0000
Subject: [petsc-users] Options database in petsc4py
In-Reply-To: <7FA72015-5C0B-4C2C-8EA4-4B4A6A05EC89@joliv.et>
References: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com> <7FA72015-5C0B-4C2C-8EA4-4B4A6A05EC89@joliv.et>
Message-ID:

If this is a bug, I think the fix you proposed should also be applied to the destroy() method, as it does not clear the database either.

Regards,
Alexandre.

From: Pierre Jolivet
Sent: Monday, 27 October 2025 10:36
To: SCOTTO Alexandre
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Options database in petsc4py

[...]

From pierre at joliv.et Mon Oct 27 05:08:13 2025
From: pierre at joliv.et (Pierre Jolivet)
Date: Mon, 27 Oct 2025 11:08:13 +0100
Subject: [petsc-users] Options database in petsc4py
In-Reply-To:
References: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com> <7FA72015-5C0B-4C2C-8EA4-4B4A6A05EC89@joliv.et>
Message-ID: <44962795-93F1-49CD-86A5-D68AE3F72F29@joliv.et>

> On 27 Oct 2025, at 10:49 AM, SCOTTO Alexandre via petsc-users wrote:
>
> If this is a bug, I think the fix you proposed should also be applied to the destroy() method, as it does not clear the database either.

Yes, it was on my radar, see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8812/diffs__;!!G_uCfscf7eWS!bpR5xoqEEZODT3ldlupKuT0SLPS13FfAmILqjeIji7MVoNK9BGluc0PWAB3HllK1KvjANwLcl90jUg0rveJhRw$
There is still one check of self.opt == NULL that I'm not sure should be there, but it's orthogonal to the issue at hand.

Thanks,
Pierre

> Regards,
> Alexandre.
>
> [...]

From knepley at gmail.com Mon Oct 27 05:26:42 2025
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 27 Oct 2025 11:26:42 +0100
Subject: [petsc-users] Options database in petsc4py
In-Reply-To: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>
References: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>
Message-ID:

On Mon, Oct 27, 2025 at 10:24 AM SCOTTO Alexandre via petsc-users <petsc-users at mcs.anl.gov> wrote:

> [...]
>
> Does this seem normal to you, or is there something I am missing?
>
Thanks for pointing out this bug.

However, I don't think I would manage options this way. We normally give each separate solver a new _prefix_, meaning a string that prefaces all its options. That way they do not collide.

Thanks,

Matt

> Regards,
>
> Alexandre Scotto.
>

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yl5kWH4le9b9AulQe-MHUTGMJnJLSAUWTWCPTgED4vuWOpxiWWocOZMgkQeMjxEXM-FS5bHHbw1ItcybTP-F$
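Matt's prefix suggestion in a minimal petsc4py sketch (the prefixes "pressure_" and "velocity_" and the tolerances are made-up examples): each solver only reads the options that start with its own prefix, so nothing leaks between them.

    from petsc4py import PETSc

    opts = PETSc.Options()
    opts["pressure_ksp_rtol"] = 1e-10   # seen only by the "pressure_" solver
    opts["velocity_ksp_rtol"] = 1e-4    # seen only by the "velocity_" solver

    ksp_p = PETSc.KSP().create()
    ksp_p.setOptionsPrefix("pressure_")
    ksp_p.setFromOptions()

    ksp_v = PETSc.KSP().create()
    ksp_v.setOptionsPrefix("velocity_")
    ksp_v.setFromOptions()

    print(ksp_p.getTolerances())   # rtol is 1e-10
    print(ksp_v.getTolerances())   # rtol is 1e-4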
From alexandre.scotto at irt-saintexupery.com Mon Oct 27 05:48:43 2025
From: alexandre.scotto at irt-saintexupery.com (SCOTTO Alexandre)
Date: Mon, 27 Oct 2025 10:48:43 +0000
Subject: [petsc-users] Options database in petsc4py
In-Reply-To:
References: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>
Message-ID:

Hello Matthew,

I was indeed wondering if there was a clean way to match a given set of options to a specific KSP instance. Does the following make sense:

1. Associate with a given KSP instance a unique string, e.g. "toto", and call ksp_1.setOptionsPrefix("toto")
2. Create an options database with the corresponding prefix: options = Options("toto_ksp_")
3. Then a call to ksp_1.setFromOptions() will only consider entries in the options database starting with "toto"

Regards,
Alexandre.

From: Matthew Knepley
Sent: Monday, 27 October 2025 11:27
To: SCOTTO Alexandre
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Options database in petsc4py

[...]

From knepley at gmail.com Mon Oct 27 12:54:40 2025
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 27 Oct 2025 18:54:40 +0100
Subject: [petsc-users] Options database in petsc4py
In-Reply-To:
References: <3fc195d369474a259369d3a87c4f2f5a@irt-saintexupery.com>
Message-ID:

On Mon, Oct 27, 2025 at 11:48 AM SCOTTO Alexandre <alexandre.scotto at irt-saintexupery.com> wrote:

> Hello Matthew,
>
> I was indeed wondering if there was a clean way to match a given set of
> options to a specific KSP instance. Does the following make sense:
>
> 1. Associate with a given KSP instance a unique string, e.g. "toto", and call ksp_1.setOptionsPrefix("toto")
> 2. Create an options database with the corresponding prefix: options = Options("toto_ksp_")
> 3. Then a call to ksp_1.setFromOptions() will only consider entries in the options database starting with "toto"
>

Yes, that is how it works, although you would need ksp_1.setOptionsPrefix("toto_").

Thanks,

Matt

> Regards,
>
> Alexandre.
>
> [...]

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YMkD_E3cPBzKrGzS-WVqigVxG6DXZlQPuvqu6tLu_iC00xlFN2GbMAeTOK-jzVRPXE3L68ICKFUq6KCdZrvu$
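The exchange above, spelled out as a short petsc4py sketch (the prefix "toto_" and the 7e-8 tolerance come from the thread; everything else is illustrative): the Options prefix already contains the solver prefix followed by "ksp_", while the KSP itself gets only "toto_", with the trailing underscore Matt points out.

    from petsc4py import PETSc

    # Options object whose prefix is the solver prefix followed by "ksp_"
    opts = PETSc.Options("toto_ksp_")
    opts["atol"] = 7e-8              # stored as -toto_ksp_atol in the database

    ksp_1 = PETSc.KSP().create()
    ksp_1.setOptionsPrefix("toto_")  # note the trailing underscore
    ksp_1.setFromOptions()           # reads only options starting with "toto_"

    print(ksp_1.getTolerances())     # atol is now 7e-8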
From curfman at anl.gov Tue Oct 28 12:34:10 2025
From: curfman at anl.gov (McInnes, Lois Curfman)
Date: Tue, 28 Oct 2025 17:34:10 +0000
Subject: [petsc-users] Call for Survey Participants: Exploring Teamwork and Software in Scientific Computing
Message-ID: <170E34C1-1BF9-4C1C-8903-4D616852D0FF@anl.gov>

Hello PETSc users,

Our research team is conducting survey research to understand perceptions of teamwork, AI, and software in scientific computing. As a member of the scientific computing community, your perspective is invaluable. We are reaching out to invite you to share your point of view on teamwork and software in scientific computing via an online survey. You can complete this survey at a time and place of your choosing. Participation in this research will take approximately 15 minutes.

If you are interested in contributing your perspectives to this research, use the following link to access the survey: https://urldefense.us/v3/__https://umt.co1.qualtrics.com/jfe/form/SV_6uicew8JJmavimO__;!!G_uCfscf7eWS!dW1im6s9VYYiogwjjee069HxGYrE424FHdwsLEo72qB_vm6SPJwuSjFV1pBrHfcQx88-2MSwQLDvRHHweuQpoEY$ .

Please reach out to our research team with any questions you may have pertaining to this study. Feel free to share this call for participants with your network. The survey is open to all adults who work in scientific computing. This survey will remain open until mid-December.

Thank you very much for your time and consideration,

Research Team
Olivia B. Newton, University of Montana (olivia.newton at umt.edu)
Lois Curfman McInnes, Argonne National Laboratory (curfman at anl.gov)
Anshu Dubey, Argonne National Laboratory (adubey at anl.gov)
Denice Ward Hood, University of Illinois (dwhood at illinois.edu)