[petsc-users] Block preconditioning for 3d problem

Smith, Barry F. bsmith at mcs.anl.gov
Mon Oct 14 00:07:45 CDT 2019


  Thanks. It seems that the iterative method is solved the entire linear system very accurately. For the last linear solve the initial true residual norm is around 10^6 (i.e. the norm of b) and the final one around 10^-10

  The initial  true residual norm on each block (part of a slab on one process) is around 10^-1 so I don't see a huge difference in scaling between the blocks. The final true residual for blocks is around 10^-17 so each block is well solved.

   What do you mean by "quasi-Newton"? Are you doing any kind of line search or just using the full step from the linear solve? 

   When you solve with the single KSP over all slabs the convergence is determined by ||r_all||/||b_all|| < tol while slab by slab it is ||r_slab||/||b_slab|| < tol for each slab. If the slabs had hugely different scalings this could result in different "answers" but since the slabs seem to have very similar scalings the "answers" should be very similar. 

   I would do the following. Solve the first linear system both ways and then compute the difference in the solution to the linear systems computed both ways. Relative to ||b|| the norm of the difference of the two linear systems solutions should be 10^-15 indicating both approaches produced almost identical answers. If this is the case then the problem is elsewhere, not in the solution of the linear system. On the other hand if the difference is large then likely the problem is in the formulation of the linear system, perhaps the combined linear system is not actually the individual linear systems.

  Barry

By the way, using a block Jacobi with Jacobi as the preconditioner on each block  and a really tight tolerance for each block is not a practical method for larger problems, though I think it is ok for this test.

   



> On Oct 13, 2019, at 6:20 AM, Dave Lee <davelee2804 at gmail.com> wrote:
> 
> Hi Barry,
> 
> I've answered your questions as follows:
> 
>  So each slab lives on all the processes?
> 
> Yes, that's correct.
> 
> How many processes and how many slabs are you running with?
> 
> For my test problem there are 6 processors and 8 slabs
> 
> There will be num_slabs blocks per processor times the number of processors. Depending on the layout of the unknowns each block will consist of a entire local slap or each block will consist of pieces from each slap.
> 
> The internal slab DOF is the minor index. Each processor is responsible for 144 DOFs per slab, so each processor is responsible for 8x144 DOFs. When I run with -ksp_view there are 8 blocks on each processor (48 total), each with 144 DOF's, so I'm assuming everything is ok there.
> 
>  Please run the linear solver with
> 
>     -ksp_monitor_true_residual -sub_ksp_monitor_true_residual -ksp_converged_reason -sub_ksp_converged_reason
> 
> 
>   Do you get huge iteration counts somewhere?
> 
> Not that I can see, file is attached (log.gmres)
> 
> Next run the same thing but add -ksp_type fgmres does that change anything?
> 
> I don't think so. The convergence is still the same, file is attached (log.fgmres).
> 
> One thing I've noticed - I've been dumping the index of the slab with the maximum normalised error: norm_L2(delta_x) / norm_L2(x). If I solve for each slab separately, then the maximum normalised errors are typically in slab 3 (of 0-7). This makes sense as this is probably the most dynamically active layer of the atmosphere. However if I solve for all the slabs in a single solve (as is the cases for these tests), then the greatest error is always in slab 7, the top of the atmosphere. Obviously the pressure is lowest in this level, so I'm wondering if the GMRES iterations are just weighting for the lower layers of the atmosphere too much.
> 
> Cheers, Dave.
> 
> On Sun, Oct 13, 2019 at 3:42 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> 
> 
> > On Oct 10, 2019, at 2:58 AM, Dave Lee via petsc-users <petsc-users at mcs.anl.gov> wrote:
> > 
> > Hi PETSc,
> > 
> > I have a nonlinear 3D problem for a set of uncoupled 2D slabs. (Which I ultimately want to couple once this problem is solved).
> > 
> > When I solve the inner linear problem for each of these 2D slabs individually (using KSPGMRES) the convergence of the outer nonlinear problem is good. However when I solve the inner linear problem as a single 3D problem (with no coupling between the 2D slabs, so the matrix is effectively a set of uncoupled blocks, one for each 2D slab) the outer nonlinear convergence degrades dramatically.
> > 
> > Note that I am not using SNES, just my own quasi-Newton approach for the outer nonlinear problem.
> > 
> > I suspect that the way to recover the convergence for the 3D coupled problem is to use some sort of PCBJACOBI or PCFIELDSPLIT preconditioner, but I'm not entirely sure. I've tried following this example:
> > https://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex7.c.html
> > but without any improvement in the convergence. 
> > 
> > On each processor the slab index is the major index and the internal slab DOF index is the minor index. The parallel decomposition is within the 2D slab dimensions only, not between slabs.
> 
>   So each slab lives on all the processes?
> 
> > 
> > I'm configuring the 3D inner linear solver as
> 
>   How many processes and how many slabs are you running with? 
> 
> > 
> > KSPGetPC(ksp, &pc);
> > PCSetType(pc, PCBJACOBI);
> > PCBJacobiSetLocalBlocks(pc, num_slabs, NULL);
> 
>     There will be num_slabs blocks per processor times the number of processors. Depending on the layout of the unknowns each block will consist of a entire local slap or each block will consist of pieces from each slap. 
> 
> > KSPSetUp(ksp);
> > PCBJacobiGetSubKSP(pc, &nlocal, &first_local, &subksp);
> > for(int ii = 0; ii < nlocal; ii++) {
> >     KSPGetPC(subksp[ii], &subpc);
> >     PCSetType(subpc, PCJACOBI);
> >     KSPSetType(subksp[ii], KSPGMRES);
> >     KSPSetTolerances(subksp[ii], 1.e-12, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);
> > }
> 
>   Please run the linear solver with 
> 
>     -ksp_monitor_true_residual -sub_ksp_monitor_true_residual -ksp_converged_reason -sub_ksp_converged_reason 
> 
> 
>   Do you get huge iteration counts somewhere? 
> 
>   Next run the same thing but add -ksp_type fgmres does that change anything?
> 
>   Send all the results in an attached file.
> 
> Barry
> 
> 
> 
> > 
> > However this does not seem to change the outer nonlinear convergence. When I run with ksp_view the local block sizes look correct - number of local 2D slab DOFs for each block on each processor.
> > 
> > Any ideas on what sort of KSP/PC configuration I should be using to recover the convergence of the uncoupled 2D slab linear solve for this 3D linear solve? Should I be rethinking my node indexing to help with this?
> > 
> > Cheers, Dave.
> 
> <log.fgmres><log.gmres>



More information about the petsc-users mailing list