<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jan 9, 2021 at 5:14 PM Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank">bsmith@petsc.dev</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
>   If it is non-overlapping, do you mean block Jacobi with multiple blocks
> per MPI rank? (One block per rank is trivial and should work now.)

Yes, only one MPI rank.
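
For concreteness, here is a minimal sketch of the setup I mean (the calls are
standard PETSc API, but the block count of 4 and the assembled A, b, x are
just placeholders):

  Mat A;        /* assembled elsewhere */
  Vec b, x;     /* assembled elsewhere */
  KSP ksp;
  PC  pc;
  KSPCreate(PETSC_COMM_SELF, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCBJACOBI);
  PCBJacobiSetTotalBlocks(pc, 4, NULL);  /* NULL: let PETSc size the blocks */
  KSPSetFromOptions(ksp);                /* or: -pc_type bjacobi -pc_bjacobi_blocks 4 */
  KSPSolve(ksp, b, x);
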
>   If you mean block Jacobi with multiple blocks per MPI rank, you should
> start with PCApply_BJacobi_Multiblock(). It monkeys with pointers into the
> vector and then calls KSPSolve() for each block. So you just need a
> non-blocking KSPSolve(). What KSP and what PC do you want to use per block?

SuperLU

>   If you want to use LU then I think you proceed largely as I said a few
> days ago. All the routines in MatSolve_SeqAIJCUSPARSE can be made
> non-blocking as discussed, with each one using its own stream (supplied
> with a hack, or with approaches from Junchao and Jacob in the future
> possibly; but a hack is all that is needed for this trivial case).

I'm not sure what goes into this hack.

I am using fieldsplit, and I now see that I don't specify asm, just lu, so I
guess PCApply_FieldSplit is the target:

  KSP Object: 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: 1 MPI processes
    type: fieldsplit
      FieldSplit with ADDITIVE composition: total splits = 2
      Solver info for each split is in the following KSP objects:
    Split number 0 Defined by IS
    KSP Object: (fieldsplit_e_) 1 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (fieldsplit_e_) 1 MPI processes
      type: lu
        out-of-place factorization
        tolerance for zero pivot 2.22045e-14
        matrix ordering: nd
        factor fill ratio given 5., needed 1.30805
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=448, cols=448
            package used to perform factorization: petsc
            total: nonzeros=14038, allocated nonzeros=14038
              using I-node routines: found 175 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: (fieldsplit_e_) 1 MPI processes
        type: seqaij
        rows=448, cols=448
        total: nonzeros=10732, allocated nonzeros=10732
        total number of mallocs used during MatSetValues calls=0
          using I-node routines: found 197 nodes, limit used is 5
    Split number 1 Defined by IS
    KSP Object: (fieldsplit_i1_) 1 MPI processes
    ....
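
Since the -ksp_view above shows "package used to perform factorization:
petsc", I think I still need to switch the split to SuperLU. My understanding
(a sketch, assuming PETSc was configured with SuperLU, and that pc is the
fieldsplit PC after PCSetUp() has run so the sub-KSPs exist):

  KSP     *subksp;
  PC       subpc;
  PetscInt nsplits;
  PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);   /* nsplits should be 2 here */
  KSPGetPC(subksp[0], &subpc);                    /* split 0 is fieldsplit_e_ */
  PCSetType(subpc, PCLU);
  PCFactorSetMatSolverType(subpc, MATSOLVERSUPERLU);
  PetscFree(subksp);                              /* caller frees the array */

or, equivalently, -fieldsplit_e_pc_factor_mat_solver_type superlu on the
command line.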
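
And to check my understanding of the hack itself: the idea, as I read it, is
one CUDA stream per block, so the triangular-solve kernels can be enqueued
without serializing on the default stream. The cudaStreamCreate() and
cusparseSetStream() calls below are real CUDA/cuSPARSE API, but everything
wiring them into the PETSc MatSolve internals (including the handle variable)
is hypothetical:

  /* Conceptual sketch only, not actual PETSc source. */
  cudaStream_t stream[2];                         /* one per split/block */
  for (int i = 0; i < 2; i++) cudaStreamCreate(&stream[i]);
  for (int i = 0; i < 2; i++) {
    cusparseSetStream(cusparsehandle, stream[i]); /* route block i's kernels */
    /* ... enqueue block i's lower/upper triangular solves (non-blocking) ... */
  }
  cudaDeviceSynchronize();                        /* one wait once all blocks are enqueued */

Is that roughly the shape of it?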