<div dir="ltr"><div>I don't think bjacobi is working on GPUs. I know Dominic made a pull request a few months ago, but I don't know if it's been integrated into next.<br></div>-Paul<br></div><div class="gmail_extra">

<br><br><div class="gmail_quote">On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley <span dir="ltr">&lt;<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div class="">On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong <span dir="ltr">&lt;<a href="mailto:jon.the.wong@gmail.com" target="_blank">jon.the.wong@gmail.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Thanks for the input. To clarify, I'm trying to compare GPU algorithms to PETSc, and they only have cg/jacobi for what I'm comparing at the moment. This is why I'm not using gmres (which also works well). <br>


<br>I can solve the problem with the GPU (custom code) using CG + jacobi for all the meshes. On the CPU side, I can solve everything with cg/bjacobi and almost all of my meshes with cg/jacobi, except for my 50k-node mesh. I can solve the problem with my finite element code's built-in direct solver (it just takes a while) on one processor. I've been reading that by default the bjacobi pc uses one block per processor, so I had assumed that on one processor block-jacobi and jacobi would give similar results. cg+bjacobi works fine. cg+jacobi does not.<br>


</div></div></blockquote><div><br></div></div><div>"Jacobi" means preconditioning by the inverse of the diagonal of the matrix. Block-Jacobi means using a preconditioner</div><div>formed from each of the blocks, in this case 1 block. By default the inner preconditioner is ILU(0), not jacobi. You can</div>


<div>make them equivalent using -sub_pc_type jacobi.</div><div><br></div><div>   Matt</div><div class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">

<div>

</div>I'll look into the preconditioner code and use KSPView to try to figure out what the differences are on one processor. I'm not sure why the GPU can consistently solve the problem with cg/jacobi. I'm assuming this is due to differences in round-off or in the order of operations between the two.<br>


</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, May 19, 2014 at 6:35 AM, Jed Brown <span dir="ltr">&lt;<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div>Matthew Knepley &lt;<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>&gt; writes:<br>

&gt; No, Block-Jacobi and Jacobi are completely different. If you are not<br>

&gt; positive definite, you should be using MINRES.<br>

<br>

</div>MINRES requires an SPD preconditioner.<br>

</blockquote></div><br></div>

</blockquote></div></div><br><br clear="all"><div class=""><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>


-- Norbert Wiener

</div></div></div>

</blockquote></div><br></div>
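<div dir="ltr"><br>To make Matt's distinction concrete, here is a minimal sketch in plain Python (not PETSc code) that applies both preconditioners to the same residual on a single block. The function names and the 3x3 tridiagonal matrix are made up for illustration. For a tridiagonal matrix ILU(0) happens to coincide with an exact LU solve, so the gap between the two results is about as large as it can be; on a real finite element matrix ILU(0) is only approximate, but it still uses off-diagonal coupling that plain Jacobi ignores.<br></div>
<pre>
```python
# Illustrative sketch only: dense lists stand in for a sparse matrix.
# All names here are hypothetical and are not PETSc APIs.

def jacobi_apply(A, r):
    """Pointwise Jacobi: divide each residual entry by the matrix diagonal."""
    return [r[i] / A[i][i] for i in range(len(r))]


def ilu0_factor(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A."""
    n = len(A)
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if A[i][k] == 0.0:
                continue  # outside the sparsity pattern, no multiplier stored
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                if A[i][j] != 0.0:  # update only entries already nonzero in A
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU


def ilu0_apply(LU, r):
    """Solve L U z = r, with unit lower L and upper U packed together in LU."""
    n = len(r)
    y = r[:]
    for i in range(n):  # forward substitution with unit-diagonal L
        for k in range(i):
            y[i] -= LU[i][k] * y[k]
    z = y[:]
    for i in reversed(range(n)):  # backward substitution with U
        for j in range(i + 1, n):
            z[i] -= LU[i][j] * z[j]
        z[i] /= LU[i][i]
    return z


# Made-up SPD tridiagonal test matrix and residual.
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
r = [1.0, 2.0, 3.0]

z_jacobi = jacobi_apply(A, r)              # what -pc_type jacobi applies
z_ilu0 = ilu0_apply(ilu0_factor(A), r)     # one block with an ILU(0) inner solve

print("jacobi          :", z_jacobi)
print("one block ILU(0):", z_ilu0)
```
</pre>
<div dir="ltr">The two printed vectors differ, which is consistent with cg+bjacobi and cg+jacobi behaving differently even on one processor. Per Matt's note, adding -sub_pc_type jacobi to the bjacobi run should make the two configurations equivalent, so that is probably the fairest CPU baseline for the GPU comparison.<br></div>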