<div dir="ltr"><div dir="ltr">On Tue, Jun 27, 2023 at 2:56 PM Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov">marcos.vanella@nist.gov</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg539665228208736377">
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Thank you Matt. I'll try the flags you recommend for monitoring. Correct, I'm trying to see if GPU would provide an advantage for this particular Poisson solution we do in our code.
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Our grids are staggered with the Poisson unknown in cell centers. All my tests for single mesh runs with 100K to 200K meshes show MKL PARDISO as the faster option for these meshes considering the mesh as unstructured (an implementation separate from the PETSc
option). We have the option of Fishpack (fast trigonometric solvers), but that is not as general (requires solution on the whole mesh + a special treatment of immersed geometry). The single mesh solver is used as a black box within a fixed point domain decomposition
iteration in multi-mesh cases. The approximation error in this method is confined to the mesh boundaries.
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
The other option I have tried with MKL is to build the global matrix across all meshes and use the MKL cluster sparse solver. The problem becomes a memory one for meshes that go over a couple million unknowns due to the exact Cholesky factorization matrix storage.
I'm thinking the other possibility using PETSc is to build in parallel the global matrix (as done for the MKL global solver) and try the GPU accelerated Krylov + multigrid preconditioner. If this can bring down the time to solution to what we get for the previous
scheme and keep memory use under control, it would be a good option for CPU+GPU systems. Thing is we need to bring the residual of the equation to ~10^-10 or less to avoid instability so it might still be costly.</div></div></div></blockquote><div><br></div><div>Yes, this is definitely the option I would try. First, I would just use AMG (GAMG, Hypre, ML). If those work,</div><div>you can speed up the setup time and bring down memory somewhat with GMG. Since your grid is Cartesian, you could use DMDA to do this easily.</div><div><br></div><div>  Thanks,</div><div><br></div><div>     Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg539665228208736377"><div dir="ltr">
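<div>Editorial sketch of the memory concern about exact Cholesky storage: the script below factors the 7-point Laplacian on a tiny grid and counts how many nonzeros the factor gains over the matrix. It assumes unit spacing, homogeneous Dirichlet boundaries, and the natural (lexicographic) ordering, none of which come from the thread. PARDISO and CHOLMOD apply fill-reducing orderings, so only the trend is meaningful here, not the absolute numbers. All function names are hypothetical.</div>

```python
import math


def laplacian_dense(n):
    """Dense 7-point Laplacian on an n^3 grid (assumed: Dirichlet, unit spacing)."""
    size = n ** 3
    a = [[0.0] * size for _ in range(size)]
    for k in range(n):
        for j in range(n):
            for i in range(n):
                p = i + n * (j + n * k)  # lexicographic numbering
                a[p][p] = 6.0
                if i + 1 < n:
                    a[p][p + 1] = a[p + 1][p] = -1.0
                if j + 1 < n:
                    a[p][p + n] = a[p + n][p] = -1.0
                if k + 1 < n:
                    a[p][p + n * n] = a[p + n * n][p] = -1.0
    return a


def cholesky_lower(a):
    """Textbook dense Cholesky: returns lower-triangular L with A = L L^T."""
    size = len(a)
    low = [[0.0] * size for _ in range(size)]
    for j in range(size):
        d = a[j][j] - sum(low[j][t] ** 2 for t in range(j))
        low[j][j] = math.sqrt(d)
        for i in range(j + 1, size):
            s = a[i][j] - sum(low[i][t] * low[j][t] for t in range(j))
            low[i][j] = s / low[j][j]
    return low


def count_lower_nonzeros(m, tol=1e-14):
    """Count entries of the lower triangle whose magnitude exceeds tol."""
    return sum(1 for i in range(len(m)) for j in range(i + 1) if abs(m[i][j]) > tol)


if __name__ == "__main__":
    for n in (3, 4, 5):
        a = laplacian_dense(n)
        nnz_a = count_lower_nonzeros(a)
        nnz_l = count_lower_nonzeros(cholesky_lower(a))
        print(f"n={n}: nnz(tril(A))={nnz_a}, nnz(L)={nnz_l}, ratio={nnz_l / nnz_a:.2f}")
```

<div>The ratio grows with n, which is the storage pressure a Krylov method with a multigrid preconditioner avoids by never forming the factor.</div>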
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I'll keep you updated. Thanks,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Marcos<br>
</div>
<div id="m_539665228208736377appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="m_539665228208736377divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>><br>
<b>Sent:</b> Tuesday, June 27, 2023 2:08 PM<br>
<b>To:</b> Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov" target="_blank">marcos.vanella@nist.gov</a>><br>
<b>Cc:</b> Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>>; <a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] SOLVE + PC combination for 7 point stencil (unstructured) poisson solution</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div dir="ltr">On Tue, Jun 27, 2023 at 11:23 AM Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov" target="_blank">marcos.vanella@nist.gov</a>> wrote:<br>
</div>
<div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Hi Mark and Matt, I tried swapping the preconditioner to cholmod and also to hypre BoomerAMG. They work just fine for my case. I also got my hands on a machine with NVIDIA GPUs in one of our AI clusters. I compiled PETSc to make use of cuda and cuda-enabled
openmpi (with gcc).</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I'm running the previous tests and want to also check some of the cuda enabled solvers. I was able to submit a case for the default Krylov solver with these runtime flags: -vec_type seqcuda -mat_type seqaijcusparse -pc_type cholesky -pc_factor_mat_solver_type
cusparse. The case ran to completion.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I guess my question now is how do I monitor (if there is a way) that the GPU is being used in the calculation, and any other stats?</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>You should get that automatically with</div>
<div><br>
</div>
<div> -log_view</div>
<div><br>
</div>
<div>If you want finer-grained profiling of the kernels, you can use</div>
<div><br>
</div>
<div> -log_view_gpu_time</div>
<div><br>
</div>
<div>but it can slow things down.</div>
<div> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Also, which other solver combination using GPU would you recommend for me to try? Can we compile PETSc with the cuda enabled version for CHOLMOD and HYPRE?</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Hypre has GPU support, but CHOLMOD does not. There are no rules of thumb right now for GPUs. It depends on what card you have, what version of the driver, what version of the libraries, etc. It is very fragile. Hopefully this period ends soon, but I am not optimistic.
Unless you are very confident that GPUs will help,</div>
<div>I would not recommend spending the time.</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Thank you for your help!</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Marcos <br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div id="m_539665228208736377x_m_-3817469500869486181appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="m_539665228208736377x_m_-3817469500869486181divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>><br>
<b>Sent:</b> Monday, June 26, 2023 12:11 PM<br>
<b>To:</b> Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov" target="_blank">marcos.vanella@nist.gov</a>><br>
<b>Cc:</b> Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>>;
<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] SOLVE + PC combination for 7 point stencil (unstructured) poisson solution</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div dir="ltr">On Mon, Jun 26, 2023 at 12:08 PM Vanella, Marcos (Fed) via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>
</div>
<div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Thank you Matt and Mark, I'll try your suggestions. To configure with hypre can I just use the --download-hypre configure line?</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Yes,</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
That is what I did with suitesparse, very nice.<br>
</div>
<div id="m_539665228208736377x_m_-3817469500869486181x_m_-5320235529926307471appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="m_539665228208736377x_m_-3817469500869486181x_m_-5320235529926307471divRplyFwdMsg" dir="ltr">
<font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>><br>
<b>Sent:</b> Monday, June 26, 2023 12:05 PM<br>
<b>To:</b> Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov" target="_blank">marcos.vanella@nist.gov</a>><br>
<b>Cc:</b> <a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] SOLVE + PC combination for 7 point stencil (unstructured) poisson solution</font>
<div> </div>
</div>
<div>
<div dir="ltr">I'm not sure what MG is doing with an "unstructured" problem. I assume you are not using DMDA.
<div>-pc_type gamg should work <br>
</div>
<div>I would configure with hypre and try that also: -pc_type hypre</div>
<div><br>
</div>
<div>As Matt said MG should be faster. How many iterations was it taking?</div>
<div>Try a 100^3 mesh and check that the iteration count does not change much, if at all.</div>
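<div>Editorial sketch of why the iteration count is worth checking: the script below runs plain, unpreconditioned conjugate gradients (not PETSc, and not multigrid) on the 7-point Laplacian for a few small grids and reports the iterations needed to reach a relative residual of 1e-10. The Dirichlet boundaries, unit spacing, right-hand side, and function names are assumptions made for illustration. Without a multigrid preconditioner the count is expected to rise as the grid is refined; with GAMG or hypre it should stay roughly flat.</div>

```python
import math


def apply_laplacian(x, n):
    """Return A x for the 7-point Laplacian on an n^3 grid (assumed Dirichlet)."""
    y = [0.0] * (n ** 3)
    for k in range(n):
        for j in range(n):
            for i in range(n):
                p = i + n * (j + n * k)
                s = 6.0 * x[p]
                if i > 0:
                    s -= x[p - 1]
                if i + 1 < n:
                    s -= x[p + 1]
                if j > 0:
                    s -= x[p - n]
                if j + 1 < n:
                    s -= x[p + n]
                if k > 0:
                    s -= x[p - n * n]
                if k + 1 < n:
                    s -= x[p + n * n]
                y[p] = s
    return y


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def cg(n, b, rtol=1e-10, max_it=1000):
    """Unpreconditioned CG; stops when ||r|| is at most rtol * ||b||."""
    x = [0.0] * len(b)
    r = list(b)  # residual for the zero initial guess
    p = list(r)
    rr = dot(r, r)
    stop = (rtol ** 2) * rr  # compare squared norms to avoid square roots
    its = 0
    while rr > stop and its < max_it:
        ap = apply_laplacian(p, n)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        its += 1
    return x, its


def make_rhs(n):
    """Arbitrary non-constant right-hand side (an illustrative choice)."""
    return [1.0 + (q % 7) for q in range(n ** 3)]


if __name__ == "__main__":
    for n in (4, 8, 12):
        _, its = cg(n, make_rhs(n))
        print(f"n={n}: {n ** 3} unknowns, {its} CG iterations")
```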
<div><br>
</div>
<div>Mark</div>
<div><br>
</div>
</div>
<br>
<div>
<div dir="ltr">On Mon, Jun 26, 2023 at 11:35 AM Vanella, Marcos (Fed) via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Hi, I was wondering if anyone has experience with which solver and preconditioner combinations are most efficient for solving a Poisson problem derived from a 7 point stencil on a single mesh (serial).</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I've been doing some tests of multigrid and cholesky on a 50^3 mesh. <b>-pc_type mg</b> takes about 75% more time than
<b>-pc_type cholesky -pc_factor_mat_solver_type cholmod</b> for the case I'm testing.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I'm new to PETSc so any suggestions are most welcome and appreciated,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Marcos<br>
</div>
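<div>Editorial sketch of the kind of system discussed in this message: the 7-point stencil on an n^3 grid of cell-centered unknowns, assembled as coordinate triplets. Unit spacing, homogeneous Dirichlet boundaries, and the lexicographic numbering are assumptions for illustration, and the function name is hypothetical; the actual code in question may use different boundary conditions.</div>

```python
def assemble_poisson_7pt(n):
    """Return (rows, cols, vals) triplets for the 7-point Laplacian on an n^3 grid.

    Assumed for illustration: unit spacing, homogeneous Dirichlet boundaries,
    unknowns numbered lexicographically as idx = i + n * (j + n * k).
    """
    rows, cols, vals = [], [], []
    offsets = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    for k in range(n):
        for j in range(n):
            for i in range(n):
                row = i + n * (j + n * k)
                # Diagonal entry: one contribution per face of the cell.
                rows.append(row)
                cols.append(row)
                vals.append(6.0)
                # Off-diagonal entries: only neighbors that lie inside the grid.
                for di, dj, dk in offsets:
                    ii, jj, kk = i + di, j + dj, k + dk
                    if 0 <= ii < n and 0 <= jj < n and 0 <= kk < n:
                        rows.append(row)
                        cols.append(ii + n * (jj + n * kk))
                        vals.append(-1.0)
    return rows, cols, vals


if __name__ == "__main__":
    n = 10
    rows, cols, vals = assemble_poisson_7pt(n)
    print(f"{n}^3 grid: {n ** 3} unknowns, {len(vals)} nonzeros")
```

<div>The nonzero count is n^3 + 6 n^2 (n - 1), so the 50^3 mesh mentioned above has 125,000 unknowns and 860,000 nonzeros under these assumptions.</div>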
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
<span>-- </span><br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div></blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>