<div dir="ltr"><div dir="ltr">FYI, I seem to have PETSc working with GPUs on the new machine at ORNL (summitdev). That is good enough for now.</div><div dir="ltr">Thanks,<br><div><br></div><div><div>14:00 master= ~/petsc/src/snes/examples/tutorials$ jsrun -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type none -ksp_type fgmres -snes_monitor_short -snes_rtol 1.e-5 -ksp_view</div><div>lid velocity = 0.0625, prandtl # = 1., grashof # = 1.</div><div> 0 SNES Function norm 0.239155 </div><div>KSP Object: 1 MPI processes</div><div> type: fgmres</div><div> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div> happy breakdown tolerance 1e-30</div><div> maximum iterations=10000, initial guess is zero</div><div> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.</div><div> right preconditioning</div><div> using UNPRECONDITIONED norm type for convergence test</div><div>PC Object: 1 MPI processes</div><div> type: none</div><div> linear system matrix = precond matrix:</div><div> Mat Object: 1 MPI processes</div><div> type: seqaijcusparse</div><div> rows=64, cols=64, bs=4</div><div> total: nonzeros=1024, allocated nonzeros=1024</div><div> total number of mallocs used during MatSetValues calls =0</div><div> using I-node routines: found 16 nodes, limit used is 5</div><div> 1 SNES Function norm 6.82338e-05 </div><div>KSP Object: 1 MPI processes</div><div> type: fgmres</div><div> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div> happy breakdown tolerance 1e-30</div><div> maximum iterations=10000, initial guess is zero</div><div> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.</div><div> right preconditioning</div><div> using UNPRECONDITIONED norm type for convergence test</div><div>PC Object: 1 MPI processes</div><div> type: none</div><div> linear system matrix = precond matrix:</div><div> Mat Object: 1 MPI processes</div><div> type: 
seqaijcusparse</div><div> rows=64, cols=64, bs=4</div><div> total: nonzeros=1024, allocated nonzeros=1024</div><div> total number of mallocs used during MatSetValues calls =0</div><div> using I-node routines: found 16 nodes, limit used is 5</div><div> 2 SNES Function norm 3.346e-10 </div><div>Number of SNES iterations = 2</div><div>14:01 master= ~/petsc/src/snes/examples/tutorials$ </div></div><div><br></div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Nov 1, 2018 at 9:33 AM Mark Adams <<a href="mailto:mfadams@lbl.gov">mfadams@lbl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Wed, Oct 31, 2018 at 12:30 PM Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Wed, Oct 31, 2018 at 6:59 AM Karl Rupp <<a href="mailto:rupp@iue.tuwien.ac.at" target="_blank">rupp@iue.tuwien.ac.at</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Mark,<br>
<br>
ah, I was confused by the Python information at the beginning of <br>
configure.log. So it is picking up the correct compiler.<br>
<br>
Have you tried uncommenting the check for GNU?<br></blockquote></div></div></blockquote><div><br></div><div>Yes, but I am getting an error that the CUDA files cannot find mpi.h.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"></blockquote><div><br></div><div>I'm getting a make error.</div><div><br></div><div>Thanks, </div></div></div>
</blockquote></div></div>
</blockquote></div>