<div dir="ltr">Yea, Stefano mentioned this and I would also like to see this not be a fatal error.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Nov 15, 2021 at 9:26 AM Jacob Faibussowitsch <<a href="mailto:jacob.fai@gmail.com">jacob.fai@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;"><blockquote type="cite"><div dir="ltr" style="font-family:Menlo-Regular"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">> [0]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI.<br>> [0]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To not see the message again, add the option to your .petscrc, OR add it to the env var PETSC_OPTIONS.<br>> [0]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu.<br>> [0]PETSC ERROR: For OpenMPI, you need to configure it --with-cuda (<a href="https://www.open-mpi.org/faq/?category=buildcuda" target="_blank">https://www.open-mpi.org/faq/?category=buildcuda</a>)<br>> [0]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (<a href="http://mvapich.cse.ohio-state.edu/userguide/gdr/" target="_blank">http://mvapich.cse.ohio-state.edu/userguide/gdr/</a>)<br>> [0]PETSC ERROR: For Cray-MPICH, you need to set MPICH_RDMA_ENABLED_CUDA=1 (<a href="https://www.olcf.ornl.gov/tutorials/gpudirect-mpich-enabled-cuda/" target="_blank">https://www.olcf.ornl.gov/tutorials/gpudirect-mpich-enabled-cuda/</a>)</div></blockquote></div></div></blockquote><div><br></div>You seem to also be tripping up the gpu aware mpi checker. IIRC we discussed removing this at some point? I think Stefano mentioned we now do this check at configure time?<div><br></div><div><div>
<div dir="auto" style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><div dir="auto" style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><div>Best regards,<br><br>Jacob Faibussowitsch<br>(Jacob Fai - booss - oh - vitch)<br></div></div></div>
<div><br><blockquote type="cite"><div>On Nov 13, 2021, at 22:57, Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>> wrote:</div><br><div><div dir="ltr" style="font-family:Menlo-Regular;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><div dir="ltr"><br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Nov 13, 2021 at 2:24 PM Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>I have a user that wants CUDA + Hypre on Sumit and they want to use OpenMP in their code. I configured with openmp but without thread safety and got this error.</div><div><br></div><div>Maybe there is no need for us to do anything with omp in our configuration. Not sure.</div><div><br></div>15:08 main= summit:/gpfs/alpine/csc314/scratch/adams/petsc$ make PETSC_DIR=/gpfs/alpine/world-shared/geo127/petsc/arch-opt-gcc9.1.0-omp-cuda11.0.3 PETSC_ARCH="" check<br>Running check examples to verify correct installation<br>Using PETSC_DIR=/gpfs/alpine/world-shared/geo127/petsc/arch-opt-gcc9.1.0-omp-cuda11.0.3 and PETSC_ARCH=<br>C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process<br>Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes<br>See<span> </span><a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html</a><br>[1] (280696) Warning: Could not find key lid0:0:2 in cache <=========================<br>[1] (280696) Warning: Could not find key qpn0:0:0:2 in cache <=========================<br>Unable to connect queue-pairs<br>[h37n08:280696] Error: common_pami.c:1094 - ompi_common_pami_init() 1: Unable to create 1 PAMI communication context(s) rc=1<br></div></blockquote><div>I don't know what petsc's thread safety is. But this error seems to be in the environment. 
>> You can report it to OLCF help.
>>
>>> --------------------------------------------------------------------------
>>> No components were able to be opened in the pml framework.
>>>
>>> This typically means that either no components of this type were
>>> installed, or none of the installed components can be loaded.
>>> Sometimes this means that shared libraries required by these
>>> components are unable to be found/loaded.
>>>
>>>   Host:      h37n08
>>>   Framework: pml
>>> --------------------------------------------------------------------------
>>> [h37n08:280696] PML pami cannot be selected
>>> 1,5c1,16
>>> < lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
>>> <   0 SNES Function norm 0.0406612
>>> <   1 SNES Function norm 4.12227e-06
>>> <   2 SNES Function norm 6.098e-11
>>> < Number of SNES iterations = 2
>>> ---
>>> > [1] (280721) Warning: Could not find key lid0:0:2 in cache <=========================
>>> > [1] (280721) Warning: Could not find key qpn0:0:0:2 in cache <=========================
>>> > Unable to connect queue-pairs
>>> > [h37n08:280721] Error: common_pami.c:1094 - ompi_common_pami_init() 1: Unable to create 1 PAMI communication context(s) rc=1
>>> > --------------------------------------------------------------------------
>>> > No components were able to be opened in the pml framework.
>>> >
>>> > This typically means that either no components of this type were
>>> > installed, or none of the installed components can be loaded.
>>> > Sometimes this means that shared libraries required by these
>>> > components are unable to be found/loaded.
>>> >
>>> >   Host:      h37n08
>>> >   Framework: pml
>>> > --------------------------------------------------------------------------
>>> > [h37n08:280721] PML pami cannot be selected
>>> /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials
>>> Possible problem with ex19 running with hypre, diffs above
>>> =========================================
>>> 2,15c2,15
>>> <     0 SNES Function norm 2.391552133017e-01
>>> <       0 KSP Residual norm 2.325621076120e-01
>>> <       1 KSP Residual norm 1.654206318674e-02
>>> <       2 KSP Residual norm 7.202836119880e-04
>>> <       3 KSP Residual norm 1.796861424199e-05
>>> <       4 KSP Residual norm 2.461332992052e-07
>>> <     1 SNES Function norm 6.826585648929e-05
>>> <       0 KSP Residual norm 2.347339172985e-05
>>> <       1 KSP Residual norm 8.356798075993e-07
>>> <       2 KSP Residual norm 1.844045309619e-08
>>> <       3 KSP Residual norm 5.336386977405e-10
>>> <       4 KSP Residual norm 2.662608472862e-11
>>> <     2 SNES Function norm 6.549682264799e-11
>>> < Number of SNES iterations = 2
>>> ---
>>> > [0]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI.
>>> > [0]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To not see the message again, add the option to your .petscrc, OR add it to the env var PETSC_OPTIONS.
>>> > [0]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu.
>>> > [0]PETSC ERROR: For OpenMPI, you need to configure it --with-cuda (https://www.open-mpi.org/faq/?category=buildcuda)
>>> > [0]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (http://mvapich.cse.ohio-state.edu/userguide/gdr/)
>>> > [0]PETSC ERROR: For Cray-MPICH, you need to set MPICH_RDMA_ENABLED_CUDA=1 (https://www.olcf.ornl.gov/tutorials/gpudirect-mpich-enabled-cuda/)
>>> > --------------------------------------------------------------------------
>>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF
>>> > with errorcode 76.
>>> >
>>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>> > You may or may not see output from other processes, depending on
>>> > exactly when Open MPI kills them.
>>> > --------------------------------------------------------------------------
>>> /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials
>>> Possible problem with ex19 running with cuda, diffs above
>>> =========================================
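Regarding the configure question at the top of the quoted thread (CUDA + Hypre + OpenMP, with or without thread safety): a minimal, untested sketch of what such a configure line might look like. The flag set is an assumption for illustration, not the build used above, and compiler/module details for Summit are omitted.

  ./configure --with-cuda --with-openmp --with-threadsafety \
              --download-hypre --with-debugging=0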