<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div> You have debugging turned on on Crusher but not on Perlmutter.<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Jan 23, 2022, at 6:37 PM, Mark Adams &lt;<a href="mailto:mfadams@lbl.gov" class="">mfadams@lbl.gov</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">* Perlmutter is roughly 5x faster than Crusher on the one-node 2M eq test. (small)<div class="">This is with 8 processes. <br class=""></div><div class=""><br class=""></div><div class="">* The next larger version of this test, 16M eq total and 8 processes, fails in memory allocation in the mat-mult setup in the Kokkos Mat.</div><div class=""><br class=""></div><div class="">* If I try to run with 64 processes on Perlmutter I get this error in initialization. 
These nodes have 160 Gb of memory.</div><div class="">(I assume this is related to these large memory requirements from loading packages, etc....)</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Mark</div><div class=""><br class=""></div><div class="">+ srun -n64 -N1 --cpu-bind=cores --ntasks-per-core=1 ../ex13 -dm_plex_box_faces 4,4,4 -petscpartitioner_simple_process_grid 4,4,4 -dm_plex_box_upper 1,1,1 -petscpartitioner_simple_node_grid 1,1,1 -dm_refine 6 -dm_view -pc_type jacobi -log<br class="">_view -ksp_view -use_gpu_aware_mpi false -dm_mat_type aijkokkos -dm_vec_type kokkos -log_trace<br class="">+ tee jac_out_001_kokkos_Perlmutter_6_8.txt<br class="">[48]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------<br class="">[48]PETSC ERROR: GPU error <br class="">[48]PETSC ERROR: cuda error 2 (cudaErrorMemoryAllocation) : out of memory<br class="">[48]PETSC ERROR: See <a href="https://petsc.org/release/faq/" class="">https://petsc.org/release/faq/</a> for trouble shooting.<br class="">[48]PETSC ERROR: Petsc Development GIT revision: v3.16.3-683-gbc458ed4d8 GIT Date: 2022-01-22 12:18:02 -0600<br class="">[48]PETSC ERROR: /global/u2/m/madams/petsc/src/snes/tests/data/../ex13 on a arch-perlmutter-opt-gcc-kokkos-cuda named nid001424 by madams Sun Jan 23 15:19:56 2022<br class="">[48]PETSC ERROR: Configure options --CFLAGS=" -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4" --CXXFLAGS=" -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4" --CUDAFLAGS="-g -Xcompiler -rdynamic -DLANDAU_DIM=2 -DLAN<br class="">DAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4" --with-cc=cc --with-cxx=CC --with-fc=ftn --LDFLAGS=-lmpifort_gnu_91 --with-cudac=/global/common/software/nersc/cos1.3/cuda/11.3.0/bin/nvcc --COPTFLAGS=" -O3" --CXXOPTFLAGS=" -O3" --FOPTFLAGS=" -O3"<br class=""> --with-debugging=0 --download-metis --download-parmetis --with-cuda=1 --with-cuda-arch=80 --with-mpiexec=srun 
--with-batch=0 --download-p4est=1 --with-zlib=1 --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --with-<br class="">make-np=8 PETSC_ARCH=arch-perlmutter-opt-gcc-kokkos-cuda<br class="">[48]PETSC ERROR: #1 initialize() at /global/u2/m/madams/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:72<br class="">[48]PETSC ERROR: #2 initialize() at /global/u2/m/madams/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:343<br class="">[48]PETSC ERROR: #3 PetscDeviceInitializeTypeFromOptions_Private() at /global/u2/m/madams/petsc/src/sys/objects/device/interface/device.cxx:319<br class="">[48]PETSC ERROR: #4 PetscDeviceInitializeFromOptions_Internal() at /global/u2/m/madams/petsc/src/sys/objects/device/interface/device.cxx:449<br class="">[48]PETSC ERROR: #5 PetscInitialize_Common() at /global/u2/m/madams/petsc/src/sys/objects/pinit.c:963<br class="">[48]PETSC ERROR: #6 PetscInitialize() at /global/u2/m/madams/petsc/src/sys/objects/pinit.c:1238<br class=""><div class=""><br class=""></div></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jan 23, 2022 at 8:58 AM Mark Adams <<a href="mailto:mfadams@lbl.gov" class="">mfadams@lbl.gov</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div dir="ltr" class=""><br class=""></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jan 22, 2022 at 6:22 PM Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank" class="">bsmith@petsc.dev</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class=""><div class=""><br class=""></div><div class=""> I cleaned up Mark's last run and put it in a fixed-width font. 
I realize this may be too difficult but it would be great to have identical runs to compare with on Summit.</div></div></blockquote><div class=""><br class=""></div><div class="">I was planning on running this on Perlmutter today, as well as some sanity checks, such as checking that all GPUs are being used. I'll try PetscDeviceView.</div><div class=""><br class=""></div><div class="">Junchao modified the timers and all GPU &gt; CPU now, but he seemed to move the timers more outside and Barry wants them tight on the "kernel".</div><div class="">I think Junchao is going to work on that so I will hold off.</div><div class="">(I removed the Kokkos wait stuff and it seemed to run a little faster, but I am not sure how deterministic the timers are. I also did a test with GAMG and it was fine.)<br class=""></div><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class=""><div class=""><br class=""></div><div class=""><br class=""></div><div class=""> As Jed noted, Scatter takes a long time but the pack and unpack take no time? 
Is this not timed if using Kokkos?</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">--- Event Stage 2: KSP Solve only</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class=""><br class=""></span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">MatMult 400 1.0 8.8003e+00 1.1 1.06e+11 1.0 2.2e+04 8.5e+04 0.0e+00 2 55 61 54 0 70 91100100 95,058 132,242 0 0.00e+00 0 0.00e+00 100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecScatterBegin 400 1.0 1.3391e+00 2.6 0.00e+00 0.0 2.2e+04 8.5e+04 0.0e+00 0 0 61 54 0 7 0100100 0 0 0 0.00e+00 0 0.00e+00 0</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecScatterEnd 400 1.0 1.3240e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">SFPack 400 1.0 1.8276e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">SFUnpack 400 1.0 6.2653e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class=""><br class=""></span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">KSPSolve 2 1.0 1.2540e+01 1.0 1.17e+11 1.0 2.2e+04 8.5e+04 1.2e+03 3 60 61 54 60 100100100 73,592 116,796 0 0.00e+00 0 0.00e+00 
100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecTDot 802 1.0 1.3551e+00 1.2 3.36e+09 1.0 0.0e+00 0.0e+00 8.0e+02 0 2 0 0 40 10 3 0 19,627 52,599 0 0.00e+00 0 0.00e+00 100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecNorm 402 1.0 9.0151e-01 2.2 1.69e+09 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 5 1 0 0 14,788 125,477 0 0.00e+00 0 0.00e+00 100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecAXPY 800 1.0 8.2617e-01 1.0 3.36e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 7 3 0 0 32,112 61,644 0 0.00e+00 0 0.00e+00 100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecAYPX 398 1.0 8.1525e-01 1.6 1.67e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 5 1 0 0 16,190 20,689 0 0.00e+00 0 0.00e+00 100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class="">VecPointwiseMult 402 1.0 3.5694e-01 1.0 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 1 0 0 18,675 38,633 0 0.00e+00 0 0.00e+00 100</span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class=""><br class=""></span></font></div><div class=""><font face="Courier New" class=""><span style="font-style:normal;font-size:12px" class=""><br class=""></span></font></div><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Jan 22, 2022, at 12:40 PM, Mark Adams &lt;<a href="mailto:mfadams@lbl.gov" target="_blank" class="">mfadams@lbl.gov</a>&gt; wrote:</div><br class=""><div class=""><div dir="ltr" class="">And I have a new MR if you want to see what I've done so far.</div>
</div></blockquote></div><br class=""></div></blockquote></div></div>
</blockquote></div>
<span id="cid:f_kyrvbhb40">&lt;jac_out_001_kokkos_Crusher_6_1_notpl.txt&gt;</span><span id="cid:f_kyrvbhb81">&lt;jac_out_001_kokkos_Perlmutter_6_1.txt&gt;</span></div></blockquote></div><br class=""></body></html>