[petsc-dev] Kokkos/Crusher perforance

Mark Adams mfadams at lbl.gov
Tue Jan 25 20:24:33 CST 2022


It looks like we have our instrumentation and job configuration in decent
shape, so on to scaling with AMG.
When using multiple nodes I got errors about table entries not being found,
which can be caused by a buggy MPI; the problem does go away when I turn
GPU-aware MPI off.
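For the record, the switch I am talking about is the PETSc runtime option
-use_gpu_aware_mpi. A minimal sketch of the kind of run line I mean (the
srun flags, executable name, and solver options are placeholders, not my
actual job script):

    # illustrative only: executable, solver options, and srun flags are placeholders
    srun -N 8 --ntasks-per-node=8 --gpus-per-node=8 ./my_app <solver options> -use_gpu_aware_mpi 0   # off while debugging the MPI stack
    # switch to -use_gpu_aware_mpi 1 to turn GPU-aware MPI back on once it behaves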
Jed's analysis, if I have this right, is that at *0.7 Tflop/s* we are at
about 35% of the theoretical peak with respect to memory bandwidth.
I run out of memory at the next step in this study (7 levels of
refinement), which is about 2M equations per GPU. That limit seems low to
me and we will see if we can fix it.
So this 0.7 Tflop/s is with only 1/4M equations per GPU, so 35% is not
terrible.
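For context, here is my back-of-envelope version of that roofline estimate,
taking the 0.7 Tflop/s as the aggregate GPU rate on one node (8 GCDs). The
per-GCD bandwidth and the ~1/6 flop/byte intensity of the SpMV-dominated
solve are assumptions on my part, not measurements:

$$ \mathrm{BW}_{\mathrm{node}} \approx 8 \times 1.6\ \mathrm{TB/s} = 12.8\ \mathrm{TB/s}, \qquad
   \mathrm{AI}_{\mathrm{SpMV}} \approx \frac{2\ \mathrm{flops/nonzero}}{12\ \mathrm{bytes/nonzero}} \approx \tfrac{1}{6}\ \mathrm{flop/byte} $$

$$ \mathrm{Roofline} \approx 12.8\ \mathrm{TB/s} \times \tfrac{1}{6}\ \mathrm{flop/byte} \approx 2.1\ \mathrm{Tflop/s}, \qquad
   0.7 / 2.1 \approx 33\% $$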
Here are the solve times for 001, 008, and 064 nodes with 5 or 6 levels of
refinement. These are the KSPSolve lines from -log_view; the columns are
count (max, ratio), time in seconds (max, ratio), flop (max, ratio),
messages, average message length, reductions, global %T %F %M %L %R, stage
%T %F %M %L %R, total Mflop/s, GPU Mflop/s, CpuToGpu count and size,
GpuToCpu count and size, and the fraction of flops done on the GPU.

out_001_kokkos_Crusher_5_1.txt:KSPSolve              10 1.0 1.2933e+00 1.0 4.13e+10 1.1 1.8e+05 8.4e+03 5.8e+02  3 87 86 78 48 100100100100100 248792   423857   6840 3.85e+02 6792 3.85e+02 100
out_001_kokkos_Crusher_6_1.txt:KSPSolve              10 1.0 5.3667e+00 1.0 3.89e+11 1.0 2.1e+05 3.3e+04 6.7e+02  2 87 86 79 48 100100100100100 571572   *700002*   7920 1.74e+03 7920 1.74e+03 100
out_008_kokkos_Crusher_5_1.txt:KSPSolve              10 1.0 1.9407e+00 1.0 4.94e+10 1.1 3.5e+06 6.2e+03 6.7e+02  5 87 86 79 47 100100100100100 1581096   3034723   7920 6.88e+02 7920 6.88e+02 100
out_008_kokkos_Crusher_6_1.txt:KSPSolve              10 1.0 7.4478e+00 1.0 4.49e+11 1.0 4.1e+06 2.3e+04 7.6e+02  2 88 87 80 49 100100100100100 3798162   5557106   9367 3.02e+03 9359 3.02e+03 100
out_064_kokkos_Crusher_5_1.txt:KSPSolve              10 1.0 2.4551e+00 1.0 5.40e+10 1.1 4.2e+07 5.4e+03 7.3e+02  5 88 87 80 47 100100100100100 11065887   23792978   8684 8.90e+02 8683 8.90e+02 100
out_064_kokkos_Crusher_6_1.txt:KSPSolve              10 1.0 1.1335e+01 1.0 5.38e+11 1.0 5.4e+07 2.0e+04 9.1e+02  4 88 88 82 49 100100100100100 24130606   43326249   11249 4.26e+03 11249 4.26e+03 100

On Tue, Jan 25, 2022 at 1:49 PM Mark Adams <mfadams at lbl.gov> wrote:

>
>> Note that Mark's logs have been switching back and forth between
>> -use_gpu_aware_mpi and changing number of ranks -- we won't have that
>> information if we do manual timing hacks. This is going to be a routine
>> thing we'll need on the mailing list and we need the provenance to go with
>> it.
>>
>
> GPU-aware MPI crashes sometimes, so to be safe I had it off while
> debugging. It works fine here, so it has been on in the last tests.
> Here is a comparison.
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tt.tar
Type: application/x-tar
Size: 317440 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20220125/acf28281/attachment-0001.tar>

