performance issue

Jim Edwards jedwards at ucar.edu
Fri Aug 11 12:35:08 CDT 2023


Here is my run script on Perlmutter:

#!/usr/bin/env python
#
#SBATCH -A mp9
#SBATCH -C cpu
#SBATCH --qos=regular
#SBATCH --time=15
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=128

import os
import glob  # only needed by the commented-out decompfile branch below

# Write the pioperf namelist in the working directory.
with open("pioperf.nl", "w") as fd:
    fd.write("&pioperf\n")
    fd.write("  decompfile='ROUNDROBIN'\n")
#    for filename in decompfiles:
#        fd.write("   '"+filename+"',\n")
    fd.write(" varsize=18560\n")
    fd.write(" pio_typenames = 'pnetcdf','pnetcdf'\n")
    fd.write(" rearrangers = 2\n")
    fd.write(" nframes = 1\n")
    fd.write(" nvars = 64\n")
    fd.write(" niotasks = 16\n")
    fd.write(" /\n")

# Launch pioperf on all 16 nodes x 128 tasks per node = 2048 MPI tasks.
os.system("srun -n 2048 ~/parallelio/bld/tests/performance/pioperf ")
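
(Aside, not part of the original script: the glob import is only used by the commented-out decompfile branch above. A rough sketch of what that branch might look like is below; the directory and the piodecomp*.dat pattern are placeholders of mine, not something pioperf requires.)

import glob
import os

# Hypothetical variant: feed a list of real decomposition files to pioperf
# instead of the synthetic ROUNDROBIN decomposition.  The directory and
# glob pattern below are illustrative only.
decompfiles = sorted(glob.glob(os.path.expanduser("~/decompfiles/piodecomp*.dat")))

with open("pioperf.nl", "w") as fd:
    fd.write("&pioperf\n")
    fd.write("  decompfile =\n")
    for filename in decompfiles:
        fd.write("   '" + filename + "',\n")
    fd.write(" varsize=18560\n")
    fd.write(" pio_typenames = 'pnetcdf','pnetcdf'\n")
    fd.write(" rearrangers = 2\n")
    fd.write(" nframes = 1\n")
    fd.write(" nvars = 64\n")
    fd.write(" niotasks = 16\n")
    fd.write(" /\n")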


Module environment:
Currently Loaded Modules:
  1) craype-x86-milan
  2) libfabric/1.15.2.0
  3) craype-network-ofi
  4) xpmem/2.5.2-2.4_3.49__gd0f7936.shasta
  5) perftools-base/23.03.0
  6) cpe/23.03
  7) xalt/2.10.2
  8) Nsight-Compute/2022.1.1
  9) Nsight-Systems/2022.2.1
 10) cudatoolkit/11.7
 11) craype-accel-nvidia80
 12) gpu/1.0
 13) evp-patch
 14) python/3.9-anaconda-2021.11
 15) intel/2023.1.0
 16) craype/2.7.20
 17) cray-dsmml/0.2.2
 18) cray-mpich/8.1.25
 19) cray-libsci/23.02.1.1
 20) PrgEnv-intel/8.3.3
 21) cmake/3.24.3
 22) cray-parallel-netcdf/1.12.3.3
 23) cray-hdf5/1.12.2.3
 24) cray-netcdf/4.9.0.3

cmake command:
 CC=mpicc FC=mpifort cmake \
   -DPNETCDF_DIR=$CRAY_PARALLEL_NETCDF_DIR/intel/19.0 \
   -DNETCDF_DIR=$CRAY_NETCDF_PREFIX -DHAVE_PAR_FILTERS=OFF ../

There are a couple of issues with the build that can be fixed by editing the
file config.h (created in the bld directory by cmake).

Add the following to config.h:

#undef NC_HAS_ZSTD
#undef NC_HAS_BZ2
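
(If it helps, the same edit can be scripted rather than done by hand; a minimal sketch, assuming the build directory is ~/parallelio/bld as in the srun command above:)

import os

# Append the two #undef lines to the cmake-generated config.h.
# The path is an assumption based on the srun command in the run script.
config_h = os.path.expanduser("~/parallelio/bld/config.h")
with open(config_h, "a") as fd:
    fd.write("\n#undef NC_HAS_ZSTD\n#undef NC_HAS_BZ2\n")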

then:
make pioperf

Once it is built, run the submit script from $SCRATCH.

On Fri, Aug 11, 2023 at 11:13 AM Wei-Keng Liao <wkliao at northwestern.edu>
wrote:

> OK. I will test it myself on Perlmutter.
> Do you have a small test program to reproduce or is it still pioperf?
> If pioperf, are the build instructions on Perlmutter the same?
>
> Please let me know how you run on Perlmutter, i.e. no. process, nodes,
> Lustre striping, problem size, etc.
>
> Does "1 16 64" in your results mean 16 I/O tasks and 64 variables,
>
Yes, this is correct.
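
(On the problem-size question: the script runs 16 nodes x 128 tasks per node = 2048 MPI tasks with varsize=18560 and nvars=64. Assuming varsize counts 8-byte values per task and that the last two RESULT columns are bandwidth in MiB/s and wall time in seconds - both are my assumptions, not stated explicitly in the thread - the aggregate write volume works out to about 18,560 MiB, which matches bandwidth times time in the RESULT lines quoted below:)

# Rough problem-size check; the varsize/units interpretation above is an
# assumption, not something stated explicitly in this thread.
ntasks = 2048          # 16 nodes * 128 tasks per node
varsize = 18560        # per-task values per variable (assumed 8 bytes each)
nvars = 64
bytes_per_value = 8

total_mib = ntasks * varsize * nvars * bytes_per_value / 2**20
print(f"aggregate write volume ~ {total_mib:.0f} MiB")     # ~18560 MiB

# Cross-check against one RESULT line: bandwidth (MiB/s) * time (s)
print(f"{4570.2078677815 * 4.0610844270:.0f} MiB")          # ~18560 MiB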



> and only 16 MPI processes out of total ? processes call PnetCDF APIs?
>
>
Yes, this is also correct.



> Wei-keng
>
> On Aug 11, 2023, at 9:35 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> I tried on Perlmutter and am seeing the same issue, only maybe even worse:
>
> RESULT: write    SUBSET         1        16        64     1261.0737058071       14.7176171500
> RESULT: write    SUBSET         1        16        64       90.3736534450      205.3695882870
>
>
> On Fri, Aug 11, 2023 at 8:17 AM Jim Edwards <jedwards at ucar.edu> wrote:
>
>> Hi Wei-Keng,
>>
>> I realized that the numbers in this table all show the slow-performing
>> file; the fast file (the one without the scalar variable) is not
>> represented. I will rerun and present these numbers again.
>>
>> Here are corrected numbers for a few cases:
>> GPFS (/glade/work on Derecho):
>> RESULT: write    SUBSET         1        16        64     4570.2078677815        4.0610844270
>> RESULT: write    SUBSET         1        16        64     4470.3231494386        4.1518251320
>>
>> Lustre, default PFLs:
>> RESULT: write    SUBSET         1        16        64     2808.6570137094        6.6081404420
>> RESULT: write    SUBSET         1        16        64     1025.1671656858       18.1043644600
>>
>> Lustre, no PFLs and a very wide stripe:
>> RESULT: write    SUBSET         1        16        64     4687.6852437580        3.9593102000
>> RESULT: write    SUBSET         1        16        64     3001.4741125579        6.1836282120
>>
>> On Thu, Aug 10, 2023 at 11:34 AM Jim Edwards <jedwards at ucar.edu> wrote:
>>
>>> the stripe settings
>>> lfs setstripe -c 96 -S 128M
>>>
>>> logs/c96_S128M/
>>>
>>>
>>>
>>>
>

-- 
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO