performance issue

Jim Edwards jedwards at ucar.edu
Fri Aug 11 16:10:27 CDT 2023


Hi Wei-Keng,

Sorry about the miscommunication earlier today - I just wanted to confirm:
have you been able to reproduce the issue now?

On Fri, Aug 11, 2023 at 1:01 PM Jim Edwards <jedwards at ucar.edu> wrote:

> I'm sorry - I thought that I had provided that, but I guess not.
> repo: git at github.com:jedwards4b/ParallelIO.git
> branch: bugtest/lustre
>
> On Fri, Aug 11, 2023 at 12:46 PM Wei-Keng Liao <wkliao at northwestern.edu>
> wrote:
>
>> Any particular github branch I should use?
>>
>> I got an error during make.
>> /global/homes/w/wkliao/PIO/Github/ParallelIO/src/clib/pio_nc4.c:1481:18:
>> error: call to undeclared function 'nc_inq_var_filter_ids'; ISO C99 and
>> later do not support implicit function declarations
>> [-Wimplicit-function-declaration]
>>           ierr = nc_inq_var_filter_ids(file->fh, varid, nfiltersp, ids);
>>                  ^
>>
>>
>> Setting these 2 does not help.
>> #undef NC_HAS_ZSTD
>> #undef NC_HAS_BZ2
>>
>>
>> Wei-keng
>>
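A side note on the implicit-declaration error above: nc_inq_var_filter_ids() is only
declared by newer netCDF headers, so calls to it are normally wrapped in a compile-time
feature check. A minimal sketch of such a guard around the quoted line, assuming a
feature macro named NC_HAS_MULTIFILTERS (the macro name is an assumption; ParallelIO's
actual guard and configure logic may differ):

    /* Sketch only: skip the multi-filter inquiry when the underlying netCDF
       library does not declare nc_inq_var_filter_ids(), so the call never
       appears as an implicit declaration. NC_HAS_MULTIFILTERS is hypothetical. */
    #ifdef NC_HAS_MULTIFILTERS
        ierr = nc_inq_var_filter_ids(file->fh, varid, nfiltersp, ids);
    #else
        if (nfiltersp)
            *nfiltersp = 0;   /* report zero filters when the API is unavailable */
        ierr = PIO_NOERR;
    #endif
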
>> > On Aug 11, 2023, at 12:44 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>> >
>> > I see I missed answering one question: 2048 tasks in total (16 nodes).
>> >
>> > On Fri, Aug 11, 2023 at 11:35 AM Jim Edwards <jedwards at ucar.edu> wrote:
>> > Here is my run script on Perlmutter:
>> >
>> > #!/usr/bin/env python
>> > #
>> > #SBATCH -A mp9
>> > #SBATCH -C cpu
>> > #SBATCH --qos=regular
>> > #SBATCH --time=15
>> > #SBATCH --nodes=16
>> > #SBATCH --ntasks-per-node=128
>> >
>> > import os
>> > import glob
>> >
>> > with open("pioperf.nl","w") as fd:
>> >     fd.write("&pioperf\n")
>> >     fd.write("  decompfile='ROUNDROBIN'\n")
>> > #    for filename in decompfiles:
>> > #        fd.write("   '"+filename+"',\n")
>> >     fd.write(" varsize=18560\n")
>> >     fd.write(" pio_typenames = 'pnetcdf','pnetcdf'\n")
>> >     fd.write(" rearrangers = 2\n")
>> >     fd.write(" nframes = 1\n")
>> >     fd.write(" nvars = 64\n")
>> >     fd.write(" niotasks = 16\n")
>> >     fd.write(" /\n")
>> >
>> > os.system("srun -n 2048 ~/parallelio/bld/tests/performance/pioperf ")
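For reference, the pioperf.nl namelist that the script above writes (the commented-out
decompfiles loop is skipped) contains:

    &pioperf
      decompfile='ROUNDROBIN'
     varsize=18560
     pio_typenames = 'pnetcdf','pnetcdf'
     rearrangers = 2
     nframes = 1
     nvars = 64
     niotasks = 16
     /
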
>> >
>> >
>> > Module environment:
>> > Currently Loaded Modules:
>> >    1) craype-x86-milan                         13) evp-patch
>> >    2) libfabric/1.15.2.0                       14) python/3.9-anaconda-2021.11
>> >    3) craype-network-ofi                       15) intel/2023.1.0
>> >    4) xpmem/2.5.2-2.4_3.49__gd0f7936.shasta    16) craype/2.7.20
>> >    5) perftools-base/23.03.0                   17) cray-dsmml/0.2.2
>> >    6) cpe/23.03                                18) cray-mpich/8.1.25
>> >    7) xalt/2.10.2                              19) cray-libsci/23.02.1.1
>> >    8) Nsight-Compute/2022.1.1                  20) PrgEnv-intel/8.3.3
>> >    9) Nsight-Systems/2022.2.1                  21) cmake/3.24.3
>> >   10) cudatoolkit/11.7                         22) cray-parallel-netcdf/1.12.3.3
>> >   11) craype-accel-nvidia80                    23) cray-hdf5/1.12.2.3
>> >   12) gpu/1.0                                  24) cray-netcdf/4.9.0.3
>> >
>> > cmake command:
>> >  CC=mpicc FC=mpifort cmake -DPNETCDF_DIR=$CRAY_PARALLEL_NETCDF_DIR/intel/19.0 \
>> >    -DNETCDF_DIR=$CRAY_NETCDF_PREFIX -DHAVE_PAR_FILTERS=OFF ../
>> >
>> > There are a couple of issues with the build that can be fixed by editing
>> > the file config.h (created in the bld directory by cmake).
>> >
>> > Add the following to config.h:
>> >
>> > #undef NC_HAS_ZSTD
>> > #undef NC_HAS_BZ2
>> >
>> > then:
>> > make pioperf
>> >
>> > Once it's built, run the submit script from $SCRATCH.
>> >
>> > On Fri, Aug 11, 2023 at 11:13 AM Wei-Keng Liao <wkliao at northwestern.edu> wrote:
>> > OK. I will test it myself on Perlmutter.
>> > Do you have a small test program to reproduce or is it still pioperf?
>> > If pioperf, are the build instructions on Perlmutter the same?
>> >
>> > Please let me know how you run on Perlmutter, i.e. number of processes,
>> > nodes, Lustre striping, problem size, etc.
>> >
>> > Does "1 16 64" in your results mean 16 I/O tasks and 64 variables?
>> >
>> > Yes, this is correct.
>> >
>> >   And only 16 MPI processes out of total ? processes call PnetCDF APIs?
>> >
>> > Yes, this is also correct.
>> >
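To make the exchange above concrete, here is a minimal sketch (in plain MPI + PnetCDF,
not the actual pioperf/PIO code) of the pattern being discussed: every rank is launched,
but only a 16-rank subset ever opens the file or calls PnetCDF. The file name, variable
layout, dimension size, and the stride used to pick I/O ranks are illustrative assumptions.

    /* io_subset_sketch.c: only a small subset of ranks calls PnetCDF */
    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int niotasks = 16;                                 /* as in pioperf.nl        */
        int stride   = nprocs >= niotasks ? nprocs / niotasks : 1;
        int is_io    = (rank % stride == 0);               /* 2048 ranks: every 128th */

        /* I/O ranks join iocomm; all other ranks pass MPI_UNDEFINED, receive
           MPI_COMM_NULL, and never touch the PnetCDF API at all.            */
        MPI_Comm iocomm;
        MPI_Comm_split(MPI_COMM_WORLD, is_io ? 0 : MPI_UNDEFINED, rank, &iocomm);

        if (is_io) {
            int ncid, dimid, varid;
            ncmpi_create(iocomm, "sketch.nc", NC_64BIT_DATA, MPI_INFO_NULL, &ncid);
            ncmpi_def_dim(ncid, "n", (MPI_Offset)18560 * nprocs, &dimid);
            ncmpi_def_var(ncid, "v0", NC_DOUBLE, 1, &dimid, &varid);
            ncmpi_enddef(ncid);
            /* rearranged data would be written here, e.g. with
               ncmpi_put_vara_double_all() on the 16 I/O ranks only */
            ncmpi_close(ncid);
            MPI_Comm_free(&iocomm);
        }

        MPI_Finalize();
        return 0;
    }

With 2048 ranks and niotasks = 16, the stride works out to 128, so ranks 0, 128, 256, ...
act as the I/O tasks.
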
>> >   Wei-keng
>> >
>> >> On Aug 11, 2023, at 9:35 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>> >>
>> >> I tried on Perlmutter and am seeing the same issue, only maybe even worse:
>> >>
>> >> RESULT: write    SUBSET         1        16        64   1261.0737058071       14.7176171500
>> >> RESULT: write    SUBSET         1        16        64     90.3736534450      205.3695882870
>> >>
>> >>
>> >> On Fri, Aug 11, 2023 at 8:17 AM Jim Edwards <jedwards at ucar.edu> wrote:
>> >> Hi Wei-Keng,
>> >>
>> >> I realized that the numbers in this table all show the slow-performing file;
>> >> the fast file (the one without the scalar variable) is not represented. I will
>> >> rerun and present these numbers again.
>> >>
>> >> Here are corrected numbers for a few cases:
>> >> GPFS (/glade/work on derecho):
>> >> RESULT: write    SUBSET         1        16        64   4570.2078677815        4.0610844270
>> >> RESULT: write    SUBSET         1        16        64   4470.3231494386        4.1518251320
>> >>
>> >> Lustre, default PFL's:
>> >> RESULT: write    SUBSET         1        16        64   2808.6570137094        6.6081404420
>> >> RESULT: write    SUBSET         1        16        64   1025.1671656858       18.1043644600
>> >>
>> >> Lustre, no PFL's and very wide stripe:
>> >> RESULT: write    SUBSET         1        16        64   4687.6852437580        3.9593102000
>> >> RESULT: write    SUBSET         1        16        64   3001.4741125579        6.1836282120
>> >>
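Since the fast vs. slow distinction above comes down to whether the file also carries a
scalar variable, here is a minimal sketch of that header difference written directly
against PnetCDF, under that assumption; the variable names, dimension length, and file
names are illustrative and are not taken from pioperf:

    /* header_sketch.c: define two files that differ only by one scalar variable */
    #include <stdio.h>
    #include <mpi.h>
    #include <pnetcdf.h>

    static void define_file(MPI_Comm comm, const char *path, int with_scalar)
    {
        int ncid, dimid, varid;
        ncmpi_create(comm, path, NC_64BIT_DATA, MPI_INFO_NULL, &ncid);
        if (with_scalar)                      /* present only in the slow file */
            ncmpi_def_var(ncid, "hdr_scalar", NC_DOUBLE, 0, NULL, &varid);
        ncmpi_def_dim(ncid, "n", (MPI_Offset)18560 * 2048, &dimid);
        for (int v = 0; v < 64; v++) {        /* nvars = 64, as in pioperf.nl  */
            char name[16];
            snprintf(name, sizeof(name), "var%02d", v);
            ncmpi_def_var(ncid, name, NC_DOUBLE, 1, &dimid, &varid);
        }
        ncmpi_enddef(ncid);
        ncmpi_close(ncid);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        define_file(MPI_COMM_WORLD, "fast.nc", 0);   /* no scalar in the header    */
        define_file(MPI_COMM_WORLD, "slow.nc", 1);   /* extra scalar in the header */
        MPI_Finalize();
        return 0;
    }

Comparing write rates for otherwise identical files generated this way is one way to
isolate whether the scalar header variable alone accounts for the Lustre slowdown.
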
>> >> On Thu, Aug 10, 2023 at 11:34 AM Jim Edwards <jedwards at ucar.edu> wrote:
>> >> the stripe settings
>> >> lfs setstripe -c 96 -S 128M
>> >> logs/c96_S128M/
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Jim Edwards
>> >
>> > CESM Software Engineer
>> > National Center for Atmospheric Research
>> > Boulder, CO
>> >
>> >
>> > --
>> > Jim Edwards
>> >
>> > CESM Software Engineer
>> > National Center for Atmospheric Research
>> > Boulder, CO
>>
>>
>
> --
> Jim Edwards
>
> CESM Software Engineer
> National Center for Atmospheric Research
> Boulder, CO
>


-- 
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO