performance issue

Wei-Keng Liao wkliao at northwestern.edu
Fri Aug 11 17:54:23 CDT 2023


Hi, Jim

Can you please describe the data partitioning pattern used in pioperf?

Wei-keng

On Aug 11, 2023, at 5:46 PM, Jim Edwards <jedwards at ucar.edu> wrote:

Yes, that line is called by all processes, but it in turn calls into pio_getput_int.c at line 1159:

   ierr = ncmpi_bput_vars_text(file->fh, varid, start, count, fake_stride, buf, request);

which is called only by MPI_ROOT  (line 1088 of the same file).
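
For reference, here is a minimal sketch (not PIO's actual code) of the pattern described above: every rank enters the PIO call, but only the I/O root posts the buffered nonblocking PnetCDF request, and the wait remains collective. The helper name, the 1-D text-variable layout, and the buffer sizing are illustrative assumptions.

    #include <string.h>
    #include <mpi.h>
    #include <pnetcdf.h>

    /* Sketch of the pattern described above: the scalar text variable is
     * written with a buffered nonblocking put on the I/O root only, while
     * every rank still participates in the collective wait. */
    static int write_scalar_text(int ncid, int varid, int io_rank, const char *buf)
    {
        MPI_Offset start[1] = {0}, count[1] = {0};
        int req = NC_REQ_NULL, status, ierr;

        if (io_rank == 0) {
            count[0] = (MPI_Offset) strlen(buf);
            /* the attached buffer must be able to hold the pending bput data */
            ierr = ncmpi_buffer_attach(ncid, count[0] + 1);
            if (ierr != NC_NOERR) return ierr;
            ierr = ncmpi_bput_vars_text(ncid, varid, start, count,
                                        NULL /* stride of 1 */, buf, &req);
            if (ierr != NC_NOERR) return ierr;
        }

        /* collective: ranks that posted nothing pass zero requests */
        ierr = ncmpi_wait_all(ncid, (io_rank == 0) ? 1 : 0, &req, &status);

        if (io_rank == 0)
            ncmpi_buffer_detach(ncid);
        return ierr;
    }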

On Fri, Aug 11, 2023 at 4:41 PM Wei-Keng Liao <wkliao at northwestern.edu> wrote:
I can see that line 344 of pioperformance.F90 is called by all processes.

                   nvarmult= pio_put_var(File, rundate, date//' '//time(1:4))

How do I change it so that it is called by rank 0 only?


Wei-keng

On Aug 11, 2023, at 4:25 PM, Jim Edwards <jedwards at ucar.edu> wrote:

Yes - src/clib/pio_darray_int.c and src/clib/pio_getput_int.c

I've also attached a couple of Darshan DXT profiles. Both use the same
Lustre file parameters; the first (dxt1.out) is the fast one, without the scalar write,
and dxt2.out is the slow one. It seems like adding the scalar causes all of the other writes to get broken up into smaller pieces.
I've also tried moving around where the scalar variable is defined and written relative to the record variables - that doesn't seem to make any difference.

On Fri, Aug 11, 2023 at 3:21 PM Wei-Keng Liao <wkliao at northwestern.edu> wrote:
Yes, I have.

Can you let me know the source code files that make the PnetCDF API calls?


Wei-keng

On Aug 11, 2023, at 4:10 PM, Jim Edwards <jedwards at ucar.edu> wrote:

Hi Wei-Keng,

Sorry about the miscommunication earlier today - I just wanted to confirm: have you been able to reproduce the issue now?

On Fri, Aug 11, 2023 at 1:01 PM Jim Edwards <jedwards at ucar.edu> wrote:
I'm sorry - I thought that I had provided that, but I guess not.
repo: git@github.com:jedwards4b/ParallelIO.git
branch: bugtest/lustre

On Fri, Aug 11, 2023 at 12:46 PM Wei-Keng Liao <wkliao at northwestern.edu> wrote:
Any particular github branch I should use?

I got an error during make.
/global/homes/w/wkliao/PIO/Github/ParallelIO/src/clib/pio_nc4.c:1481:18: error: call to undeclared function 'nc_inq_var_filter_ids'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
          ierr = nc_inq_var_filter_ids(file->fh, varid, nfiltersp, ids);
                 ^


Setting these two does not help:
#undef NC_HAS_ZSTD
#undef NC_HAS_BZ2
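
One possible workaround, sketched here under the assumption that the HAVE_PAR_FILTERS macro (see the -DHAVE_PAR_FILTERS=OFF cmake option later in this thread) is what should gate this code path; the wrapper name, header choice, and fallback behavior are illustrative, not taken from pio_nc4.c:

    #include <stddef.h>
    #include <netcdf.h>
    #ifdef HAVE_PAR_FILTERS
    #include <netcdf_filter.h>   /* declares nc_inq_var_filter_ids in newer netCDF-C */
    #endif

    /* Hypothetical wrapper: only call nc_inq_var_filter_ids when filter
     * support is compiled in; otherwise report "no filters" and succeed. */
    static int inq_var_filter_ids_safe(int ncid, int varid,
                                       size_t *nfiltersp, unsigned int *ids)
    {
    #ifdef HAVE_PAR_FILTERS
        return nc_inq_var_filter_ids(ncid, varid, nfiltersp, ids);
    #else
        (void) ncid; (void) varid; (void) ids;
        if (nfiltersp)
            *nfiltersp = 0;
        return NC_NOERR;
    #endif
    }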


Wei-keng

> On Aug 11, 2023, at 12:44 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> I see I missed answering one question - 2048 total tasks (16 nodes).
>
> On Fri, Aug 11, 2023 at 11:35 AM Jim Edwards <jedwards at ucar.edu> wrote:
> Here is my run script on perlmutter:
>
> #!/usr/bin/env python
> #
> #SBATCH -A mp9
> #SBATCH -C cpu
> #SBATCH --qos=regular
> #SBATCH --time=15
> #SBATCH --nodes=16
> #SBATCH --ntasks-per-node=128
>
> import os
> import glob
>
> with open("pioperf.nl","w") as fd:
>     fd.write("&pioperf\n")
>     fd.write("  decompfile='ROUNDROBIN'\n")
> #    for filename in decompfiles:
> #        fd.write("   '"+filename+"',\n")
>     fd.write(" varsize=18560\n");
>     fd.write(" pio_typenames = 'pnetcdf','pnetcdf'\n");
>     fd.write(" rearrangers = 2\n");
>     fd.write(" nframes = 1\n");
>     fd.write(" nvars = 64\n");
>     fd.write(" niotasks = 16\n");
>     fd.write(" /\n")
>
> os.system("srun -n 2048 ~/parallelio/bld/tests/performance/pioperf ")
>
>
> Module environment:
> Currently Loaded Modules:
>   1) craype-x86-milan                        6) cpe/23.03                11) craype-accel-nvidia80        16) craype/2.7.20          21) cmake/3.24.3
>   2) libfabric/1.15.2.0                      7) xalt/2.10.2              12) gpu/1.0                      17) cray-dsmml/0.2.2       22) cray-parallel-netcdf/1.12.3.3
>   3) craype-network-ofi                      8) Nsight-Compute/2022.1.1  13) evp-patch                    18) cray-mpich/8.1.25      23) cray-hdf5/1.12.2.3
>   4) xpmem/2.5.2-2.4_3.49__gd0f7936.shasta   9) Nsight-Systems/2022.2.1  14) python/3.9-anaconda-2021.11  19) cray-libsci/23.02.1.1  24) cray-netcdf/4.9.0.3
>   5) perftools-base/23.03.0                 10) cudatoolkit/11.7        15) intel/2023.1.0               20) PrgEnv-intel/8.3.3
>
> cmake command:
>  CC=mpicc FC=mpifort cmake -DPNETCDF_DIR=$CRAY_PARALLEL_NETCDF_DIR/intel/19.0 -DNETCDF_DIR=$CRAY_NETCDF_PREFIX -DHAVE_PAR_FILTERS=OFF ../
>
> There are a couple of issues with the build that can be fixed by editing the file config.h (created in the bld directory by cmake).
>
> Add the following to config.h:
>
> #undef NC_HAS_ZSTD
> #undef NC_HAS_BZ2
>
> then:
> make pioperf
>
> Once it's built, run the submit script from $SCRATCH.
>
> On Fri, Aug 11, 2023 at 11:13 AM Wei-Keng Liao <wkliao at northwestern.edu> wrote:
> OK. I will test it myself on Perlmutter.
> Do you have a small test program to reproduce it, or is it still pioperf?
> If pioperf, are the build instructions on Perlmutter the same?
>
> Please let me know how you run on Perlmutter, i.e., the number of processes, nodes,
> Lustre striping, problem size, etc.
>
> Does "1 16 64" in your results mean 16 I/O tasks and 64 variables,
> Yes, this is correct.
>
>   and only 16 MPI processes out of total ? processes call PnetCDF APIs?
>
> Yes, this is also correct.
>
>   Wei-keng
>
>> On Aug 11, 2023, at 9:35 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>>
>> I tried on Perlmutter and am seeing the same issue, only maybe even worse:
>>
>> RESULT: write    SUBSET         1        16        64     1261.0737058071       14.7176171500
>> RESULT: write    SUBSET         1        16        64       90.3736534450      205.3695882870
>>
>>
>> On Fri, Aug 11, 2023 at 8:17 AM Jim Edwards <jedwards at ucar.edu> wrote:
>> Hi Wei-Keng,
>>
>> I realized that the numbers in this table all show the slow-performing file; the fast file
>> (the one without the scalar variable) is not represented - I will rerun and present these numbers again.
>>
>> Here are corrected numbers for a few cases:
>> GPFS (/glade/work on derecho):
>> RESULT: write    SUBSET         1        16        64     4570.2078677815        4.0610844270
>> RESULT: write    SUBSET         1        16        64     4470.3231494386        4.1518251320
>>
>> Lustre, default PFLs:
>> RESULT: write    SUBSET         1        16        64     2808.6570137094        6.6081404420
>> RESULT: write    SUBSET         1        16        64     1025.1671656858       18.1043644600
>>
>> Lustre, no PFLs and a very wide stripe:
>>  RESULT: write    SUBSET         1        16        64     4687.6852437580        3.9593102000
>>  RESULT: write    SUBSET         1        16        64     3001.4741125579        6.1836282120
>>
>> On Thu, Aug 10, 2023 at 11:34 AM Jim Edwards <jedwards at ucar.edu> wrote:
>> The stripe settings (stripe count 96, stripe size 128 MB):
>> lfs setstripe -c 96 -S 128M
>> logs/c96_S128M/
>>
>>
>
>
>








<dxt2.out><dxt1.out>



--
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO
