performance issue

Jim Edwards jedwards at ucar.edu
Wed Aug 9 16:22:41 CDT 2023


The only difference between the two files is the addition of a single scalar
string variable.
Why would that change the write pattern so significantly?

I changed the striping on the directory using lfs setstripe -c -1
because it exaggerates the performance difference:
lmm_stripe_count:  96
lmm_stripe_size:   1048576
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 6
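
A minimal sketch, assuming the usual round-robin RAID0 layout, of how a file
offset maps onto the stripes reported above. The helper names stripe_index and
stripes_touched are hypothetical, and only the logical stripe number is
computed; which physical OST holds a given stripe depends on the file's object
list and is not derived here. The sketch only shows how a small shift in a
write's starting offset changes the number of stripes it touches; it does not
establish that this is what happens in these runs.

```c
#include <assert.h>
#include <stdint.h>

/* Values reported in the layout above. */
#define STRIPE_SIZE  1048576LL  /* lmm_stripe_size, in bytes */
#define STRIPE_COUNT 96LL       /* lmm_stripe_count */

/* Hypothetical helper: logical stripe index (0 .. stripe_count-1) holding the
 * byte at `offset`, assuming round-robin RAID0 striping. */
int64_t stripe_index(int64_t offset, int64_t stripe_size, int64_t stripe_count)
{
    assert(offset >= 0 && stripe_size > 0 && stripe_count > 0);
    return (offset / stripe_size) % stripe_count;
}

/* Hypothetical helper: number of stripe-sized chunks touched by one
 * contiguous write of `length` bytes starting at `offset`. */
int64_t stripes_touched(int64_t offset, int64_t length, int64_t stripe_size)
{
    assert(offset >= 0 && length > 0 && stripe_size > 0);
    int64_t first = offset / stripe_size;
    int64_t last  = (offset + length - 1) / stripe_size;
    return last - first + 1;
}
```

For example, a 1 MiB write starting at offset 0 stays inside one stripe, while
the same write starting at a made-up offset of 100 bytes straddles two.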

The original problem was on 32786 tasks - I can now see it on 2048 tasks.
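
The gap statistics quoted below can be compared with a little arithmetic. This
is a rough sketch: the function names are hypothetical, and it assumes the
reported "ave write gap size" is in bytes and is a rounded average, so the
totals are approximate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: approximate total bytes skipped between writes,
 * computed as (number of write gaps) * (average write gap size). */
int64_t total_gap_bytes(int64_t num_gaps, int64_t avg_gap_size)
{
    assert(num_gaps >= 0 && avg_gap_size >= 0);
    return num_gaps * avg_gap_size;
}

/* Hypothetical helper: how many times longer the slow write took. */
double slowdown(double slow_seconds, double fast_seconds)
{
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

With the reported values, total_gap_bytes(2, 9722924978) gives 19445849956 and
total_gap_bytes(1020, 19079761) gives 19461356220. Both runs therefore skip
roughly the same total, about 19.4 GB, but the slow run splits it across 510
times as many gaps. slowdown(243.3913158400, 4.5714040530) is about 53.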

On Wed, Aug 9, 2023 at 3:13 PM Wei-Keng Liao <wkliao at northwestern.edu>
wrote:

> Googling gave me this.
> "The number of the gaps and the size is related to how many seek operation
> happens and how much is the size of the file in bytes that is skipped to
> write the next part."
>
>
> Are you still using the default file striping settings?
>
> Wei-keng
>
> On Aug 9, 2023, at 3:51 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> I spent a little time trying to do this but gave up and went back to using
> Cray profiling tools to get more info.
> One thing really stands out to me:
>
> This is for the fast write:
> dec1793.hsn.de.hpc.ucar.edu 0: | number of write gaps = 2
> dec1793.hsn.de.hpc.ucar.edu 0: | ave write gap size = 9722924978
> dec1793.hsn.de.hpc.ucar.edu 0: --------------------------------------------------------
> dec1793.hsn.de.hpc.ucar.edu 0: RESULT: write SUBSET 1 16 64 4060.0217755460 4.5714040530
>
> And this is for the slow one:
> dec1793.hsn.de.hpc.ucar.edu 0: | number of write gaps = 1020
> dec1793.hsn.de.hpc.ucar.edu 0: | ave write gap size = 19079761
> dec1793.hsn.de.hpc.ucar.edu 0: --------------------------------------------------------
> dec1793.hsn.de.hpc.ucar.edu 0: RESULT: write SUBSET 1 16 64 76.2558020443 243.3913158400
>
>
> Do you understand?
>
> On Tue, Aug 8, 2023 at 11:50 AM Wei-Keng Liao <wkliao at northwestern.edu>
> wrote:
>
>> I have revised the example program to add writes to scalar and record
>> variables.
>> Let me know if that works for you. URL again is below.
>>
>>
>> https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c
>>
>> Wei-keng
>>
>> On Aug 7, 2023, at 6:10 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>
>> That example doesn't include record variables.  Do you have a similar one
>> with record vars?
>>
>>
>>
>> On Mon, Aug 7, 2023 at 4:32 PM Wei-Keng Liao <wkliao at northwestern.edu>
>> wrote:
>>
>>> Hi, Jim
>>>
>>> To eliminate the overhead of PIO, I suggest using this PnetCDF example
>>> program and adding a scalar variable to see if the same thing happens.
>>>
>>>
>>> https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c
>>>
>>> Wei-keng
>>>
>>> On Aug 7, 2023, at 4:28 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>
>>> Hi Wei-Keng,
>>>
>>> The cb_nodes doesn't seem to be affected.
>>>
>>> Avoiding independent mode doesn't seem to have helped.  I now have the
>>> pioperf program writing two files: one with only decomposed fields, and
>>> one with one additional field, rundate, which is a string containing
>>> the date.
>>>
>>> The performance is drastically different:
>>>                             IO tasks  vars              Mb/s         Time (s)
>>>  RESULT: write  SUBSET   1       256    64  12067.7548254854    25.1577347560  (without scalar)
>>>  RESULT: write  SUBSET   1       256    64    286.4615089145  1059.8190875640  (with scalar)
>>>
>>>
>>> On Mon, Aug 7, 2023 at 1:47 PM Wei-Keng Liao <wkliao at northwestern.edu>
>>> wrote:
>>>
>>>> Is that the reason cb_nodes is 1?
>>>> Strange, because cb_nodes is set at file open time.
>>>>
>>>> Entering independent data mode in PnetCDF can be avoided completely
>>>> by using the nonblocking APIs.
>>>>
>>>> I suggest that your code use the nonblocking APIs in the following
>>>> way.
>>>>
>>>> /* for non-partitioned variables: only rank 0 posts the write */
>>>> if (rank == 0) {
>>>>     /* write the whole variable */
>>>>     ncmpi_iput_var_int(fh, varid[0], data[0], &req[0]);
>>>>     ncmpi_iput_var_int(fh, varid[1], data[1], &req[1]);
>>>>     ...
>>>> }
>>>> /* for partitioned variables: every rank posts its own subarray */
>>>> ncmpi_iput_vara_int(fh, varid[j], starts[j], counts[j], data[j], &req[j]);
>>>> ...
>>>>
>>>>
>>>> /* commit all posted nonblocking requests (collective call) */
>>>> ncmpi_wait_all(fh, NC_REQ_ALL, NULL, NULL);
>>>>
>>>>
>>>> Wei-keng
>>>>
>>>> > On Aug 7, 2023, at 2:12 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>> >
>>>> > Hi Wei-Keng,
>>>> >
>>>> > I think that I've found the problem.  In the model I am writing a
>>>> > number of scalar variables to the file as well as the decomposed
>>>> > variables.  For the scalar variables I use a code structure like:
>>>> >
>>>> > ncmpi_begin_indep_data(fh);
>>>> > ncmpi_put_vars_int(fh, varid, start, count, stride, data);
>>>> > ncmpi_end_indep_data(fh);
>>>> >
>>>> > In my pioperf test code I didn't write any scalars.  This morning I
>>>> > added one and the write performance for the decomposed variables got
>>>> > very, very bad.  What can I do about it?
>>>> >
>>>> > Jim
>>>> >
>>>> >
>>>> > --
>>>> > Jim Edwards
>>>> >
>>>> > CESM Software Engineer
>>>> > National Center for Atmospheric Research
>>>> > Boulder, CO
>>>>
>>>>

-- 
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO


More information about the parallel-netcdf mailing list