performance issue

Jim Edwards jedwards at ucar.edu
Wed Aug 9 17:16:11 CDT 2023


cb_nodes differed between the test program and CESM, but I was able to
figure out what the issue was and reproduce it in the test program, so
now cb_nodes is the same for both files. Both are being written by the
test program, and the only difference between them is that the slow one
has one additional variable, which is a scalar and is not a record
variable.

On Wed, Aug 9, 2023 at 3:57 PM Wei-Keng Liao <wkliao at northwestern.edu>
wrote:

> I thought the cb_nodes values were different between the two runs, based
> on one of your earlier emails. Can you try the example C program I modified
> to include scalar and record variables? It reports timings and cb_nodes
> value.
>
>
> Wei-keng
>
> On Aug 9, 2023, at 4:22 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> The only difference in the two files is the addition of a single scalar
> string variable.
> Why would that significantly change this?
>
> I changed the striping on the directory using lfs setstripe -c -1
> I did this because it exaggerates the performance difference.
> lmm_stripe_count:  96
> lmm_stripe_size:   1048576
> lmm_pattern:       raid0
> lmm_layout_gen:    0
> lmm_stripe_offset: 6
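
For readers unfamiliar with the layout reported above, here is a minimal
sketch of how a round-robin (raid0) layout maps file offsets onto stripe
objects, using the stripe size (1048576) and stripe count (96) shown. The
function names stripe_index and stripes_touched are hypothetical and are
not part of Lustre or PnetCDF. The sketch ignores lmm_stripe_offset, which
only selects the starting OST, and it says nothing about which physical
OST backs each object.

```c
#include <assert.h>
#include <stdint.h>

/* Index (0 .. stripe_count-1) of the stripe object that holds the byte
 * at `offset`, assuming a plain round-robin layout. Hypothetical helper. */
uint64_t stripe_index(uint64_t offset, uint64_t stripe_size,
                      uint64_t stripe_count)
{
    assert(stripe_size > 0 && stripe_count > 0);
    return (offset / stripe_size) % stripe_count;
}

/* Number of distinct stripe objects touched by one contiguous write of
 * `length` bytes starting at `offset`. Hypothetical helper. */
uint64_t stripes_touched(uint64_t offset, uint64_t length,
                         uint64_t stripe_size, uint64_t stripe_count)
{
    assert(stripe_size > 0 && stripe_count > 0);
    if (length == 0)
        return 0;
    uint64_t first = offset / stripe_size;                /* first stripe unit */
    uint64_t last  = (offset + length - 1) / stripe_size; /* last stripe unit  */
    uint64_t span  = last - first + 1;
    /* Once a write spans stripe_count units, every object is touched. */
    return span < stripe_count ? span : stripe_count;
}
```

With these settings, a write that straddles a 1 MiB boundary by even one
byte touches two objects, and any contiguous write of 96 MiB or more
touches all 96.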
>
> The original problem was on 32786 tasks - I can now see it on 2048 tasks.
>
> On Wed, Aug 9, 2023 at 3:13 PM Wei-Keng Liao <wkliao at northwestern.edu>
> wrote:
>
>> Googling gave me this.
>> "The number of the gaps and the size is related to how many seek
>> operation happens and how much is the size of the file in bytes that is
>> skipped to write the next part."
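
One way to read the quoted description is sketched below: walk the write
requests in the order they were issued, count each write that does not
start where the previous one ended, and average the number of bytes
skipped. This is only an illustration of that reading, not the Cray
profiling tool's actual algorithm. The names write_req, gap_stats and
count_write_gaps are hypothetical, and treating a backward seek as a gap
of the same absolute size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One write request: starting file offset and length, both in bytes. */
typedef struct {
    long long offset;
    long long length;
} write_req;

/* Gap statistics under the assumed definition. */
typedef struct {
    long long count;    /* writes that did not start at the previous end */
    double    avg_size; /* mean bytes skipped per gap (0 if no gaps)     */
} gap_stats;

/* Scan `n` writes in issue order and summarize the gaps between them. */
gap_stats count_write_gaps(const write_req *w, size_t n)
{
    gap_stats s = {0, 0.0};
    long long total = 0;

    assert(w != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        long long prev_end = w[i - 1].offset + w[i - 1].length;
        /* Assumption: a seek in either direction counts as a gap. */
        long long skipped = llabs(w[i].offset - prev_end);
        if (skipped != 0) {
            s.count++;
            total += skipped;
        }
    }
    if (s.count > 0)
        s.avg_size = (double)total / (double)s.count;
    return s;
}
```

Under this reading, three back-to-back 100-byte writes produce no gaps,
while three 100-byte writes at offsets 0, 300 and 600 produce two gaps of
200 bytes each.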
>>
>>
>> Are you still using the default file striping settings?
>>
>> Wei-keng
>>
>> On Aug 9, 2023, at 3:51 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>
>> I spent a little time trying to do this but gave up and went back to
>> using cray profiling tools to get more info.
>> One thing really stands out to me:
>>
>> This is for the fast write:
>> dec1793.hsn.de.hpc.ucar.edu 0: | number of write gaps = 2
>> dec1793.hsn.de.hpc.ucar.edu 0: | ave write gap size = 9722924978
>> dec1793.hsn.de.hpc.ucar.edu 0: --------------------------------------------------------
>> dec1793.hsn.de.hpc.ucar.edu 0: RESULT: write SUBSET 1 16 64 4060.0217755460 4.5714040530
>>
>> And this is for the slow one:
>> dec1793.hsn.de.hpc.ucar.edu 0: | number of write gaps = 1020
>> dec1793.hsn.de.hpc.ucar.edu 0: | ave write gap size = 19079761
>> dec1793.hsn.de.hpc.ucar.edu 0: --------------------------------------------------------
>> dec1793.hsn.de.hpc.ucar.edu 0: RESULT: write SUBSET 1 16 64 76.2558020443 243.3913158400
>>
>>
>> Do you understand?
>>
>> On Tue, Aug 8, 2023 at 11:50 AM Wei-Keng Liao <wkliao at northwestern.edu>
>> wrote:
>>
>>> I have revised the example program to add writes to scalar and record
>>> variables.
>>> Let me know if that works for you. URL again is below.
>>>
>>>
>>> https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c
>>>
>>> Wei-keng
>>>
>>> On Aug 7, 2023, at 6:10 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>
>>> That example doesn't include record variables.  Do you have a similar
>>> one with record vars?
>>>
>>>
>>>
>>> On Mon, Aug 7, 2023 at 4:32 PM Wei-Keng Liao <wkliao at northwestern.edu>
>>> wrote:
>>>
>>>> Hi, Jim
>>>>
>>>> To eliminate the overheads of PIO, I suggest using this PnetCDF
>>>> example program and adding a scalar variable to see if the same
>>>> thing happens.
>>>>
>>>>
>>>> https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c
>>>>
>>>> Wei-keng
>>>>
>>>> On Aug 7, 2023, at 4:28 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>>
>>>> Hi Wei-Keng,
>>>>
>>>> The cb_nodes doesn't seem to be affected.
>>>>
>>>> Not using independent mode doesn't seem to have helped.  I have the
>>>> pioperf program now writing two files.  One with only
>>>> decomposed fields and one with one additional field, rundate, which is
>>>> a string with the date in it.
>>>>
>>>> The performance is drastically different:
>>>>                             IO tasks  vars              Mb/s         Time (s)
>>>>  RESULT: write  SUBSET  1        256    64  12067.7548254854    25.1577347560  (without scalar)
>>>>  RESULT: write  SUBSET  1        256    64    286.4615089145  1059.8190875640  (with scalar)
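
As a quick sanity check on the two RESULT lines above, multiplying the
reported rate by the elapsed time gives the same data volume for both
runs (about 303,597 in the rate's own units), so the two runs wrote the
same amount of data and the slow one simply took roughly 42 times longer.
A minimal sketch of that arithmetic follows. The names run_result,
implied_volume and slowdown are hypothetical, and the unit is taken as
reported in the thread.

```c
#include <assert.h>

/* One benchmark line: reported bandwidth and elapsed time. */
typedef struct {
    double rate;    /* bandwidth as reported (the thread labels it Mb/s) */
    double seconds; /* elapsed write time in seconds                     */
} run_result;

/* Data volume implied by a run, in the bandwidth's own units. */
double implied_volume(run_result r)
{
    return r.rate * r.seconds;
}

/* How many times longer the slow run took than the fast run. */
double slowdown(run_result fast, run_result slow)
{
    assert(fast.seconds > 0.0);
    return slow.seconds / fast.seconds;
}
```

Calling implied_volume on each run and comparing the results confirms
the volumes match, and slowdown(fast, slow) gives the time ratio.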
>>>>
>>>>
>>>> On Mon, Aug 7, 2023 at 1:47 PM Wei-Keng Liao <wkliao at northwestern.edu>
>>>> wrote:
>>>>
>>>>> Is that the reason why cb_nodes is 1?
>>>>> Strange, because cb_nodes is set at the file open time.
>>>>>
>>>>> Entering the independent data mode in PnetCDF can be completely avoided
>>>>> if using the nonblocking APIs.
>>>>>
>>>>> I would suggest that your code use the nonblocking APIs in the
>>>>> following way.
>>>>>
>>>>> /* for non-partitioned variables */
>>>>> if (rank == 0) {
>>>>>     /* write the whole variable */
>>>>>     ncmpi_iput_var_int(fh, varid[0], data[0], &req[0]);
>>>>>     ncmpi_iput_var_int(fh, varid[1], data[1], &req[1]);
>>>>>     ...
>>>>> }
>>>>> /* for partitioned variables */
>>>>> ncmpi_iput_vara_int(fh, varid[j], data[j], starts[j], counts[j], &req[j]);
>>>>> ...
>>>>>
>>>>>
>>>>> /* commit all posted nonblocking requests */
>>>>> ncmpi_wait_all(fh, NC_REQ_ALL, NULL, NULL);
>>>>>
>>>>>
>>>>> Wei-keng
>>>>>
>>>>> > On Aug 7, 2023, at 2:12 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>>> >
>>>>> > Hi Wei-Keng,
>>>>> >
>>>>> > I think that I've found the problem.  In the model I am writing a
>>>>> > number of scalar variables to the file as well as the decomposed
>>>>> > variables.  For the scalar variables I use a code structure like:
>>>>> >
>>>>> > ncmpi_begin_indep_data(fh);
>>>>> > ncmpi_put_vars_int(fh, varid, start, count, stride, data);
>>>>> > ncmpi_end_indep_data(fh);
>>>>> >
>>>>> > In my pioperf test code I didn't write any scalars - this morning I
>>>>> > added one and the write performance for the decomposed variables got
>>>>> > very, very bad.  What can I do about it?
>>>>> >
>>>>> > Jim
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Jim Edwards
>>>>> >
>>>>> > CESM Software Engineer
>>>>> > National Center for Atmospheric Research
>>>>> > Boulder, CO

-- 
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO