performance issue
Jim Edwards
jedwards at ucar.edu
Fri Aug 4 13:25:26 CDT 2023
Rewriting the error handling code is on my list of things to do in PIO,
but that issue is the same in both cases, so I don't think it plays a
role in the difference we are seeing.
The lfs getstripe output of the two files is nearly identical, so I show
only the cesm file here.
lcm_layout_gen: 7
lcm_mirror_count: 1
lcm_entry_count: 4
lcme_id: 1
lcme_mirror_id: 0
lcme_flags: init
lcme_extent.e_start: 0
lcme_extent.e_end: 16777216
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 27
lmm_objects:
- 0: { l_ost_idx: 27, l_fid: [0xa80000401:0x3f2ef8:0x0] }
lcme_id: 2
lcme_mirror_id: 0
lcme_flags: init
lcme_extent.e_start: 16777216
lcme_extent.e_end: 17179869184
lmm_stripe_count: 4
lmm_stripe_size: 16777216
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 35
lmm_objects:
- 0: { l_ost_idx: 35, l_fid: [0xc80000401:0x3f32e5:0x0] }
- 1: { l_ost_idx: 39, l_fid: [0xcc0000402:0x3f3162:0x0] }
- 2: { l_ost_idx: 43, l_fid: [0xe80000402:0x3f2f3a:0x0] }
- 3: { l_ost_idx: 47, l_fid: [0xec0000401:0x3f3017:0x0] }
lcme_id: 3
lcme_mirror_id: 0
lcme_flags: init
lcme_extent.e_start: 17179869184
lcme_extent.e_end: 68719476736
lmm_stripe_count: 12
lmm_stripe_size: 16777216
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 16
lmm_objects:
- 0: { l_ost_idx: 16, l_fid: [0x700000402:0x4021eb:0x0] }
- 1: { l_ost_idx: 20, l_fid: [0x740000402:0x4020ae:0x0] }
- 2: { l_ost_idx: 24, l_fid: [0x900000400:0x401f68:0x0] }
- 3: { l_ost_idx: 28, l_fid: [0x940000400:0x401f71:0x0] }
- 4: { l_ost_idx: 32, l_fid: [0xb00000402:0x40220c:0x0] }
- 5: { l_ost_idx: 36, l_fid: [0xb40000402:0x40210b:0x0] }
- 6: { l_ost_idx: 40, l_fid: [0xd00000402:0x402141:0x0] }
- 7: { l_ost_idx: 44, l_fid: [0xd40000400:0x401e90:0x0] }
- 8: { l_ost_idx: 48, l_fid: [0xf00000401:0x401e08:0x0] }
- 9: { l_ost_idx: 52, l_fid: [0xf40000400:0x401e32:0x0] }
- 10: { l_ost_idx: 56, l_fid: [0x1100000402:0x4022e0:0x0] }
- 11: { l_ost_idx: 60, l_fid: [0x1140000402:0x4020a6:0x0] }
lcme_id: 4
lcme_mirror_id: 0
lcme_flags: init
lcme_extent.e_start: 68719476736
lcme_extent.e_end: EOF
lmm_stripe_count: 24
lmm_stripe_size: 16777216
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 51
lmm_objects:
- 0: { l_ost_idx: 51, l_fid: [0x1080000400:0x402881:0x0] }
- 1: { l_ost_idx: 55, l_fid: [0x10c0000400:0x402955:0x0] }
- 2: { l_ost_idx: 59, l_fid: [0x1280000400:0x4027d6:0x0] }
- 3: { l_ost_idx: 63, l_fid: [0x12c0000401:0x402ab2:0x0] }
- 4: { l_ost_idx: 67, l_fid: [0x1480000400:0x402b75:0x0] }
- 5: { l_ost_idx: 71, l_fid: [0x14c0000400:0x4028d2:0x0] }
- 6: { l_ost_idx: 75, l_fid: [0x1680000401:0x402a3b:0x0] }
- 7: { l_ost_idx: 79, l_fid: [0x16c0000402:0x40294d:0x0] }
- 8: { l_ost_idx: 83, l_fid: [0x1880000401:0x40299c:0x0] }
- 9: { l_ost_idx: 87, l_fid: [0x18c0000402:0x402f5e:0x0] }
- 10: { l_ost_idx: 91, l_fid: [0x1a80000400:0x402a16:0x0] }
- 11: { l_ost_idx: 95, l_fid: [0x1ac0000400:0x402bd2:0x0] }
- 12: { l_ost_idx: 0, l_fid: [0x300000401:0x402a2e:0x0] }
- 13: { l_ost_idx: 4, l_fid: [0x340000402:0x4027d2:0x0] }
- 14: { l_ost_idx: 8, l_fid: [0x500000402:0x402a26:0x0] }
- 15: { l_ost_idx: 12, l_fid: [0x540000400:0x402943:0x0] }
- 16: { l_ost_idx: 64, l_fid: [0x1300000402:0x402c10:0x0] }
- 17: { l_ost_idx: 68, l_fid: [0x1340000401:0x4029c6:0x0] }
- 18: { l_ost_idx: 72, l_fid: [0x1500000402:0x402d11:0x0] }
- 19: { l_ost_idx: 76, l_fid: [0x1540000402:0x402be2:0x0] }
- 20: { l_ost_idx: 80, l_fid: [0x1700000400:0x402a64:0x0] }
- 21: { l_ost_idx: 84, l_fid: [0x1740000401:0x402b11:0x0] }
- 22: { l_ost_idx: 88, l_fid: [0x1900000402:0x402cb8:0x0] }
- 23: { l_ost_idx: 92, l_fid: [0x1940000400:0x402cd3:0x0] }
On Fri, Aug 4, 2023 at 12:12 PM Wei-Keng Liao <wkliao at northwestern.edu>
wrote:
>
> I can see that the file header size is 20620 bytes. Because all attributes
> are stored in the header, the cost of writing them should not be an issue.
> I also see no gap between 2 consecutive variables, which is good, meaning
> the write requests made by MPI-IO will be contiguous.
>
> If the call sequence of PnetCDF APIs is the same between pioperf and cesm,
> then the performance should be similar. Can you check the Lustre striping
> settings of the 2 output files, using the command "lfs getstripe"?
>
> If you set any MPI-IO hints, they can also play a role in performance.
> See the example in PnetCDF for how to dump all hints (function
> print_info()).
>
> https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/get_info.c
>
> If all of the above check out, then using Darshan should reveal more
> information.
>
> BTW, what PnetCDF version is being used?
>
> A comment about PIOc_put_att_tc:
> * Calling MPI_Bcast to check the error code may not be necessary.
> PnetCDF performs that check, along with all metadata consistency checks,
> at ncmpi_enddef. If the number of variables and their attributes is high,
> then calling lots of MPI_Bcast can be expensive.
>
> https://github.com/NCAR/ParallelIO/blob/f45ba898bec31e6cd662ac41f43e0cff14f928b2/src/clib/pio_getput_int.c#L213
>
>
> Wei-keng
>
> On Aug 4, 2023, at 12:32 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> Yes, _enddef is called only once.
>
> Here
> <https://github.com/NCAR/ParallelIO/blob/main/src/clib/pio_getput_int.c#L128>
> is the code that writes attributes. Here
> <https://github.com/NCAR/ParallelIO/blob/main/src/clib/pio_darray_int.c#L661>
> is where variables are written.
>
> ncoffsets -sg pioperf.2-0256-1.nc
> netcdf pioperf.2-0256-1.nc {
> // file format: CDF-5
>
> file header:
> size = 7804 bytes
> extent = 8192 bytes
>
> dimensions:
> dim000001 = 10485762
> dim000002 = 58
> time = UNLIMITED // (1 currently)
>
> record variables:
> double vard0001(time, dim000002, dim000001):
> start file offset = 8192 (0th record)
> end file offset = 4865401760 (0th record)
> size in bytes = 4865393568 (of one record)
> gap from prev var = 388
> double vard0002(time, dim000002, dim000001):
> start file offset = 4865401760 (0th record)
> end file offset = 9730795328 (0th record)
> size in bytes = 4865393568 (of one record)
> gap from prev var = 0
>
>
> snip
>
> double vard0064(time, dim000002, dim000001):
> start file offset =306519802976 (0th record)
> end file offset =311385196544 (0th record)
> size in bytes = 4865393568 (of one record)
> gap from prev var = 0
> }
>
> ncoffsets -sg run/SMS_D_Ln9.mpasa7p5_mpasa7p5_mg17.QPC6.derecho_intel.cam-outfrq9s.20230726_094231_iz24v6.cam.h0.0001-01-01-03600.nc
> netcdf run/SMS_D_Ln9.mpasa7p5_mpasa7p5_mg17.QPC6.derecho_intel.cam-outfrq9s.20230726_094231_iz24v6.cam.h0.0001-01-01-03600.nc {
> // file format: CDF-5
>
> file header:
> size = 20620 bytes
> extent = 16777216 bytes
>
> dimensions:
> ncol = 10485762
> time = UNLIMITED // (1 currently)
> nbnd = 2
> chars = 8
> lev = 58
> ilev = 59
>
> fixed-size variables:
> double lat(ncol):
> start file offset = 16777216
> end file offset = 100663312
> size in bytes = 83886096
> gap from prev var = 16756596
> double lon(ncol):
> start file offset = 100663312
> end file offset = 184549408
> size in bytes = 83886096
> gap from prev var = 0
>
>
> snip
>
> int mdt:
> start file offset = 352322552
> end file offset = 352322556
> size in bytes = 4
> gap from prev var = 0
>
> record variables:
> double time(time):
> start file offset = 352322556 (0th record)
> end file offset = 352322564 (0th record)
> size in bytes = 8 (of one record)
> gap from prev var = 0
> int date(time):
> start file offset = 352322564 (0th record)
> end file offset = 352322568 (0th record)
> size in bytes = 4 (of one record)
> gap from prev var = 0
>
>
> snip
>
> double STEND_CLUBB(time, lev, ncol):
> start file offset =306872117448 (0th record)
> end file offset =311737511016 (0th record)
> size in bytes = 4865393568 (of one record)
> gap from prev var = 0
> }
>
> On Fri, Aug 4, 2023 at 10:35 AM Wei-Keng Liao <wkliao at northwestern.edu>
> wrote:
>
>> Can you run the command "ncoffsets -sg file.nc", which shows the sizes of
>> the file header and all variables? For the cesm case, is _enddef called
>> only once?
>>
>> Could you also point me to the program files that call PnetCDF APIs,
>> including
>> writing attributes and variables?
>>
>>
>> Wei-keng
>>
>> On Aug 4, 2023, at 11:05 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>>
>> I am using the new NCAR system, Derecho
>> <https://arc.ucar.edu/knowledge_base/74317833>,
>> which has a Lustre parallel file system.
>>
>> Looking at the difference between the two headers below makes me wonder
>> if the issue is with variable attributes?
>>
>>
>> snip
>>
>>
>> On Fri, Aug 4, 2023 at 9:39 AM Wei-Keng Liao <wkliao at northwestern.edu>
>> wrote:
>>
>>> Hi, Jim
>>>
>>> Can you provide the test program and the file header dumped by "ncdump
>>> -h", if that is available?
>>> Also, what machine was used in the tests, and what is its parallel file
>>> system configuration?
>>> These can help diagnose the problem.
>>>
>>> Wei-keng
>>>
>>> On Aug 4, 2023, at 8:49 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>
>>> I am using ncmpi_iput_varn and ncmpi_wait_all to write output from my
>>> model. I have a test program that does nothing but test the
>>> performance of the write operation. Attached is a plot of performance
>>> in the model and in the standalone application. I'm looking for
>>> clues as to why the model performance is scaling so badly with the
>>> number of variables but the standalone program performance is fine.
>>>
>>>
>>>
>>> --
>>> Jim Edwards
>>>
>>> CESM Software Engineer
>>> National Center for Atmospheric Research
>>> Boulder, CO
>>> <Screenshot 2023-07-27 at 11.49.03 AM.png>
--
Jim Edwards
CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO