performance issue

Jim Edwards jedwards at ucar.edu
Fri Aug 4 14:43:43 CDT 2023


Here is the info for the standalone pioperf program:

dec1281.hsn.de.hpc.ucar.edu 0: PE 0: MPICH MPIIO environment settings ===============================
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_HINTS_DISPLAY                      = 1
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_HINTS                              = NULL
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_ABORT_ON_RW_ERROR                  = disable
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_CB_ALIGN                           = 2
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_DVS_MAXNODES                       = -1
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_AGGREGATOR_PLACEMENT_DISPLAY       = 0
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_AGGREGATOR_PLACEMENT_STRIDE        = -1
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_MAX_NUM_IRECV                      = 50
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_MAX_NUM_ISEND                      = 50
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_MAX_SIZE_ISEND                     = 10485760
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_OFI_STARTUP_CONNECT                = disable
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_OFI_STARTUP_NODES_AGGREGATOR       = 2
dec1281.hsn.de.hpc.ucar.edu 0: PE 0: MPICH MPIIO statistics environment settings ====================
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_STATS                              = 0
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_TIMERS                             = 0
dec1281.hsn.de.hpc.ucar.edu 0: PE 0:   MPICH_MPIIO_WRITE_EXIT_BARRIER                 = 1
dec1281.hsn.de.hpc.ucar.edu 0:  (t_initf) Read in prof_inparm namelist from: pioperf.nl
dec1281.hsn.de.hpc.ucar.edu 0:   Testing decomp: /glade/derecho/scratch/jedwards/piodecomps/32768/dof001.dat
dec1281.hsn.de.hpc.ucar.edu 0:  iotype=           1  of            1
dec1281.hsn.de.hpc.ucar.edu 0: PE 0: MPIIO hints for pioperf.2-0256-1.nc:
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_pfr             = disable
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_fr_types        = aar
dec1281.hsn.de.hpc.ucar.edu 0:           cb_align                 = 2
dec1281.hsn.de.hpc.ucar.edu 0:           cb_buffer_size           = 16777216
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_fr_alignment    = 1
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_ds_threshold    = 0
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_alltoall        = automatic
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_read            = automatic
dec1281.hsn.de.hpc.ucar.edu 0:           romio_cb_write           = automatic
dec1281.hsn.de.hpc.ucar.edu 0:           romio_no_indep_rw        = false
dec1281.hsn.de.hpc.ucar.edu 0:           romio_ds_write           = automatic
dec1281.hsn.de.hpc.ucar.edu 0:           ind_wr_buffer_size       = 524288
dec1281.hsn.de.hpc.ucar.edu 0:           romio_ds_read            = disable
dec1281.hsn.de.hpc.ucar.edu 0:           ind_rd_buffer_size       = 4194304
dec1281.hsn.de.hpc.ucar.edu 0:           direct_io                = false
dec1281.hsn.de.hpc.ucar.edu 0:           striping_factor          = 24
dec1281.hsn.de.hpc.ucar.edu 0:           striping_unit            = 16777216
dec1281.hsn.de.hpc.ucar.edu 0:           romio_lustre_start_iodevice = -1
dec1281.hsn.de.hpc.ucar.edu 0:           aggregator_placement_stride = -1
dec1281.hsn.de.hpc.ucar.edu 0:           abort_on_rw_error        = disable
dec1281.hsn.de.hpc.ucar.edu 0:           cb_config_list           = *:*
dec1281.hsn.de.hpc.ucar.edu 0:           cb_nodes                 = 24
dec1281.hsn.de.hpc.ucar.edu 0:           cray_fileoff_based_aggr  = false
dec1281.hsn.de.hpc.ucar.edu 0:           romio_filesystem_type    = CRAY ADIO:
dec1281.hsn.de.hpc.ucar.edu 0:  RESULT: write    SUBSET         1       256        64    12202.4897587597       24.8799532720
dec1281.hsn.de.hpc.ucar.edu 0:   complete
dec1281.hsn.de.hpc.ucar.edu 0:   calling mpi finalize


And here it is for cesm:
PE 0:   MPICH_MPIIO_HINTS_DISPLAY                      = 1
PE 0:   MPICH_MPIIO_HINTS                              = *:cb_nodes=24:striping_factor=24
PE 0:   MPICH_MPIIO_ABORT_ON_RW_ERROR                  = disable
PE 0:   MPICH_MPIIO_CB_ALIGN                           = 2
PE 0:   MPICH_MPIIO_DVS_MAXNODES                       = -1
PE 0:   MPICH_MPIIO_AGGREGATOR_PLACEMENT_DISPLAY       = 0
PE 0:   MPICH_MPIIO_AGGREGATOR_PLACEMENT_STRIDE        = -1
PE 0:   MPICH_MPIIO_MAX_NUM_IRECV                      = 50
PE 0:   MPICH_MPIIO_MAX_NUM_ISEND                      = 50
PE 0:   MPICH_MPIIO_MAX_SIZE_ISEND                     = 10485760
PE 0:   MPICH_MPIIO_OFI_STARTUP_CONNECT                = disable
PE 0:   MPICH_MPIIO_OFI_STARTUP_NODES_AGGREGATOR        = 2
PE 0:   MPICH_MPIIO_STATS                              = 0
PE 0:   MPICH_MPIIO_TIMERS                             = 0
PE 0:   MPICH_MPIIO_WRITE_EXIT_BARRIER                 = 1

PE 0: MPIIO hints for SMS_D_Ln9.mpasa7p5_mpasa7p5_mg17.QPC6.derecho_intel.cam-outfrq9s.20230726_094231_iz24v6.cam.h0.0001-01-01-00000.nc:
          romio_cb_pfr             = disable
          romio_cb_fr_types        = aar
          cb_align                 = 2
          cb_buffer_size           = 16777216
          romio_cb_fr_alignment    = 1
          romio_cb_ds_threshold    = 0
          romio_cb_alltoall        = automatic
          romio_cb_read            = automatic
          romio_cb_write           = automatic
          romio_no_indep_rw        = false
          romio_ds_write           = automatic
          ind_wr_buffer_size       = 524288
          romio_ds_read            = disable
          ind_rd_buffer_size       = 4194304
          direct_io                = false
          striping_factor          = 24
          striping_unit            = 1048576
          romio_lustre_start_iodevice = -1
          aggregator_placement_stride = -1
          abort_on_rw_error        = disable
          cb_config_list           = *:*
          cb_nodes                 = 1
          cray_fileoff_based_aggr  = false
          romio_filesystem_type    = CRAY ADIO:


I'm trying to understand why the cesm file ends up with cb_nodes=1 even though
striping_factor=24 and I explicitly set a hint for cb_nodes=24.
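
For reference, here is a minimal standalone sketch (not the PIO code; the file
name and hint values are placeholders) of passing those hints through an
MPI_Info object at ncmpi_create and reading back what the MPI-IO layer actually
kept:

#include <stdio.h>
#include <mpi.h>
#include <pnetcdf.h>

int main(int argc, char **argv)
{
    int ncid, flag, err;
    char value[MPI_MAX_INFO_VAL + 1];
    MPI_Info info, info_used;

    MPI_Init(&argc, &argv);

    /* request 24 aggregators and 24 Lustre stripes of 16 MiB each */
    MPI_Info_create(&info);
    MPI_Info_set(info, "cb_nodes", "24");
    MPI_Info_set(info, "striping_factor", "24");
    MPI_Info_set(info, "striping_unit", "16777216");

    err = ncmpi_create(MPI_COMM_WORLD, "hints_test.nc",
                       NC_CLOBBER | NC_64BIT_DATA, info, &ncid);
    if (err != NC_NOERR) printf("ncmpi_create: %s\n", ncmpi_strerror(err));
    err = ncmpi_enddef(ncid);
    if (err != NC_NOERR) printf("ncmpi_enddef: %s\n", ncmpi_strerror(err));

    /* ask PnetCDF which hint values are actually in effect on the file */
    ncmpi_inq_file_info(ncid, &info_used);
    MPI_Info_get(info_used, "cb_nodes", MPI_MAX_INFO_VAL, value, &flag);
    if (flag) printf("effective cb_nodes        = %s\n", value);
    MPI_Info_get(info_used, "striping_factor", MPI_MAX_INFO_VAL, value, &flag);
    if (flag) printf("effective striping_factor = %s\n", value);

    MPI_Info_free(&info_used);
    MPI_Info_free(&info);
    ncmpi_close(ncid);
    MPI_Finalize();
    return 0;
}

If the value read back here is still 1, the hint is being overridden inside the
MPI-IO layer rather than being dropped by PIO or PnetCDF.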

On Fri, Aug 4, 2023 at 1:28 PM Wei-Keng Liao <wkliao at northwestern.edu>
wrote:

> Hi, Jim
>
> Could you print all the MPI-IO hints used with the Lustre progressive file
> layout? That may reveal some information.
>
> Wei-keng
>
> On Aug 4, 2023, at 1:45 PM, Wei-Keng Liao <wkliao at northwestern.edu> wrote:
>
> Looks like the cesm file is using the "Lustre progressive file layout", which
> is a new striping strategy. My guess is that it is enabled center-wide by
> default. Rob Latham at ANL has more experience with this feature. He may have
> some suggestions.
>
> In the meantime, can you write to a new folder and explicitly set its Lustre
> striping count to a larger number, since the total write amount is more than
> 300 GB? This old-fashioned setting may give more consistent timing.
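> (For example, something like "lfs setstripe -c 24 -S 16M <new_dir>" on the new
> directory before the file is created; the stripe count and size here are only
> illustrative and site-dependent.)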
>
>
> Wei-keng
>
> On Aug 4, 2023, at 1:27 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> oh and the pnetcdf version is 1.12.3
>
> On Fri, Aug 4, 2023 at 12:25 PM Jim Edwards <jedwards at ucar.edu> wrote:
>
>> Rewriting the error handling code is on my list of things to do in PIO,
>> but that issue is the same for both cases, so I don't think it plays a
>> role in the difference we are seeing.
>>
>> The lfs getstripe output for the two files is nearly identical; I show
>> only the cesm file here.
>>
>>   lcm_layout_gen:    7
>>   lcm_mirror_count:  1
>>   lcm_entry_count:   4
>>     lcme_id:             1
>>     lcme_mirror_id:      0
>>     lcme_flags:          init
>>     lcme_extent.e_start: 0
>>     lcme_extent.e_end:   16777216
>>       lmm_stripe_count:  1
>>       lmm_stripe_size:   1048576
>>       lmm_pattern:       raid0
>>       lmm_layout_gen:    0
>>       lmm_stripe_offset: 27
>>       lmm_objects:
>>       - 0: { l_ost_idx: 27, l_fid: [0xa80000401:0x3f2ef8:0x0] }
>>
>>     lcme_id:             2
>>     lcme_mirror_id:      0
>>     lcme_flags:          init
>>     lcme_extent.e_start: 16777216
>>     lcme_extent.e_end:   17179869184
>>       lmm_stripe_count:  4
>>       lmm_stripe_size:   16777216
>>       lmm_pattern:       raid0
>>       lmm_layout_gen:    0
>>       lmm_stripe_offset: 35
>>       lmm_objects:
>>       - 0: { l_ost_idx: 35, l_fid: [0xc80000401:0x3f32e5:0x0] }
>>       - 1: { l_ost_idx: 39, l_fid: [0xcc0000402:0x3f3162:0x0] }
>>       - 2: { l_ost_idx: 43, l_fid: [0xe80000402:0x3f2f3a:0x0] }
>>       - 3: { l_ost_idx: 47, l_fid: [0xec0000401:0x3f3017:0x0] }
>>
>>     lcme_id:             3
>>     lcme_mirror_id:      0
>>     lcme_flags:          init
>>     lcme_extent.e_start: 17179869184
>>     lcme_extent.e_end:   68719476736
>>       lmm_stripe_count:  12
>>       lmm_stripe_size:   16777216
>>       lmm_pattern:       raid0
>>       lmm_layout_gen:    0
>>       lmm_stripe_offset: 16
>>       lmm_objects:
>>       - 0: { l_ost_idx: 16, l_fid: [0x700000402:0x4021eb:0x0] }
>>       - 1: { l_ost_idx: 20, l_fid: [0x740000402:0x4020ae:0x0] }
>>       - 2: { l_ost_idx: 24, l_fid: [0x900000400:0x401f68:0x0] }
>>       - 3: { l_ost_idx: 28, l_fid: [0x940000400:0x401f71:0x0] }
>>       - 4: { l_ost_idx: 32, l_fid: [0xb00000402:0x40220c:0x0] }
>>       - 5: { l_ost_idx: 36, l_fid: [0xb40000402:0x40210b:0x0] }
>>       - 6: { l_ost_idx: 40, l_fid: [0xd00000402:0x402141:0x0] }
>>       - 7: { l_ost_idx: 44, l_fid: [0xd40000400:0x401e90:0x0] }
>>       - 8: { l_ost_idx: 48, l_fid: [0xf00000401:0x401e08:0x0] }
>>       - 9: { l_ost_idx: 52, l_fid: [0xf40000400:0x401e32:0x0] }
>>       - 10: { l_ost_idx: 56, l_fid: [0x1100000402:0x4022e0:0x0] }
>>       - 11: { l_ost_idx: 60, l_fid: [0x1140000402:0x4020a6:0x0] }
>>
>>     lcme_id:             4
>>     lcme_mirror_id:      0
>>     lcme_flags:          init
>>     lcme_extent.e_start: 68719476736
>>     lcme_extent.e_end:   EOF
>>       lmm_stripe_count:  24
>>       lmm_stripe_size:   16777216
>>       lmm_pattern:       raid0
>>       lmm_layout_gen:    0
>>       lmm_stripe_offset: 51
>>       lmm_objects:
>>       - 0: { l_ost_idx: 51, l_fid: [0x1080000400:0x402881:0x0] }
>>       - 1: { l_ost_idx: 55, l_fid: [0x10c0000400:0x402955:0x0] }
>>       - 2: { l_ost_idx: 59, l_fid: [0x1280000400:0x4027d6:0x0] }
>>       - 3: { l_ost_idx: 63, l_fid: [0x12c0000401:0x402ab2:0x0] }
>>       - 4: { l_ost_idx: 67, l_fid: [0x1480000400:0x402b75:0x0] }
>>       - 5: { l_ost_idx: 71, l_fid: [0x14c0000400:0x4028d2:0x0] }
>>       - 6: { l_ost_idx: 75, l_fid: [0x1680000401:0x402a3b:0x0] }
>>       - 7: { l_ost_idx: 79, l_fid: [0x16c0000402:0x40294d:0x0] }
>>       - 8: { l_ost_idx: 83, l_fid: [0x1880000401:0x40299c:0x0] }
>>       - 9: { l_ost_idx: 87, l_fid: [0x18c0000402:0x402f5e:0x0] }
>>       - 10: { l_ost_idx: 91, l_fid: [0x1a80000400:0x402a16:0x0] }
>>       - 11: { l_ost_idx: 95, l_fid: [0x1ac0000400:0x402bd2:0x0] }
>>       - 12: { l_ost_idx: 0, l_fid: [0x300000401:0x402a2e:0x0] }
>>       - 13: { l_ost_idx: 4, l_fid: [0x340000402:0x4027d2:0x0] }
>>       - 14: { l_ost_idx: 8, l_fid: [0x500000402:0x402a26:0x0] }
>>       - 15: { l_ost_idx: 12, l_fid: [0x540000400:0x402943:0x0] }
>>       - 16: { l_ost_idx: 64, l_fid: [0x1300000402:0x402c10:0x0] }
>>       - 17: { l_ost_idx: 68, l_fid: [0x1340000401:0x4029c6:0x0] }
>>       - 18: { l_ost_idx: 72, l_fid: [0x1500000402:0x402d11:0x0] }
>>       - 19: { l_ost_idx: 76, l_fid: [0x1540000402:0x402be2:0x0] }
>>       - 20: { l_ost_idx: 80, l_fid: [0x1700000400:0x402a64:0x0] }
>>       - 21: { l_ost_idx: 84, l_fid: [0x1740000401:0x402b11:0x0] }
>>       - 22: { l_ost_idx: 88, l_fid: [0x1900000402:0x402cb8:0x0] }
>>       - 23: { l_ost_idx: 92, l_fid: [0x1940000400:0x402cd3:0x0] }
>>
>> On Fri, Aug 4, 2023 at 12:12 PM Wei-Keng Liao <wkliao at northwestern.edu>
>> wrote:
>>
>>>
>>> I can see the file header size is 20620 bytes. Because all attributes are
>>> stored in the header, the cost of writing them should not be an issue. I
>>> also see no gap between two consecutive variables, which is good, meaning
>>> the write requests made by MPI-IO will be contiguous.
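>>> (For example, in the pioperf dump quoted below, vard0001 ends at file offset
>>> 4865401760 and vard0002 starts at that same offset, so the gap is 0; each
>>> record is 58 x 10485762 doubles = 4865393568 bytes.)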
>>>
>>> If the call sequence of PnetCDF APIs is the same between pioperf and cesm,
>>> then the performance should be similar. Can you check the Lustre striping
>>> settings of the two output files using the command "lfs getstripe"?
>>>
>>> If you set any MPI-IO hints, they can also play a role in performance.
>>> See the example in PnetCDF for how to dump all hints (function
>>> print_info()).
>>>
>>> https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/get_info.c
>>>
>>> If all the above checked out right, then using Darshan should reveal
>>> more information.
>>>
>>> BTW, what PnetCDF version is being used?
>>>
>>> A comment about PIOc_put_att_tc:
>>> * calling MPI_Bcast to check the error code may not be necessary. PnetCDF
>>>   does such checks, and all metadata consistency checks, at ncmpi_enddef.
>>>   If the number of variables and their attributes is high, calling lots of
>>>   MPI_Bcast can be expensive.
>>>
>>> https://github.com/NCAR/ParallelIO/blob/f45ba898bec31e6cd662ac41f43e0cff14f928b2/src/clib/pio_getput_int.c#L213
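>>> (A rough sketch of that alternative, with made-up names: record any
>>> define-mode error locally and let ncmpi_enddef do the one-time
>>> consistency check:)
>>>
>>> #include <string.h>
>>> #include <pnetcdf.h>
>>>
>>> /* sketch only: no MPI_Bcast after each attribute call */
>>> static int write_atts_then_enddef(int ncid, int nvars, const int *varids,
>>>                                   char *const units[])
>>> {
>>>     int v, err, first_err = NC_NOERR;
>>>
>>>     for (v = 0; v < nvars; v++) {
>>>         err = ncmpi_put_att_text(ncid, varids[v], "units",
>>>                                  strlen(units[v]), units[v]);
>>>         if (err != NC_NOERR && first_err == NC_NOERR)
>>>             first_err = err;      /* remember locally, do not broadcast */
>>>     }
>>>     err = ncmpi_enddef(ncid);     /* metadata consistency is checked here */
>>>     return (first_err != NC_NOERR) ? first_err : err;
>>> }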
>>>
>>>
>>> Wei-keng
>>>
>>> On Aug 4, 2023, at 12:32 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>
>>> Yes, _enddef is called only once.
>>>
>>> Here <https://github.com/NCAR/ParallelIO/blob/main/src/clib/pio_getput_int.c#L128>
>>> is the code that writes attributes. Here
>>> <https://github.com/NCAR/ParallelIO/blob/main/src/clib/pio_darray_int.c#L661>
>>> is where variables are written.
>>>
>>> ncoffsets -sg pioperf.2-0256-1.nc
>>> netcdf pioperf.2-0256-1.nc
>>>  {
>>> // file format: CDF-5
>>>
>>> file header:
>>> size   = 7804 bytes
>>> extent = 8192 bytes
>>>
>>> dimensions:
>>> dim000001 = 10485762
>>> dim000002 = 58
>>> time = UNLIMITED // (1 currently)
>>>
>>> record variables:
>>> double vard0001(time, dim000002, dim000001):
>>>       start file offset =        8192    (0th record)
>>>       end   file offset =  4865401760    (0th record)
>>>       size in bytes     =  4865393568    (of one record)
>>>       gap from prev var =         388
>>> double vard0002(time, dim000002, dim000001):
>>>       start file offset =  4865401760    (0th record)
>>>       end   file offset =  9730795328    (0th record)
>>>       size in bytes     =  4865393568    (of one record)
>>>       gap from prev var =           0
>>>
>>>
>>> snip
>>>
>>> double vard0064(time, dim000002, dim000001):
>>>       start file offset =306519802976    (0th record)
>>>       end   file offset =311385196544    (0th record)
>>>       size in bytes     =  4865393568    (of one record)
>>>       gap from prev var =           0
>>> }
>>>
>>> ncoffsets -sg run/SMS_D_Ln9.mpasa7p5_mpasa7p5_mg17.QPC6.derecho_intel.cam-outfrq9s.20230726_094231_iz24v6.cam.h0.0001-01-01-03600.nc
>>> netcdf run/SMS_D_Ln9.mpasa7p5_mpasa7p5_mg17.QPC6.derecho_intel.cam-outfrq9s.20230726_094231_iz24v6.cam.h0.0001-01-01-03600.nc
>>>  {
>>> // file format: CDF-5
>>>
>>> file header:
>>> size   = 20620 bytes
>>> extent = 16777216 bytes
>>>
>>> dimensions:
>>> ncol = 10485762
>>> time = UNLIMITED // (1 currently)
>>> nbnd = 2
>>> chars = 8
>>> lev = 58
>>> ilev = 59
>>>
>>> fixed-size variables:
>>> double lat(ncol):
>>>       start file offset =    16777216
>>>       end   file offset =   100663312
>>>       size in bytes     =    83886096
>>>       gap from prev var =    16756596
>>> double lon(ncol):
>>>       start file offset =   100663312
>>>       end   file offset =   184549408
>>>       size in bytes     =    83886096
>>>       gap from prev var =           0
>>>
>>>
>>> snip
>>>
>>> int    mdt:
>>>       start file offset =   352322552
>>>       end   file offset =   352322556
>>>       size in bytes     =           4
>>>       gap from prev var =           0
>>>
>>> record variables:
>>> double time(time):
>>>       start file offset =   352322556    (0th record)
>>>       end   file offset =   352322564    (0th record)
>>>       size in bytes     =           8    (of one record)
>>>       gap from prev var =           0
>>> int    date(time):
>>>       start file offset =   352322564    (0th record)
>>>       end   file offset =   352322568    (0th record)
>>>       size in bytes     =           4    (of one record)
>>>       gap from prev var =           0
>>>
>>>
>>> snip
>>>
>>> double STEND_CLUBB(time, lev, ncol):
>>>       start file offset =306872117448    (0th record)
>>>       end   file offset =311737511016    (0th record)
>>>       size in bytes     =  4865393568    (of one record)
>>>       gap from prev var =           0
>>> }
>>>
>>> On Fri, Aug 4, 2023 at 10:35 AM Wei-Keng Liao <wkliao at northwestern.edu>
>>> wrote:
>>>
>>>> Can you run the command "ncoffsets -sg file.nc", which shows the sizes of
>>>> the file header and all variables? For the cesm case, is _enddef called
>>>> only once?
>>>>
>>>> Could you also point me to the program files that call PnetCDF APIs,
>>>> including
>>>> writing attributes and variables?
>>>>
>>>>
>>>> Wei-keng
>>>>
>>>> On Aug 4, 2023, at 11:05 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>>
>>>> I am using the new NCAR system, derecho
>>>> <https://arc.ucar.edu/knowledge_base/74317833>,
>>>> which has a Lustre parallel file system.
>>>>
>>>> Looking at the difference between the two headers below makes me wonder
>>>> whether the issue is with the variable attributes.
>>>>
>>>>
>>>> snip
>>>>
>>>>
>>>> On Fri, Aug 4, 2023 at 9:39 AM Wei-Keng Liao <wkliao at northwestern.edu>
>>>> wrote:
>>>>
>>>>> Hi, Jim
>>>>>
>>>>> Can you provide the test program and the file header dumped by
>>>>> "ncdump -h", if that is available?
>>>>> Also, what machine was used in the tests, and what is its parallel file
>>>>> system configuration?
>>>>> These can help with the diagnosis.
>>>>>
>>>>> Wei-keng
>>>>>
>>>>> On Aug 4, 2023, at 8:49 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>>>>>
>>>>> I am using ncmpi_iput_varn and ncmpi_wait_all to write output from my
>>>>> model.   I have a test program that does nothing but test the
>>>>> performance of the write operation.   Attached is a plot of
>>>>> performance in the model and in the standalone application.   I'm looking
>>>>> for clues as to why the model performance scales so badly with the number
>>>>> of variables while the standalone program performance is fine.
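>>>>>
>>>>> (For context, a minimal sketch of that nonblocking write pattern; the
>>>>> decomposition arrays and names below are placeholders, not the PIO code:)
>>>>>
>>>>> #include <stdlib.h>
>>>>> #include <mpi.h>
>>>>> #include <pnetcdf.h>
>>>>>
>>>>> /* post one nonblocking varn write per variable, then flush them all so
>>>>>    the pending requests can be aggregated into collective writes */
>>>>> static int write_vars(int ncid, int nvars, const int *varids, int nsegs,
>>>>>                       MPI_Offset *const starts[], MPI_Offset *const counts[],
>>>>>                       double *const bufs[], MPI_Offset bufcount)
>>>>> {
>>>>>     int v, err = NC_NOERR;
>>>>>     int *reqs  = (int *) malloc(nvars * sizeof(int));
>>>>>     int *stats = (int *) malloc(nvars * sizeof(int));
>>>>>
>>>>>     for (v = 0; v < nvars && err == NC_NOERR; v++)
>>>>>         err = ncmpi_iput_varn(ncid, varids[v], nsegs, starts, counts,
>>>>>                               bufs[v], bufcount, MPI_DOUBLE, &reqs[v]);
>>>>>
>>>>>     if (err == NC_NOERR)
>>>>>         err = ncmpi_wait_all(ncid, nvars, reqs, stats);
>>>>>
>>>>>     free(reqs); free(stats);
>>>>>     return err;
>>>>> }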
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jim Edwards
>>>>>
>>>>> CESM Software Engineer
>>>>> National Center for Atmospheric Research
>>>>> Boulder, CO
>>>>> <Screenshot 2023-07-27 at 11.49.03 AM.png>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Jim Edwards
>>>>
>>>> CESM Software Engineer
>>>> National Center for Atmospheric Research
>>>> Boulder, CO
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jim Edwards
>>>
>>> CESM Software Engineer
>>> National Center for Atmospheric Research
>>> Boulder, CO
>>>
>>>
>>>
>>
>> --
>> Jim Edwards
>>
>> CESM Software Engineer
>> National Center for Atmospheric Research
>> Boulder, CO
>>
>
>
> --
> Jim Edwards
>
> CESM Software Engineer
> National Center for Atmospheric Research
> Boulder, CO
>
>
>
>

-- 
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO