<div dir="ltr"><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">The cb_nodes was different between the test program and the cesm but I was able</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">to figure out what the issue was and reproduce it in the test program so now cb_nodes is the same for <br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">both files - both are being written by the test program and the only difference between them is that <br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">the slow one has one additional variable which is a scalar and is not a record variable. <br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 9, 2023 at 3:57 PM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu">wkliao@northwestern.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
I thoughts the cb_nodes values are different between the two runs, based
<div>on one of your earlier emails. Can you try the example C program I modified</div>
<div>to include scalar and record variables? It reports timings and cb_nodes value.</div>
<div><br>
</div>
<div><br>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 9, 2023, at 4:22 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
The only difference in the two files is the addition of a single scalar string variable.</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Why would that significantly change this? <br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I changed the striping on the directory using <span style="color:rgb(29,28,29);font-family:Slack-Lato,Slack-Fractions,appleLogo,sans-serif;font-size:15px;font-style:normal;font-variant-ligatures:common-ligatures;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">
lfs setstripe -c -1 <br>
</span></div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<span style="color:rgb(29,28,29);font-family:Slack-Lato,Slack-Fractions,appleLogo,sans-serif;font-size:15px;font-style:normal;font-variant-ligatures:common-ligatures;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">I
did this because it exaggerates the performance difference. <br>
</span></div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<span style="color:rgb(29,28,29);font-family:Slack-Lato,Slack-Fractions,appleLogo,sans-serif;font-size:15px;font-style:normal;font-variant-ligatures:common-ligatures;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">lmm_stripe_count:
96<br>
lmm_stripe_size: 1048576<br>
lmm_pattern: raid0<br>
lmm_layout_gen: 0<br>
lmm_stripe_offset: 6</span></div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<span style="color:rgb(29,28,29);font-family:Slack-Lato,Slack-Fractions,appleLogo,sans-serif;font-size:15px;font-style:normal;font-variant-ligatures:common-ligatures;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none"><br>
</span></div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<span style="color:rgb(29,28,29);font-family:Slack-Lato,Slack-Fractions,appleLogo,sans-serif;font-size:15px;font-style:normal;font-variant-ligatures:common-ligatures;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">The
original problem was on 32786 tasks - I can now see it on 2048 tasks. <br>
</span></div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Aug 9, 2023 at 3:13 PM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>Googling gave me this.
<div>"The number of the gaps and the size is related to how many seek operation happens and how much is the size of the file in bytes that is skipped to write the next part."
<div><br>
</div>
<div><br>
</div>
<div>Are you still using the default file striping settings?</div>
<div><br>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 9, 2023, at 3:51 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I spent a little time trying to do this but gave up and went back to using cray profiling tools to get more info.</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
One thing really stands out to me:</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
This is for the fast write:<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0: | number of write
gaps = 2<br>
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0: | ave write gap
size = 9722924978<br>
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0:
<ins>--------------------------------------------------------</ins><br>
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0: RESULT: write SUBSET
1 16 64 4060.0217755460 4.5714040530</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
And this is for the slow one:<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0: | number of write
gaps = 1020<br>
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0: | ave write gap
size = 19079761<br>
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0:
<ins>--------------------------------------------------------</ins><br>
<a href="https://urldefense.com/v3/__http://dec1793.hsn.de.hpc.ucar.edu__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-eNzr_M-E$" target="_blank">dec1793.hsn.de.hpc.ucar.edu</a> 0: RESULT: write SUBSET
1 16 64 76.2558020443 243.3913158400</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Do you understand? <br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Aug 8, 2023 at 11:50 AM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>I have revised the example program to add writes to scalar and record variables.
<div>Let me know if that works for you. URL again is below.<br>
<div><br>
</div>
<div><a href="https://urldefense.com/v3/__https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c__;!!Dq0X2DkFhyF93HkjWTBQKhk!SP4va2rvVHU4KEb9PEsINCVGTkdEiITT61-aKfQXjWRmlCGrqiRw6rNt8YvXMwO2eOwRL2T7qyjXj_-e9eaX1iY$" target="_blank">https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c</a><br>
<div><br>
</div>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 7, 2023, at 6:10 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
That example doesn't include record variables. Do you have a similar one with record vars?</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Aug 7, 2023 at 4:32 PM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>Hi, Jim
<div><br>
</div>
<div>To eliminate the overheads of PIO, I suggest to use this PnetCDF example program</div>
<div>and add a scalar variable to see if the same happens.</div>
<div><br>
</div>
<div><a href="https://urldefense.com/v3/__https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c__;!!Dq0X2DkFhyF93HkjWTBQKhk!RGlLkVUbuYrrGrSkShv42nz4KqtPJK0FiNzPuYKV-esdwU5UcgKr0xLvQpOooAfY4n2UMB8meSG2ZanhcYgGU_Q$" target="_blank">https://github.com/Parallel-NetCDF/PnetCDF/blob/master/examples/C/nonblocking_write.c</a><br>
</div>
<div></div>
<div><br>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 7, 2023, at 4:28 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Hi Wei-Keng,</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
The cb_nodes doesn't seem to be affected. <br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Not using independent mode doesn't seem to have helped. I have the pioperf program now writing two files. One with only
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
decomposed fields and one with one additional field, rundate, which is a string with the date in it.</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
The performance is drastically different:</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
IO tasks vars Mb/s Time (s)<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
RESULT: write SUBSET 1 256 64 12067.7548254854 25.1577347560 (without scalar)<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
RESULT: write SUBSET 1 256 64 286.4615089145 1059.8190875640 (with scalar)
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Aug 7, 2023 at 1:47 PM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Is that the reason for why cb_nodes is 1?<br>
Strange, because cb_nodes is set at the file open time.<br>
<br>
Entering the independent data mode in PnetCDF can be completely avoided<br>
if using the nonblocking APIs.<br>
<br>
I would suggest your codes to use the nonblocking APIs in the following way.<br>
<br>
/* for non-partitioned variables */<br>
if (rank == 0) {<br>
ncmpi_iput_var_int(fh, varid[0], data[0], &req[0]); /* write the whole variable */<br>
ncmpi_iput_var_int(fh, varid[1], data[1], &req[1]);<br>
...<br>
}<br>
/* for partitioned variables */<br>
ncmpi_iput_vara_int(fh, varid[j], data[j], starts[j], counts[j], &req[j]);<br>
...<br>
<br>
<br>
/* commit all posted nonblocking requests */<br>
ncmpi_wait_all(ncid, NC_REQ_ALL, NC_REQ_NULL, NULL);<br>
<br>
<br>
Wei-keng<br>
<br>
> On Aug 7, 2023, at 2:12 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:<br>
> <br>
> Hi Wei-Keng,<br>
> <br>
> I think that I've found the problem. In the model I am writing a number of scalar variables to the file as well as the decomposed variables.<br>
> for the scalar variables I use a code structure like:<br>
> <br>
> ncmpi_begin_indep_data(fh);<br>
> ncmpi_put_vars_int(fh, varid, start, count, stride, data);<br>
> ncmpi_end_indep_data(fh);<br>
> <br>
> In my pioperf test code I didn't write any scalars - this morning I added one and the write performance for the decomposed variables got very very
<br>
> bad. What can I do about it?<br>
> <br>
> Jim<br>
> <br>
> <br>
> -- <br>
> Jim Edwards<br>
> <br>
> CESM Software Engineer<br>
> National Center for Atmospheric Research<br>
> Boulder, CO <br>
<br>
</blockquote>
</div>
<br clear="all">
<br>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<br>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<br>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<br>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote></div><br clear="all"><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div><div>Jim Edwards<br><br></div><font size="1">CESM Software Engineer<br></font></div><font size="1">National Center for Atmospheric Research<br></font></div><font size="1">Boulder, CO</font> <br></div></div>