<div dir="ltr"><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">Hi Wei-Keng,</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d"><br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">Thanks for looking into this. Because the allocations in pioperformance.F90 are done on the compute nodes and</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">not the IO nodes, I don't think that your suggestion would make any difference. I also wonder why this issue <br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">appears to be so specific to the Lustre file system - presumably the ROMIO functionality you speak of is general and not</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">specific to Lustre? Anyway, your analysis spurred me to try something else, which seems to work: prior to calling</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">ncmpi_iput_varn in pio_darray_int.c, I added a call to ncmpi_wait_all to make sure that any existing buffer was written.</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d"><br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">This seems to have fixed the problem, and my writes are now:</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">RESULT: write SUBSET 1 16 64 4787.6393342631 3.8766495770<br>RESULT: write SUBSET 1 16 64 4803.9296372205 3.8635037150<br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d"><br></div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Aug 14, 2023 at 9:47 AM 
Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="auto">
<div dir="auto">
<div dir="auto">
<div dir="auto">
<div dir="auto">
<div dir="auto">
Hi, Jim
<div><br>
</div>
<div>Digging into the ROMIO source code, I found that the root cause of the timing</div>
<div>difference between the two test cases is whether or not the user buffer</div>
<div>passed to MPI_File_write_all is contiguous.</div>
<div><br>
</div>
<div>In your test program, the write buffers for all record variables are</div>
<div>allocated in a <span style="color:rgb(0,0,0)">contiguous space, while the</span> fixed-size variable is in a</div>
<div>separate memory space.</div>
<div><a href="https://github.com/jedwards4b/ParallelIO/blob/25b471d5864db1cf7b8dfa26bd5d568eceba1a04/tests/performance/pioperformance.F90#L220-L227" target="_blank">https://github.com/jedwards4b/ParallelIO/blob/25b471d5864db1cf7b8dfa26bd5d568eceba1a04/tests/performance/pioperformance.F90#L220-L227</a></div>
<div><br>
</div>
<div><span style="color:rgb(0,0,0)"><br>
</span></div>
<div><span style="color:rgb(0,0,0)">Therefore, when an extra fixed-size variable is written, the </span><span style="color:rgb(0,0,0)">aggregated</span></div>
<div><span style="color:rgb(0,0,0)">write buffer is noncontiguous, while in the other case it is </span><span style="color:rgb(0,0,0)">contiguous</span><span style="color:rgb(0,0,0)">.</span></div>
<div><br>
</div>
<div>When the write buffer is not contiguous, ROMIO allocates an internal buffer,</div>
<div>copies the data over, and uses it to perform communication. When the buffer</div>
<div>is contiguous, ROMIO uses the user buffer <span style="color:rgb(0,0,0)">directly for communication</span>.</div>
<div>Such copying can become expensive when the amount of data written is large.</div>
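<div><br>
</div>
<div>The two paths described above can be pictured with a small Python model. This is only an illustrative sketch under stated assumptions, not ROMIO's actual code: the function name gather_for_write is hypothetical, and a list of byte segments stands in for a noncontiguous user buffer.</div>
<pre>
```python
def gather_for_write(segments):
    """Return (buffer, copied_bytes) for a list of bytes-like user segments.

    Hypothetical model, not ROMIO code: a single segment stands for a
    contiguous user buffer and is passed through untouched; several
    segments stand for a noncontiguous buffer and are packed into a
    newly allocated staging buffer first.
    """
    if len(segments) == 1:
        # Contiguous case: the user buffer is used directly, nothing is copied.
        return segments[0], 0

    # Noncontiguous case: allocate an internal buffer and copy every piece.
    staging = bytearray(sum(len(seg) for seg in segments))
    offset = 0
    for seg in segments:
        staging[offset:offset + len(seg)] = seg
        offset += len(seg)
    return staging, offset
```
</pre>
<div>In this model the copy cost grows linearly with the total number of bytes written, which is why the extra staging step matters most for large writes.</div>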
<div><br>
</div>
<div>If you want to verify this finding, please try allocating the</div>
<div>buffers of individual record variables separately. Let me know how it goes.</div>
<div>
<div><br>
</div>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 11, 2023, at 6:03 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Hi Wei-keng,</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
For this case I'm using a RoundRobin distribution as shown here.<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
if(doftype .eq. 'ROUNDROBIN') then<br>
do i=1,varsize<br>
compmap(i) = (i-1)*npe+mype+1<br>
enddo</div>
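<div>For reference, the same round-robin mapping can be written as a short Python sketch. The function name roundrobin_compmap is hypothetical; varsize, npe, and mype are taken from the Fortran loop above, with mype assumed to be the 0-based rank of the calling process.</div>
<pre>
```python
def roundrobin_compmap(varsize, npe, mype):
    """Return the 1-based global offsets owned by rank `mype`.

    Mirrors the Fortran loop compmap(i) = (i-1)*npe + mype + 1 for
    i = 1..varsize. Assumes mype is a 0-based rank (an assumption, not
    stated in the original message).
    """
    return [(i - 1) * npe + mype + 1 for i in range(1, varsize + 1)]
```
</pre>
<div>With npe = 4 and varsize = 3, rank 0 would own global offsets 1, 5, 9 and rank 1 would own 2, 6, 10, so neighboring elements in the file belong to different ranks.</div>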
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
</div>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div><br clear="all"><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div><div>Jim Edwards<br><br></div><font size="1">CESM Software Engineer<br></font></div><font size="1">National Center for Atmospheric Research<br></font></div><font size="1">Boulder, CO</font> <br></div></div>