<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">
<div>Did you run the same tests on a non-Lustre file system and see no difference?</div>
Can you show me the timings?
<div><br>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 14, 2023, at 11:54 AM, Jim Edwards &lt;jedwards@ucar.edu&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
Hi Wei-Keng,</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
Thanks for looking into this. Because the allocations in pioperformance.F90 are done on the compute nodes and</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
not the IO nodes, I don't think that your suggestion would make any difference. I also wonder why this issue
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
appears to be so specific to the Lustre file system - presumably the ROMIO functionality you speak of is general and not</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
specific to Lustre? Anyway, your analysis spurred me to try something else, which seems to work: prior to calling</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
ncmpi_iput_varn in pio_darray_int.c I added a call to ncmpi_wait_all to make sure that any existing buffer was written.</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
This seems to have fixed the problem, and my write results are now:</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
RESULT: write SUBSET 1 16 64 4787.6393342631 3.8766495770<br>
RESULT: write SUBSET 1 16 64 4803.9296372205 3.8635037150<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Aug 14, 2023 at 9:47 AM Wei-Keng Liao &lt;<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>&gt; wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="auto">
<div dir="auto">
<div dir="auto">
<div dir="auto">
<div dir="auto">
<div dir="auto">Hi, Jim
<div><br>
</div>
<div>Digging into the ROMIO source code, I found that the root cause of the timing</div>
<div>difference between the two test cases is whether or not the user buffer</div>
<div>passed to MPI_File_write_all is contiguous.</div>
<div><br>
</div>
<div>In your test program, the write buffers for all record variables are</div>
<div>allocated in a contiguous space, while the fixed-size variable is in a</div>
<div>separate memory space.</div>
<div><a href="https://github.com/jedwards4b/ParallelIO/blob/25b471d5864db1cf7b8dfa26bd5d568eceba1a04/tests/performance/pioperformance.F90#L220-L227" target="_blank">https://github.com/jedwards4b/ParallelIO/blob/25b471d5864db1cf7b8dfa26bd5d568eceba1a04/tests/performance/pioperformance.F90#L220-L227</a></div>
<div><br>
</div>
<div>Therefore, when an extra fixed-size variable is written, the aggregated</div>
<div>write buffer is noncontiguous, while in the other case it is contiguous.</div>
<div><br>
</div>
<div>When the write buffer is not contiguous, ROMIO allocates an internal buffer,</div>
<div>copies the data over, and uses it to perform communication. When the buffer</div>
<div>is contiguous, ROMIO uses the user buffer directly for communication.</div>
<div>Such copying can become expensive when the amount of data written is large.</div>
<div><br>
</div>
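<div>A minimal sketch in plain C of the two paths described above. This is not ROMIO's actual code: the <code>segment</code> type and the functions <code>is_contiguous</code> and <code>pick_send_buffer</code> are hypothetical names used only for illustration. It shows why a noncontiguous user buffer costs an extra allocation and a full copy, while a contiguous one can be used in place.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one piece of a user write buffer. */
typedef struct {
    const char *ptr;  /* start of the piece */
    size_t      len;  /* length in bytes */
} segment;

/* Returns 1 if each segment begins exactly where the previous one ends. */
static int is_contiguous(const segment *segs, int nsegs)
{
    for (int i = 1; i < nsegs; i++)
        if (segs[i].ptr != segs[i - 1].ptr + segs[i - 1].len)
            return 0;
    return 1;
}

/*
 * Returns the buffer to communicate from and stores the byte count in *total.
 * Contiguous input: the user's memory is returned as-is, *staging stays NULL.
 * Noncontiguous input: a staging buffer is allocated, every segment is copied
 * into it, and it is returned through *staging (the caller frees it).
 */
static const char *pick_send_buffer(const segment *segs, int nsegs,
                                    size_t *total, char **staging)
{
    *total = 0;
    *staging = NULL;
    for (int i = 0; i < nsegs; i++)
        *total += segs[i].len;

    if (nsegs == 0)
        return NULL;
    if (is_contiguous(segs, nsegs))
        return segs[0].ptr;              /* no copy needed */

    *staging = malloc(*total);
    assert(*staging != NULL);
    size_t off = 0;
    for (int i = 0; i < nsegs; i++) {    /* this copy grows with the write size */
        memcpy(*staging + off, segs[i].ptr, segs[i].len);
        off += segs[i].len;
    }
    return *staging;
}
```

<div>In this sketch, the record-variable buffers correspond to adjacent segments, and the separately allocated fixed-size variable is the segment that breaks contiguity and forces the copy.</div>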
<div>If you want to verify this finding, please try allocating the</div>
<div>buffers of individual record variables separately. Let me know how it goes.</div>
<div>
<div><br>
</div>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Aug 11, 2023, at 6:03 PM, Jim Edwards &lt;<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>&gt; wrote:</div>
<br>
<div>
<div dir="ltr">
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Hi Wei-keng,</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
For this case, I'm using a round-robin distribution, as shown here:<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
if(doftype .eq. 'ROUNDROBIN') then<br>
do i=1,varsize<br>
compmap(i) = (i-1)*npe+mype+1<br>
enddo</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
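<div>For readers more used to C, the Fortran loop above can be sketched as follows. The function name <code>roundrobin_compmap</code> and the 64-bit index type are assumptions made for this illustration, not part of PIO. Consecutive entries differ by <code>npe</code>, so each rank's elements are strided across the global array rather than stored next to each other.</div>

```c
#include <stddef.h>

/*
 * Fill compmap with the 1-based global indices owned by rank mype under a
 * round-robin decomposition across npe ranks. Mirrors the Fortran loop
 * compmap(i) = (i-1)*npe + mype + 1 for i = 1..varsize, using a 0-based i.
 */
static void roundrobin_compmap(long long *compmap, size_t varsize,
                               int npe, int mype)
{
    for (size_t i = 0; i < varsize; i++)
        compmap[i] = (long long)i * npe + mype + 1;
}
```

<div>With <code>npe = 4</code> and <code>mype = 1</code>, the first three entries are 2, 6, and 10.</div>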
</div>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<br>
</div>
</blockquote>
</div>
<br>
</div>
</body>
</html>