<div dir="ltr"><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">I haven't looked at the pattern for this case, but I suspect that it does.</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">It's an MPAS hexagonal mesh grid. I'll look into whether this ROMIO fix is in a more</div><div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:#38761d">recent impi version; I'm currently using 19.0.9.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 28, 2022 at 9:52 AM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
Hi, Jim
<div><br>
</div>
<div>Thanks for the update.</div>
<div><br>
</div>
<div>I am wondering if your I/O pattern produces a large number</div>
<div>of noncontiguous file access requests in each MPI process.</div>
<div>Because ROMIO used MPI tags in its implementation of 2-phase I/O,</div>
<div>this pattern can result in a large number of MPI isend/irecv calls,</div>
<div>each using a unique MPI tag. The latest ROMIO has fixed this for</div>
<div>Lustre (<a href="https://github.com/pmodels/mpich/pull/5660" target="_blank">https://github.com/pmodels/mpich/pull/5660</a>).</div>
<div><br>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Jun 28, 2022, at 9:41 AM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
Hi Wei-Keng,</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I found the issue with help from TACC user support:<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<a href="https://www.intel.com/content/www/us/en/developer/articles/technical/large-mpi-tags-with-the-intel-mpi.html" target="_blank">https://www.intel.com/content/www/us/en/developer/articles/technical/large-mpi-tags-with-the-intel-mpi.html</a></div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I set the tag-related environment variables, as reported in the job output:<br>
Setting Environment MPIR_CVAR_CH4_OFI_RANK_BITS=15<br>
Setting Environment MPIR_CVAR_CH4_OFI_TAG_BITS=24<br>
and added a print statement:<br>
cam_restart.F90 123 Maximum tag value queried 8388607<br>
This appears to be working.<br>
</div>
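<div>The numbers in this thread can be sanity-checked with a little arithmetic. The following is a minimal sketch, not anything taken from Intel MPI or ROMIO: the helper name <code>max_tag_for_bits</code> is hypothetical, and the 20-bit comparison is only an assumption used to illustrate why a tag of 1048814 could be rejected. The thread itself does not state what the default tag width was.</div>
<pre>
```python
def max_tag_for_bits(bits):
    """Largest value that fits in `bits` unsigned bits (hypothetical helper)."""
    return 2 ** bits - 1


failing_tag = 1048814  # tag value from the "Invalid tag" error in the original report
reported_ub = 8388607  # "Maximum tag value queried" after setting the CVARs

# Number of bits needed to represent each value.
bits_for_failing_tag = failing_tag.bit_length()
bits_for_reported_ub = reported_ub.bit_length()

print(bits_for_failing_tag)  # 21
print(bits_for_reported_ub)  # 23

# The reported upper bound is exactly 2**23 - 1.
print(reported_ub == max_tag_for_bits(23))  # True

# ASSUMPTION for illustration only: a 20-bit tag field would cap tags at 1048575,
# which the failing tag exceeds.
print(failing_tag > max_tag_for_bits(20))  # True

# With the new upper bound, the previously failing tag is in range.
print(reported_ub >= failing_tag)  # True
```
</pre>
<div>Note that the queried bound corresponds to 23 bits even though MPIR_CVAR_CH4_OFI_TAG_BITS was set to 24; the thread does not explain the difference, so no reason is assumed here.</div>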
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Jun 21, 2022 at 7:25 PM Wei-Keng Liao <<a href="mailto:wkliao@northwestern.edu" target="_blank">wkliao@northwestern.edu</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>Hi, Jim
<div><br>
</div>
<div>Is this ncmpi_enddef call the first enddef after the file creation,</div>
<div>or does it follow an ncmpi_redef?</div>
<div><br>
</div>
<div>In the former case, there is no MPI communication in PnetCDF, except</div>
<div>for an MPI_Barrier. In the latter case, if the file header size expands,</div>
<div>existing variables need to be moved to higher offsets, which requires</div>
<div>PnetCDF to call MPI collective reads and writes and thus leads to MPI_Issend calls.</div>
<div><br>
</div>
<div>Can you try to get a core dump so we can trace the call stack?</div>
<div><br>
</div>
<div>You can also enable PnetCDF safe mode, which makes additional MPI</div>
<div>communication calls for debugging purposes. Sometimes it helps narrow</div>
<div>down the cause of the problem. It can be enabled by setting the environment</div>
<div>variable PNETCDF_SAFE_MODE to 1.</div>
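<div>As a rough illustration of the suggestion above, a job wrapper script could add the variable to the environment it passes to the application. This is only a sketch: the function name <code>build_debug_env</code> is hypothetical, and whether the MPI launcher forwards the variable to every rank depends on the launcher and site configuration.</div>
<pre>
```python
import os


def build_debug_env(base_env=None):
    """Return a copy of the environment with PnetCDF safe mode enabled.

    `base_env` defaults to the current process environment; the caller's
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PNETCDF_SAFE_MODE"] = "1"  # variable name and value as given in the thread
    return env


# Example with a small stand-in environment (the PATH value is illustrative).
env = build_debug_env({"PATH": "/usr/bin"})
print(env["PNETCDF_SAFE_MODE"])  # 1
```
</pre>
<div>The returned mapping could then be handed to something like <code>subprocess.run(..., env=env)</code> when starting the job.</div>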
<div><br>
<div>Wei-keng </div>
<div><br>
<blockquote type="cite">
<div>On Jun 21, 2022, at 5:03 PM, Jim Edwards <<a href="mailto:jedwards@ucar.edu" target="_blank">jedwards@ucar.edu</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I am using pnetcdf 1.12.3 and getting an error when it is compiled with intel/19.1.1 and impi/19.0.9 on the TACC Frontera system.</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I am getting very little information to guide me in debugging the error. <br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
[785] Abort(634628) on node 785 (rank 785 in comm 0): Fatal error in PMPI_Issend: Invalid tag, error stack:<br>
[785] PMPI_Issend(156): MPI_Issend(buf=0x2b5c81edf40f, count=1025120, MPI_BYTE, dest=0, tag=1048814, comm=0xc40000d7, request=0x7f2002783540) failed<br>
[785] PMPI_Issend(95).: Invalid tag, value is 1048814<br>
TACC: MPI job exited with code: 4 <br>
TACC: Shutdown complete. Exiting. <br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
<br>
</div>
<div class="gmail_default" style="font-family:comic sans ms,sans-serif;color:rgb(56,118,29)">
I can tell that I am in a call to ncmpi_enddef, but I am not getting anything beyond that. Any ideas? <br>
</div>
<br>
-- <br>
<div dir="ltr">
<div dir="ltr">
<div>
<div>
<div>Jim Edwards<br>
<br>
</div>
<font size="1">CESM Software Engineer<br>
</font></div>
<font size="1">National Center for Atmospheric Research<br>
</font></div>
<font size="1">Boulder, CO</font> <br>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote></div>