<div dir="ltr"><div dir="ltr">On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com">junchao.zhang@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>I found it is a NVIDIA C/C++ compiler bug. I can reproduce it with</div></div></blockquote><div><br></div><div>Great find!</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div style="color:rgb(0,0,0);font-family:Menlo,Monaco,"Courier New",monospace;font-size:14px;line-height:21px;white-space:pre-wrap"><div><span style="color:rgb(175,0,219)">#include</span><span style="color:rgb(0,0,255)"> </span><span style="color:rgb(163,21,21)"><stdlib.h></span></div><div><span style="color:rgb(175,0,219)">#include</span><span style="color:rgb(0,0,255)"> </span><span style="color:rgb(163,21,21)"><stdio.h></span><span style="color:rgb(163,21,21)"><br></span></div><div><span style="color:rgb(175,0,219)">#include</span><span style="color:rgb(0,0,255)"> </span><span style="color:rgb(163,21,21)"><complex.h></span></div><br><div><span style="color:rgb(0,0,255)">typedef</span> <span style="color:rgb(0,0,255)">double</span> <span style="color:rgb(0,0,255)">_Complex</span> PetscScalar;</div><div><span style="color:rgb(0,0,255)">typedef</span> <span style="color:rgb(0,0,255)">struct</span> {</div><div> <span style="color:rgb(0,0,255)">int</span> row;</div><div> PetscScalar *valaddr;</div><div>} MatEntry2;</div><br><div><span style="color:rgb(0,0,255)">int</span> <span style="color:rgb(121,94,38)">main</span>(<span style="color:rgb(0,0,255)">int</span> <span style="color:rgb(0,16,128)">arc</span>, <span style="color:rgb(0,0,255)">char</span>** <span style="color:rgb(0,16,128)">argv</span>)</div><div>{</div><div> <span 
style="color:rgb(0,0,255)">int</span> i=<span style="color:rgb(9,134,88)">2</span>;</div><div> MatEntry2 *Jentry2 = (MatEntry2*)<span style="color:rgb(121,94,38)">malloc</span>(<span style="color:rgb(9,134,88)">64</span>*<span style="color:rgb(0,0,255)">sizeof</span>(MatEntry2));</div><div> PetscScalar a=<span style="color:rgb(9,134,88)">1</span>, b=<span style="color:rgb(9,134,88)">1</span>;</div><div><br></div><div><div style="line-height:21px;white-space:pre-wrap"><div> <span style="color:rgb(121,94,38)">printf</span>(<span style="color:rgb(163,21,21)">"sizeof(MatEntry2)=</span><span style="color:rgb(0,16,128)">%lu</span><span style="color:rgb(238,0,0)">\n</span><span style="color:rgb(163,21,21)">"</span>,<span style="color:rgb(0,0,255)">sizeof</span>(MatEntry2));</div></div></div><div> <span style="color:rgb(0,16,128)">Jentry2</span>[<span style="color:rgb(9,134,88)">2</span>].<span style="color:rgb(0,16,128)">valaddr</span> = (PetscScalar*)<span style="color:rgb(121,94,38)">malloc</span>(<span style="color:rgb(9,134,88)">16</span>*<span style="color:rgb(0,0,255)">sizeof</span>(PetscScalar));</div><div> *(<span style="color:rgb(0,16,128)">Jentry2</span>[i].<span style="color:rgb(0,16,128)">valaddr</span>) = a*b; // Segfault</div><div><br></div><div> <span style="color:rgb(121,94,38)">free</span>(<span style="color:rgb(0,16,128)">Jentry2</span>[<span style="color:rgb(9,134,88)">2</span>].<span style="color:rgb(0,16,128)">valaddr</span>);</div><div> <span style="color:rgb(121,94,38)">free</span>(Jentry2);</div><div> <span style="color:rgb(175,0,219)">return</span> <span style="color:rgb(9,134,88)">0</span>;</div><div>}</div></div></div><div><br></div><font face="monospace">$ nvc -O0 -o test test.c<br>$ ./test<br>sizeof(MatEntry2)=16<br>Segmentation fault (core dumped)</font><div><br></div>If I change <font face="monospace">*(Jentry2[i].valaddr) = a*b;</font> to<br><br><font face="monospace">PetscScalar *p = Jentry2[2].valaddr;<br>*p = 
a*b;</font><div><br></div><div>Then the code works fine. Using -O0 to -O2 will also avoid this error for this simple test, but not for PETSc. In PETSc, I could apply the above silly trick, but I am not sure it is worth it. We should instead report it to NVIDIA.</div><div><br></div><div>Looking at the assembly code for the segfaulting line, we can see the problem:</div><font face="monospace"> movslq 52(%rsp), %rcx<br> movq 40(%rsp), %rax<br> movq 8(%rax,%rcx,8), %rax // Here %rax = Jentry2, %rcx = i. The instruction wrongly computes the address of <span style="color:rgb(0,16,128)">Jentry2</span>[i].<span style="color:rgb(0,16,128)">valaddr</span> as (%rax + %rcx*8)+8; it should instead be (%rax + %rcx*16)+8, since sizeof(MatEntry2) is 16<br> vmovsd %xmm1, 8(%rax)<br></font><div><font face="monospace"> vmovsd %xmm0, (%rax) </font></div><div><br></div><div>--Junchao Zhang<br></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi, Jon,<div> I could reproduce the error exactly. I will have a look.</div><div> Thanks for reporting.<br clear="all"><div><div dir="ltr"><div dir="ltr">--Junchao Zhang</div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <<a href="mailto:halverson@princeton.edu" target="_blank">halverson@princeton.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<span style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Hello,</span><br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
We are unable to build PETSc using the <span style="color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif;font-size:12pt">NVIDIA HPC SDK and
<span style="background-color:rgb(255,255,255);display:inline">
--with-scalar-type=complex</span>. Below is our procedure:</span></div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<p style="color:rgb(32,31,30);font-size:15px;margin:0px">
<span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black;font-variant-ligatures:no-common-ligatures">$ module load<span> </span></span><span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black;font-variant-ligatures:no-common-ligatures"><span style="margin:0px">nvhpc</span>/21.11</span></p>
<p style="color:rgb(32,31,30);font-size:15px;margin:0px">
<span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black">$ module load openmpi/<span style="margin:0px">nvhpc</span>-21.11/4.1.2/64</span></p>
<div style="margin:0px;font-size:15px;color:rgb(32,31,30);min-height:21px">
<span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black;font-variant-ligatures:no-common-ligatures">$ git clone -b release <a href="https://gitlab.com/petsc/petsc.git" target="_blank">https://gitlab.com/petsc/petsc.git</a> petsc; </span><span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black">cd
petsc</span><br>
<span style="margin:0px;font-variant-ligatures:no-common-ligatures"></span></div>
<p style="color:rgb(32,31,30);font-size:15px;margin:0px">
<span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black;font-variant-ligatures:no-common-ligatures">$ ./configure --with-debugging=1 --with-scalar-type=complex PETSC_ARCH=openmpi-power</span></p>
<p style="color:rgb(32,31,30);font-size:15px;margin:0px">
<span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black;font-variant-ligatures:no-common-ligatures">$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all</span></p>
<p style="color:rgb(32,31,30);font-size:15px;margin:0px">
<span style="margin:0px;font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:black;font-variant-ligatures:no-common-ligatures">$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check</span></p>
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
"make check" fails with a segmentation fault when running ex19. The Fortran test ex5f passes.</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<span style="background-color:rgb(255,255,255);display:inline">The procedure above fails on both x86_64 and POWER, each running RHEL8. It also fails with nvhpc 20.7.</span><br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
The procedure above works for "real" instead of "complex".</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
A "hello world" MPI code using a complex data type works with our nvhpc modules.</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
The procedure above works when GCC and an Open MPI library built with GCC are used.</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
The only trouble is the combination of PETSc with nvhpc and complex. Any known issues?</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
The build log for the procedure above is here:</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<a href="https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log" target="_blank">https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log</a><br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Jon</div>
</div>
</blockquote></div>
</blockquote></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>