<div dir="ltr">Just an update for the list: Steve seems to have narrowed it down to a compiler bug in the array assignment, as Matt suggested.<div>Mark<br><div><br></div><div><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Quick update: just checking ierr isn’t enough to avoid the compiler bug, but adding zero to n1 is.<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Compiling the code as sent:<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">    145, Loop not fused: function call before adjacent loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">         Generated vector sse code for the loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">         Generated 2 prefetch instructions for the loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">        …<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Line 
145 is the implied assignment loop for apar, which works in the test code. There is no output for the implied loop at the assignment a_n1 = n1.<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">If I add 0D0 to n1:<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">    140, Loop not fused: function call before adjacent loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">         Generated vector sse code for the loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">         Generated 3 prefetch instructions for the loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">         Generated vector sse code for the loop<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Line 140 is where I do the add. Note we’ve changed from 2 prefetch instructions to 3: it’s now issuing prefetches for all three assignments, where it used to do so for only the last two. 
Note that the output is correct in this case.<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Now, for the full code: as written in the repository (the buggy version)<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">scatter_to_xgc:<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2184, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2189, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2194, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2200, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 
0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Line 2200 is where the ‘n1’ assignment occurs, which is the only one that actually works. Lines 2204 and 2208, which are the broken apar and phi assignments, are curiously missing from the compiler output.<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">If I add in the extra error checks (which makes the code work):<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">scatter_to_xgc:<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2184, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2189, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2194, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2201, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p 
class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2206, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">   2211, Loop not vectorized/parallelized: contains call<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">Lines 2201, 2206, and 2211 are the n1, apar, and phi assignments respectively.<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">I’m 99% sure this is a compiler bug involving the assignments. I’ll still try compiling the full code with ‘-g’ and valgrind, but I’m pretty sure this is it.<u></u><u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri"><u></u> <u></u></span></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:12pt;font-family:"Times New Roman""><span style="font-size:11pt;font-family:Calibri">--steve</span></p></div><div class="gmail_extra"><br></div></div></div>