[petsc-dev] VecScatter problem

Mark Adams mfadams at lbl.gov
Fri Aug 26 11:44:12 CDT 2016


Just an update for the list: Steve seems to have narrowed it down to a
compiler bug in the array assignment, as Matt suggested.
Mark

Quick update: just checking ierr isn’t enough to avoid the compiler bug,
but adding zero to n1 is.



Compiling the code as sent:

    145, Loop not fused: function call before adjacent loop
         Generated vector sse code for the loop
         Generated 2 prefetch instructions for the loop
        …



Line 145 is the implied assignment loop for apar, which works in the test
code. There is no output for the implied loop at the assignment
a_n1 = n1.



If I add 0D0 to n1:

    140, Loop not fused: function call before adjacent loop
         Generated vector sse code for the loop
         Generated 3 prefetch instructions for the loop
         Generated vector sse code for the loop



Line 140 is where I do the add. Note we’ve changed from 2 prefetch
instructions to 3: it’s now issuing prefetches for all three assignments,
where it used to do so for just the last two. Note that the output is
correct in this case.
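For anyone following along, here is a minimal sketch of the "add 0D0" workaround described above. It is not the actual test code: the names n1 and a_n1 come from this thread, while the array length, the double precision type, and the program name are assumptions made for illustration only.

```fortran
program add_zero_workaround
  implicit none
  integer, parameter :: n = 8            ! hypothetical length, not from the thread
  double precision :: n1(n), a_n1(n)     ! names from the thread; type and shape assumed
  integer :: i

  ! Fill the source array with recognizable values.
  do i = 1, n
     n1(i) = dble(i)
  end do

  ! Plain whole-array copy, the form for which no loop was reported:
  !   a_n1 = n1

  ! Workaround from the thread: adding 0D0 makes the right-hand side an
  ! expression rather than a bare array reference.
  a_n1 = n1 + 0D0

  ! Sanity check that the copy actually happened.
  if (all(a_n1 == n1)) then
     print *, 'copy ok'
  else
     print *, 'copy mismatch'
  end if
end program add_zero_workaround
```

On a correct compiler both forms of the assignment are equivalent, so a small case like this may well not reproduce the bug. It only shows the shape of the change.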



Now for the full code, as written in the repository (the buggy version):



scatter_to_xgc:
   2184, Loop not vectorized/parallelized: contains call
   2189, Loop not vectorized/parallelized: contains call
   2194, Loop not vectorized/parallelized: contains call
   2200, Loop not vectorized/parallelized: contains call



Line 2200 is where the ‘n1’ assignment occurs, which is the only one that
actually works. Lines 2204 and 2208, which are the broken apar and phi
assignments, are curiously missing.



If I add in the extra error checks (which makes the code work):



scatter_to_xgc:
   2184, Loop not vectorized/parallelized: contains call
   2189, Loop not vectorized/parallelized: contains call
   2194, Loop not vectorized/parallelized: contains call
   2201, Loop not vectorized/parallelized: contains call
   2206, Loop not vectorized/parallelized: contains call
   2211, Loop not vectorized/parallelized: contains call



Lines 2201, 2206, and 2211 are the n1, apar, and phi assignments
respectively.



I’m 99% sure this is a compiler bug involving the assignments. I’ll still
try compiling the full code with ‘-g’ and valgrind, but I’m pretty sure
this is it.



--steve