[petsc-dev] VecScatter problem
    Mark Adams 
    mfadams at lbl.gov
       
    Fri Aug 26 11:44:12 CDT 2016
    
    
  
Just an update for the list: Steve seems to have narrowed it down to a
compiler bug in the array assignment, as Matt suggested.
Mark
Quick update: just checking ierr isn’t enough to avoid the compiler bug,
but adding zero to n1 is.
Compiling the code as sent:
    145, Loop not fused: function call before adjacent loop
         Generated vector sse code for the loop
         Generated 2 prefetch instructions for the loop
        …
Line 145 is the implied assignment loop for apar, which works in the test
code. There is no output for the implied loop at the assignment a_n1 = n1.
If I add 0D0 to n1:
    140, Loop not fused: function call before adjacent loop
         Generated vector sse code for the loop
         Generated 3 prefetch instructions for the loop
         Generated vector sse code for the loop
Line 140 is where I do the add. Note we’ve changed from 2 prefetch
instructions to 3: it’s issuing prefetches for all three assignments,
where it used to do so for just the last two. Note that the output is
correct in this case.
Now, for the full code as written in the repository (the buggy version):
scatter_to_xgc:
   2184, Loop not vectorized/parallelized: contains call
   2189, Loop not vectorized/parallelized: contains call
   2194, Loop not vectorized/parallelized: contains call
   2200, Loop not vectorized/parallelized: contains call
Line 2200 is where the ‘n1’ assignment occurs, which is the only one that
actually works. Lines 2204 and 2208, which are the broken apar and phi
assignments, are curiously missing.
If I add in the extra error checks (which make the code work):
scatter_to_xgc:
   2184, Loop not vectorized/parallelized: contains call
   2189, Loop not vectorized/parallelized: contains call
   2194, Loop not vectorized/parallelized: contains call
   2201, Loop not vectorized/parallelized: contains call
   2206, Loop not vectorized/parallelized: contains call
   2211, Loop not vectorized/parallelized: contains call
Lines 2201, 2206, and 2211 are the n1, apar, and phi assignments
respectively.
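
The comparison of the two listings above can also be done mechanically. The following is only a minimal sketch in Python, assuming every feedback entry has the form `<line>, <message>` as in the listings quoted above. The helper name `reported_lines` is hypothetical and not part of any compiler tooling. Because the extra error checks shift the source line numbers between builds, the sketch compares how many loops are reported rather than matching line numbers directly.

```python
import re

# A feedback entry looks like:
#    2184, Loop not vectorized/parallelized: contains call
# Lines without a leading source line number (e.g. the routine name) are skipped.
ENTRY = re.compile(r"^\s*(\d+),\s*(.+)$")

def reported_lines(feedback):
    """Return the source line numbers that have a feedback entry (hypothetical helper)."""
    lines = []
    for text in feedback.splitlines():
        match = ENTRY.match(text)
        if match:
            lines.append(int(match.group(1)))
    return lines

# Feedback for the buggy build, copied from the listing above.
BUGGY = """\
scatter_to_xgc:
   2184, Loop not vectorized/parallelized: contains call
   2189, Loop not vectorized/parallelized: contains call
   2194, Loop not vectorized/parallelized: contains call
   2200, Loop not vectorized/parallelized: contains call
"""

# Feedback for the build with the extra error checks, copied from the listing above.
FIXED = """\
scatter_to_xgc:
   2184, Loop not vectorized/parallelized: contains call
   2189, Loop not vectorized/parallelized: contains call
   2194, Loop not vectorized/parallelized: contains call
   2201, Loop not vectorized/parallelized: contains call
   2206, Loop not vectorized/parallelized: contains call
   2211, Loop not vectorized/parallelized: contains call
"""

buggy = reported_lines(BUGGY)
fixed = reported_lines(FIXED)
print(len(buggy), len(fixed))  # prints "4 6"
print(len(fixed) - len(buggy), "loops have no feedback in the buggy build")
```

A difference of two matches the two assignments (apar and phi) that have no entry in the buggy build.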
I’m 99% sure this is a compiler bug involving the assignments. I’ll still
try compiling the full code with ‘-g’ and valgrind, but I’m pretty sure
this is it.
--steve