<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 10, 2018 at 4:39 PM, Satish Balay <span dir="ltr"><<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Tue, 10 Apr 2018, Jeff Hammond wrote:<br>
<br>
> This should generate an SSE2 binary:<br>
><br>
> 'COPTFLAGS=-g',<br>
> 'FOPTFLAGS=-g',<br>
><br>
> This should generate a KNL binary:<br>
><br>
> 'COPTFLAGS=-g -xMIC-AVX512 -O3',<br>
> 'FOPTFLAGS=-g -xMIC-AVX512 -O3',<br>
><br>
> This should generate a SSE2 binary that also supports CORE-AVX2 dispatch.<br>
><br>
> '--COPTFLAGS=-g -axcore-avx2',<br>
> '--FOPTFLAGS=-g -axcore-avx2',<br>
><br>
> I don't see a good reason for the third option to fail. Please report this<br>
> bug to Intel.<br>
><br>
> You might also verify that this works:<br>
><br>
> '--COPTFLAGS=-g -xCORE-AVX2',<br>
> '--FOPTFLAGS=-g -xCORE-AVX2',<br>
<br>
</span>This fails the same way as -axcore-avx2<br>
<br></span></blockquote>">
<span class=""><br></span></blockquote><div><br></div><div>Can you try on a non-KNL host? It's a bug either way, but I want to determine whether the KNL host is the issue.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
><br>
> In general, one should avoid compiling for SSE on KNL, because SSE-AVX<br>
> transition penalties need to be avoided (google should find the details).<br>
> Are you trying to generate a single binary that is portable to ancient<br>
> Core/Xeon and KNL?<br>
<br>
</span>My usage here is to reproduce this issue reported by Randy - assumed the knl box we have is the easiest way..<br>
<span class="HOEnZb"><font color="#888888"><br></font></span></blockquote><div><br></div><div>Based only on what I see below, Randy doesn't seem to be reporting a KNL-specific issue. Is that incorrect?</div><div><br></div><div>I strongly recommend generating KNL-specific binaries for KNL, in which case the original issue should be investigated on non-KNL systems.</div><div><br></div><div>Again, there is clearly a bug here, but it helps to localize the problem as much as possible.</div><div><br></div><div>Jeff</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="HOEnZb"><font color="#888888">
Satish<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> I recommend that you use AVX (Sandy Bridge) -<br>
> preferably AVX2 (Haswell) - as your oldest ISA target when generating a<br>
> portable binary that includes KNL support.<br>
><br>
> Jeff<br>
><br>
> On Tue, Apr 10, 2018 at 2:23 PM, Satish Balay <<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>> wrote:<br>
><br>
> > I tried a few builds with:<br>
> ><br>
> > '--with-64-bit-indices=1',<br>
> > '--with-memalign=64',<br>
> > '--with-blaslapack-dir=/home/<wbr>intel/18/compilers_and_<br>
> > libraries_2018.0.128/linux/<wbr>mkl',<br>
> > '--with-cc=icc',<br>
> > '--with-fc=ifort',<br>
> > '--with-cxx=0',<br>
> > '--with-debugging=0',<br>
> > '--with-mpi=0',<br>
> ><br>
> > And then changed the OPTFLAGS:<br>
> ><br>
> > 1. 'basic -g' - works fine<br>
> ><br>
> > 'COPTFLAGS=-g',<br>
> > 'FOPTFLAGS=-g',<br>
> ><br>
> > 2. 'avx512' - works fine<br>
> ><br>
> > 'COPTFLAGS=-g -xMIC-AVX512 -O3',<br>
> > 'FOPTFLAGS=-g -xMIC-AVX512 -O3',<br>
> ><br>
> > 3. 'avx2' - breaks.<br>
> ><br>
> > '--COPTFLAGS=-g -axcore-avx2',<br>
> > '--FOPTFLAGS=-g -axcore-avx2',<br>
> ><br>
> > with a breakpoint at dmdavecrestorearrayf903_() in gdb - I see - the<br>
> > stack is fine during the first call to dmdavecrestorearrayf903_() -<br>
> > but is corrupted when it goes to the second call to<br>
> > dmdavecrestorearrayf903_() [i.e. ierr=0x7fffffffb4a0 changes to<br>
> > ierr=0x0]<br>
> ><br>
> > >>>>>>>>>><br>
> ><br>
> > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>,<br>
> > v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>,<br>
> > ierr=0x7fffffffb4a0) at /home/petsc/petsc.barry-test/<br>
> > src/dm/impls/da/f90-custom/<wbr>zda1f90.c:153<br>
> > 153 {<br>
> > (gdb) where<br>
> > #0 dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0<br>
> > <test_$VEC2.0.1>, a=0x401abd <test+2301>, ierr=0x7fffffffb4a0)<br>
> > at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<br>
> > zda1f90.c:153<br>
> > #1 0x0000000000401abd in test () at ex1f.F90:80<br>
> > #2 0x00000000004011ae in main ()<br>
> > #3 0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6<br>
> > #4 0x00000000004010b9 in _start ()<br>
> > (gdb) c<br>
> > Continuing.<br>
> ><br>
> > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>,<br>
> > v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)<br>
> > at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<br>
> > zda1f90.c:153<br>
> > 153 {<br>
> > (gdb) where<br>
> > #0 dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8<br>
> > <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)<br>
> > at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<br>
> > zda1f90.c:153<br>
> > #1 0x0000000000401ada in test () at ex1f.F90:81<br>
> > #2 0x00000000004011ae in main ()<br>
> > #3 0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6<br>
> > #4 0x00000000004010b9 in _start ()<br>
> > (gdb)<br>
> ><br>
> > >>>>>>>>><br>
> ><br>
> > It's not clear to me why this happens. [and why it would work with<br>
> > -xMIC-AVX512 but breaks with -axcore-avx2].<br>
> ><br>
> > Perhaps Richard, Jeff have better insight on this.<br>
> ><br>
> > BTW: The above run is with:<br>
> ><br>
> > bash-4.2$ icc --version<br>
> > icc (ICC) 18.0.0 20170811<br>
> ><br>
> > Satish<br>
> ><br>
> > On Mon, 9 Apr 2018, Satish Balay wrote:<br>
> ><br>
> > > I'm able to reproduce this problem on knl box [with the attached test<br>
> > code]. But it goes away if I rebuild without the option<br>
> > --with-64-bit-indices.<br>
> > ><br>
> > > Will have to check further..<br>
> > ><br>
> > > Satish<br>
> > ><br>
> > ><br>
> > > On Thu, 5 Apr 2018, Randall Mackie wrote:<br>
> > ><br>
> > > > Dear PETSc users,<br>
> > > ><br>
> > > > I’m curious if anyone else experiences problems using<br>
> > DMDAVecGetArrayF90 in conjunction with Intel compilers?<br>
> > > > We have had many problems (typically 11 SEGV segmentation violations)<br>
> > when PETSc is compiled in optimize mode (with various combinations of<br>
> > options).<br>
> > > > These same codes run valgrind clean with gfortran, so I assume this is<br>
> > an Intel bug, but before we submit a bug report I wanted to see if anyone<br>
> > else had similar experiences?<br>
> > > > We have basically gone back and replaced our calls to<br>
> > DMDAVecGetArrayF90 with calls to VecGetArrayF90 and pass those pointers<br>
> > into a “local” subroutine that works fine.<br>
> > > ><br>
> > > > In case anyone is curious, the attached test code shows this behavior<br>
> > when PETSc is compiled with the following options:<br>
> > > ><br>
> > > > ./configure \<br>
> > > > --with-clean=1 \<br>
> > > > --with-debugging=0 \<br>
> > > > --with-fortran=1 \<br>
> > > > --with-64-bit-indices \<br>
> > > > --download-mpich=../mpich-3.<wbr>3a2.tar.gz \<br>
> > > > --with-blas-lapack-dir=/opt/<wbr>intel/mkl \<br>
> > > > --with-cc=icc \<br>
> > > > --with-fc=ifort \<br>
> > > > --with-cxx=icc \<br>
> > > > --FOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \<br>
> > > > --COPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \<br>
> > > > --CXXOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \<br>
> > > ><br>
> > > ><br>
> > > ><br>
> > > > Thanks, Randy M.<br>
> > > ><br>
> > > ><br>
> > ><br>
><br>
><br>
><br>
><br>
> </div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>
</div></div>