<div dir="ltr">This should generate an SSE2 binary:<br>
<br>
    'COPTFLAGS=-g',<br>
    'FOPTFLAGS=-g',<br>
<br>
This should generate a KNL binary:<br>
<br>
    'COPTFLAGS=-g -xMIC-AVX512 -O3',<br>
    'FOPTFLAGS=-g -xMIC-AVX512 -O3',<br>
<br>
This should generate an SSE2 binary that also supports CORE-AVX2 dispatch:<br>
<br>
    '--COPTFLAGS=-g -axcore-avx2',<br>
    '--FOPTFLAGS=-g -axcore-avx2',
<div><br></div><div>I don't see a good reason for the third option to fail.  Please report this bug to Intel.</div><div><br></div><div>You might also verify that this works:</div><div><br>
    '--COPTFLAGS=-g -xCORE-AVX2',<br>
    '--FOPTFLAGS=-g -xCORE-AVX2',<br>
<br>
In general, one should avoid compiling for SSE on KNL because of SSE-AVX transition penalties (google should find the details).  Are you trying to generate a single binary that is portable to ancient Core/Xeon and KNL?  I recommend that you use AVX (Sandy Bridge), preferably AVX2 (Haswell), as your oldest ISA target when generating a portable binary that includes KNL support.</div><div><br></div><div>Jeff<br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 10, 2018 at 2:23 PM, Satish Balay <span dir="ltr">&lt;<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I tried a few builds with:<br>
<br>
    '--with-64-bit-indices=1',<br>
    '--with-memalign=64',<br>
    '--with-blaslapack-dir=/home/<wbr>intel/18/compilers_and_<wbr>libraries_2018.0.128/linux/<wbr>mkl',<br>
    '--with-cc=icc',<br>
    '--with-fc=ifort',<br>
    '--with-cxx=0',<br>
    '--with-debugging=0',<br>
    '--with-mpi=0',<br>
<br>
And then changed the OPTFLAGS:<br>
<br>
1.  'basic -g' - works fine<br>
<br>
    'COPTFLAGS=-g',<br>
    'FOPTFLAGS=-g',<br>
<br>
2. 'avx512' - works fine<br>
<br>
    'COPTFLAGS=-g -xMIC-AVX512 -O3',<br>
    'FOPTFLAGS=-g -xMIC-AVX512 -O3',<br>
<br>
3. 'avx2' - breaks.<br>
<br>
    '--COPTFLAGS=-g -axcore-avx2',<br>
    '--FOPTFLAGS=-g -axcore-avx2',<br>
<br>
With a breakpoint at dmdavecrestorearrayf903_() in gdb, I see that the<br>
stack is fine during the first call to dmdavecrestorearrayf903_()<br>
but is corrupted by the time of the second call to<br>
dmdavecrestorearrayf903_() [i.e. ierr=0x7fffffffb4a0 changes to<br>
ierr=0x0]<br>
<br>
>>>>>>>>>><br>
<br>
Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 &lt;test_$DA1.0.1&gt;, v=0x6030c0 &lt;test_$VEC2.0.1&gt;, a=0x401abd &lt;test+2301&gt;,<br>
    ierr=0x7fffffffb4a0) at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<wbr>zda1f90.c:153<br>
153     {<br>
(gdb) where<br>
#0  dmdavecrestorearrayf903_ (da=0x603098 &lt;test_$DA1.0.1&gt;, v=0x6030c0 &lt;test_$VEC2.0.1&gt;, a=0x401abd &lt;test+2301&gt;, ierr=0x7fffffffb4a0)<br>
    at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<wbr>zda1f90.c:153<br>
#1  0x0000000000401abd in test () at ex1f.F90:80<br>
#2  0x00000000004011ae in main ()<br>
#3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6<br>
#4  0x00000000004010b9 in _start ()<br>
(gdb) c<br>
Continuing.<br>
<br>
Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 &lt;test_$DA1.0.1&gt;, v=0x6030b8 &lt;test_$VEC1.0.1&gt;, a=0x401ada &lt;test+2330&gt;, ierr=0x0)<br>
    at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<wbr>zda1f90.c:153<br>
153     {<br>
(gdb) where<br>
#0  dmdavecrestorearrayf903_ (da=0x603098 &lt;test_$DA1.0.1&gt;, v=0x6030b8 &lt;test_$VEC1.0.1&gt;, a=0x401ada &lt;test+2330&gt;, ierr=0x0)<br>
    at /home/petsc/petsc.barry-test/<wbr>src/dm/impls/da/f90-custom/<wbr>zda1f90.c:153<br>
#1  0x0000000000401ada in test () at ex1f.F90:81<br>
#2  0x00000000004011ae in main ()<br>
#3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6<br>
#4  0x00000000004010b9 in _start ()<br>
(gdb)<br>
<br>
>>>>>>>>><br>
<br>
It's not clear to me why this happens [and why it would work with -xMIC-AVX512 but break with -axcore-avx2].<br>
<br>
Perhaps Richard or Jeff has better insight on this.<br>
<br>
BTW: The above run is with:<br>
<br>
bash-4.2$ icc --version<br>
icc (ICC) 18.0.0 20170811<br>
<br>
Satish<br>
<br>
On Mon, 9 Apr 2018, Satish Balay wrote:<br>
<br>
> I'm able to reproduce this problem on a KNL box [with the attached test code]. But it goes away if I rebuild without the option --with-64-bit-indices.<br>
><br>
> Will have to check further..<br>
><br>
> Satish<br>
><br>
><br>
> On Thu, 5 Apr 2018, Randall Mackie wrote:<br>
><br>
> > Dear PETSc users,<br>
> ><br>
> > I’m curious if anyone else experiences problems using DMDAVecGetArrayF90 in conjunction with Intel compilers?<br>
> > We have had many problems (typically signal 11 SEGV segmentation violations) when PETSc is compiled in optimize mode (with various combinations of options).<br>
> > These same codes run valgrind clean with gfortran, so I assume this is an Intel bug, but before we submit a bug report I wanted to see if anyone else had similar experiences?<br>
> > We have basically gone back and replaced our calls to DMDAVecGetArrayF90 with calls to VecGetArrayF90, passing those pointers into a “local” subroutine, which works fine.<br>
> ><br>
> > In case anyone is curious, the attached test code shows this behavior when PETSc is compiled with the following options:<br>
> ><br>
> > ./configure \<br>
> >   --with-clean=1 \<br>
> >   --with-debugging=0 \<br>
> >   --with-fortran=1 \<br>
> >   --with-64-bit-indices \<br>
> >   --download-mpich=../mpich-3.<wbr>3a2.tar.gz \<br>
> >   --with-blas-lapack-dir=/opt/<wbr>intel/mkl \<br>
> >   --with-cc=icc \<br>
> >   --with-fc=ifort \<br>
> >   --with-cxx=icc \<br>
> >   --FOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \<br>
> >   --COPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \<br>
> >   --CXXOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \<br>
> ><br>
> ><br>
> ><br>
> > Thanks, Randy M.<br>
> ><br>
> ><br>
> </blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>
</div></div></div>