<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div><div><div><blockquote type="cite" class=""><div class=""><blockquote type="cite" cite="mid:260581B5-04AC-4B8D-8A91-236401709785@joliv.et" class=""><div class="">2) I can reproduce the src/mat/tests/ex242.c error (which explicitly uses ScaLAPACK, none of the above PC uses it explicitly, except PCBDDC/PCHPDDM when using MUMPS on “big” problems where root nodes are factorized using ScaLAPACK, see -mat_mumps_icntl_13)</div><div class="">3) I’m seeing that both on your machine and mine, PETSc BuildSystem insist on linking libmkl_blacs_intelmpi_lp64.so even though we supply explicitly libmkl_blacs_openmpi_lp64.so</div><div class="">This for example yields a wrong Makefile.inc for MUMPS:</div><div class="">$ cat arch-linux2-c-opt-ompi/externalpackages/MUMPS_5.3.5/Makefile.inc|grep blacs</div><div class=""><div class="">SCALAP  = […] -lmkl_blacs_openmpi_lp64</div><div class="">LIBBLAS = […] -lmkl_blacs_intelmpi_lp64 -lgomp -ldl -lpthread -lm […]</div></div><div class=""><br class=""></div><div class="">Despite what Barry says, I think PETSc is partially to blame as well (why use libmkl_blacs_intelmpi_lp64.so even though BuildSystem is capable of detecting we are using OpenMPI).</div><div class="">I’ll try to fix this to see if it solves 2).</div></blockquote><p class="">Okay, that's a very nice finding!!!  Hope it will be "fixable" easily!</p><div class=""><br class=""></div></div></blockquote>The knowledge is there but the information may not be trivially available to make the right decisions. Parts of the BLAS/LAPACK checks use the "check everything" approach. 
For example</div><div><br class=""></div><div><div># Look for Multi-Threaded MKL for MKL_C/Pardiso</div><div>      useCPardiso=0</div><div>      usePardiso=0</div><div>      if self.argDB['with-mkl_cpardiso'] or 'with-mkl_cpardiso-dir' in self.argDB or 'with-mkl_cpardiso-lib' in self.argDB:</div><div>        useCPardiso=1</div><div>        mkl_blacs_64=[['mkl_blacs_intelmpi'+ILP64+''],['mkl_blacs_mpich'+ILP64+''],['mkl_blacs_sgimpt'+ILP64+''],['mkl_blacs_openmpi'+ILP64+'']]</div><div>        mkl_blacs_32=[['mkl_blacs_intelmpi'],['mkl_blacs_mpich'],['mkl_blacs_sgimpt'],['mkl_blacs_openmpi']]</div><div>      elif self.argDB['with-mkl_pardiso'] or 'with-mkl_pardiso-dir' in self.argDB or 'with-mkl_pardiso-lib' in self.argDB:</div><div>        usePardiso=1</div><div>        mkl_blacs_64=[[]]</div><div>        mkl_blacs_32=[[]]</div><div>      if useCPardiso or usePardiso:</div><div>        self.logPrintBox('BLASLAPACK: Looking for Multithreaded MKL for C/Pardiso')</div><div>        for libdir in [os.path.join('lib','64'),os.path.join('lib','ia64'),os.path.join('lib','em64t'),os.path.join('lib','intel64'),'lib','64','ia64','em64t','intel64',</div><div>                       os.path.join('lib','32'),os.path.join('lib','ia32'),'32','ia32','']:</div><div>          if not os.path.exists(os.path.join(dir,libdir)):</div><div>            self.logPrint('MKL Path not found.. skipping: '+os.path.join(dir,libdir))</div><div>          else:</div><div>            self.log.write('Files and directories in that directory:\n'+str(os.listdir(os.path.join(dir,libdir)))+'\n')</div><div>            #  iomp5 is provided by the Intel compilers on MacOS. 
Run source /opt/intel/bin/compilervars.sh intel64 to have it added to LIBRARY_PATH</div><div>            #  then locate libimp5.dylib in the LIBRARY_PATH and copy it to os.path.join(dir,libdir)</div><div>            for i in mkl_blacs_64:</div><div>              yield ('User specified MKL-C/Pardiso Intel-Linux64', None, [os.path.join(dir,libdir,'libmkl_intel'+ILP64+'.a'),'mkl_core','mkl_intel_thread']+i+['iomp5','dl','pthread'],known,'yes')</div><div>              yield ('User specified MKL-C/Pardiso GNU-Linux64', None, [os.path.join(dir,libdir,'libmkl_intel'+ILP64+'.a'),'mkl_core','mkl_gnu_thread']+i+['gomp','dl','pthread'],known,'yes')</div><div>              yield ('User specified MKL-Pardiso Intel-Windows64', None, [os.path.join(dir,libdir,'mkl_core.lib'),'mkl_intel'+ILP64+'.lib','mkl_intel_thread.lib']+i+['libiomp5md.lib'],known,'yes')</div><div>            for i in mkl_blacs_32:</div><div>              yield ('User specified MKL-C/Pardiso Intel-Linux32', None, [os.path.join(dir,libdir,'libmkl_intel.a'),'mkl_core','mkl_intel_thread']+i+['iomp5','dl','pthread'],'32','yes')</div><div>              yield ('User specified MKL-C/Pardiso GNU-Linux32', None, [os.path.join(dir,libdir,'libmkl_intel.a'),'mkl_core','mkl_gnu_thread']+i+['gomp','dl','pthread'],'32','yes')</div><div>              yield ('User specified MKL-Pardiso Intel-Windows32', None, [os.path.join(dir,libdir,'mkl_core.lib'),'mkl_intel_c.lib','mkl_intel_thread.lib']+i+['libiomp5md.lib'],'32','yes')</div><div>        return</div><div><br class=""></div><div>The assumption is that the link will fail unless the correct libraries are in the list. But apparently this is not the case; it returns the first case that links but that case does not run which is why it appears to be producing "silly" results.</div><div><br class=""></div><div>If you set the right MPI and threading library, at these locations instead of trying all of them it might resolve the problems. 
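<div><br class=""></div><div>A minimal sketch of that idea, assuming the MPI flavor and the compiler family have already been detected elsewhere in configure. The helper names blacs_candidates and thread_layer, and the flavor strings they accept, are hypothetical and not part of BuildSystem; the BLACS, threading, and OpenMP runtime library names are the ones that appear in the lists in this message:</div>
<pre>
```python
# Hypothetical helpers, not existing PETSc BuildSystem code.
# The flavor strings below are assumed to come from an MPI detection step.
KNOWN_MPI_FLAVORS = ['intelmpi', 'mpich', 'sgimpt', 'openmpi']


def blacs_candidates(mpi_flavor, ilp64_suffix=''):
    """Return the BLACS candidate lists to try for one MPI flavor.

    A known flavor gives exactly one candidate. An unknown flavor falls
    back to every library, which is the current check-everything approach.
    """
    if mpi_flavor in KNOWN_MPI_FLAVORS:
        flavors = [mpi_flavor]
    else:
        flavors = KNOWN_MPI_FLAVORS
    return [['mkl_blacs_' + flavor + ilp64_suffix] for flavor in flavors]


def thread_layer(openmp_found, gnu_compiler):
    """Return (MKL threading library, OpenMP runtime) for the toolchain."""
    if not openmp_found:
        # mkl_sequential is the non-threaded MKL layer; no OpenMP runtime.
        return 'mkl_sequential', None
    if gnu_compiler:
        return 'mkl_gnu_thread', 'gomp'
    return 'mkl_intel_thread', 'iomp5'


if __name__ == '__main__':
    print(blacs_candidates('openmpi', '_lp64'))
    print(thread_layer(openmp_found=True, gnu_compiler=True))
```
</pre>
<div>With a known flavor, only one BLACS library is offered to the link test, so a library that links but belongs to a different MPI cannot be picked first. An unknown flavor keeps the existing behaviour.</div>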
</div><div><br class=""></div><div><div>    if self.openmp.found:</div><div>      ITHREAD='intel_thread'</div><div>      ITHREADGNU='gnu_thread'</div><div>      ompthread = 'yes'</div><div>    else:</div><div>      ITHREAD='sequential'</div><div>      ITHREADGNU='sequential'</div><div>      ompthread = 'no'</div><div><br class=""></div><div><div>        mkl_blacs_64=[['mkl_blacs_intelmpi'+ILP64+''],['mkl_blacs_mpich'+ILP64+''],['mkl_blacs_sgimpt'+ILP64+''],['mkl_blacs_openmpi'+ILP64+'']]</div><div>        mkl_blacs_32=[['mkl_blacs_intelmpi'],['mkl_blacs_mpich'],['mkl_blacs_sgimpt'],['mkl_blacs_openmpi']]</div></div></div><div><br class=""></div><div><br class=""></div><div><br class=""></div><div><br class=""></div><blockquote type="cite" class=""><div class=""><br class="webkit-block-placeholder"></div></blockquote><br class=""></div></div><div><br class=""><blockquote type="cite" class=""><div class="">On Mar 3, 2021, at 8:22 AM, Eric Chamberland <<a href="mailto:Eric.Chamberland@giref.ulaval.ca" class="">Eric.Chamberland@giref.ulaval.ca</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
  
  
  <div class=""><p class="">Hi Pierre,</p>
    <div class="moz-cite-prefix">On 2021-03-03 2:42 a.m., Pierre Jolivet
      wrote:<br class="">
    </div>
    <blockquote type="cite" cite="mid:260581B5-04AC-4B8D-8A91-236401709785@joliv.et" class="">
      <blockquote type="cite" class="">
        <div class=""><p class="">If it turns out that there is a problem combining MKL +
            OpenMP that depends on the link configuration, for example,
            would it be a good thing to have this (--with-openmp=1)
            tested in the pipelines (with external packages, of
            course)?</p>
        </div>
      </blockquote>
      <div class="">As Barry said, there is not much (if any) OpenMP in
        PETSc.</div>
      <div class="">There are, however, some workers with MKL (+ Intel
        compilers) turned on, but I don’t think we test MKL + GNU
        compilers (which I feel is a very niche combination, hence
        not really worth testing, IMHO).</div>
    </blockquote><p class="">Ouch, this is almost my personal working configuration, and that of
      most of our users too... and it worked well until I activated the
      OpenMP thing...<br class="">
    </p><p class="">We had good reasons to work with g++ or clang++ instead of Intel
      compilers:</p><p class="">- You have to pay to use an Intel compiler (I haven't
      looked at oneAPI licensing yet, so this may have changed)<br class="">
    </p><p class="">- No support of Intel compilers with iceccd (slow recompilation)<br class="">
    </p><p class="">- MKL was freely distributed, so it can be used with any compiler</p><p class="">That doesn't mean we don't want to use the Intel compiler, but maybe
      we just want to do a specific delivery with it but continue to
      develop with g++ or clang++ (my personal choice).</p><p class="">But I understand it is less straightforward to combine gcc and
      MKL than to use the native Intel toolchain...<br class=""></p></div></div></blockquote><div><br class=""></div>    I agree that the MKL + GNU compilers combination is commonly used and should be tested and maintained in PETSc. <br class=""><blockquote type="cite" class=""><div class=""><div class=""><p class="">
    </p>
    <blockquote type="cite" cite="mid:260581B5-04AC-4B8D-8A91-236401709785@joliv.et" class="">
      <blockquote type="cite" class="">
        <div class=""><p class="">Do the people who maintain all these libs
            read petsc-dev? ;)</p>
        </div>
      </blockquote>
      <div class="">I don’t think they are, but don’t worry, we do
        forward the appropriate messages to them :)</div>
    </blockquote>
    :)<br class="">
    <blockquote type="cite" cite="mid:260581B5-04AC-4B8D-8A91-236401709785@joliv.et" class="">
      <div class=""><br class="">
      </div>
      About yesterday’s failures…
      <div class="">1) I cannot reproduce any of the
        PCHYPRE/PCBDDC/PCHPDDM errors (sorry I didn’t bother putting the
        SuperLU_DIST tarball on my cluster)</div>
    </blockquote><p class="">Hmmm, maybe my environment variables play a role in this?</p><p class="">For comparison purposes, we explicitly set:</p><p class="">export MKL_CBWR=COMPATIBLE<br class="">
      export MKL_NUM_THREADS=1<br class="">
    </p><p class="">but it would be surprising if they helped reproduce a problem: they
      usually stabilize results...<br class="">
    </p>
    </div></div></blockquote></div><div><blockquote type="cite" class=""><div class=""><div class=""><p class="">Merci,</p><p class="">Eric<br class="">
    </p>
    <blockquote type="cite" cite="mid:260581B5-04AC-4B8D-8A91-236401709785@joliv.et" class="">
      <div class=""><br class="">
      </div>
      <div class="">Thanks,</div>
      <div class="">Pierre</div>
      <div class=""><br class="">
      </div>
      <div class=""><a href="http://joliv.et/irene-rome-configure.log" class="" moz-do-not-send="true">http://joliv.et/irene-rome-configure.log</a></div>
      <div class="">
        <div class="">$ /usr/bin/gmake -f gmakefile test test-fail=1</div>
        <div class="">Using MAKEFLAGS: test-fail=1</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_baij.counts</div>
        <div class=""> ok snes_tutorials-ex12_quad_hpddm_reuse_baij</div>
        <div class=""> ok diff-snes_tutorials-ex12_quad_hpddm_reuse_baij</div>
        <div class="">        TEST
          arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex50_tut_2.counts</div>
        <div class=""> ok ksp_ksp_tutorials-ex50_tut_2 # SKIP
          PETSC_HAVE_SUPERLU_DIST requirement not met</div>
        <div class="">        TEST
          arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_hypre.counts</div>
        <div class=""> ok snes_tutorials-ex56_hypre</div>
        <div class=""> ok diff-snes_tutorials-ex56_hypre</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_elas.counts</div>
        <div class=""> ok snes_tutorials-ex17_3d_q3_trig_elas</div>
        <div class=""> ok diff-snes_tutorials-ex17_3d_q3_trig_elas</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij.counts</div>
        <div class=""> ok
          snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij</div>
        <div class=""> ok
          diff-snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_tri_parmetis_hpddm_baij.counts</div>
        <div class=""> ok snes_tutorials-ex12_tri_parmetis_hpddm_baij</div>
        <div class=""> ok
          diff-snes_tutorials-ex12_tri_parmetis_hpddm_baij</div>
        <div class="">        TEST
          arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_tut_3.counts</div>
        <div class=""> ok snes_tutorials-ex19_tut_3</div>
        <div class=""> ok diff-snes_tutorials-ex19_tut_3</div>
        <div class="">        TEST
          arch-linux2-c-opt-ompi/tests/counts/mat_tests-ex242_3.counts</div>
        <div class="">not ok mat_tests-ex242_3 # Error code: 137</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR:
          ------------------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: Caught signal number 11 SEGV: Segmentation Violation,
          probably memory access out of range</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: Try option -start_in_debugger or
          -on_error_attach_debugger</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: or see <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind" class="" moz-do-not-send="true">https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind</a></div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: or try <a href="http://valgrind.org/" class="" moz-do-not-send="true">http://valgrind.org</a> on GNU/linux
          and Apple Mac OS X to find memory corruption errors</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: configure using --with-debugging=yes, recompile, link,
          and run</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: to get more information on the crash.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: --------------------- Error Message
          --------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: Signal received</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: See <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html" class="" moz-do-not-send="true">https://www.mcs.anl.gov/petsc/documentation/faq.html</a>
          for trouble shooting.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: Petsc Development GIT revision: v3.14.4-733-g7ab9467ef9
           GIT Date: 2021-03-02 16:15:11 +0000</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR:
          ------------------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: Caught signal number 11 SEGV: Segmentation Violation,
          probably memory access out of range</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: Try option -start_in_debugger or
          -on_error_attach_debugger</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: or see <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind" class="" moz-do-not-send="true">https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind</a></div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: or try <a href="http://valgrind.org/" class="" moz-do-not-send="true">http://valgrind.org</a> on GNU/linux
          and Apple Mac OS X to find memory corruption errors</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: configure using --with-debugging=yes, recompile, link,
          and run</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: to get more information on the crash.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: --------------------- Error Message
          --------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: Signal received</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: See <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html" class="" moz-do-not-send="true">https://www.mcs.anl.gov/petsc/documentation/faq.html</a>
          for trouble shooting.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: Petsc Development GIT revision: v3.14.4-733-g7ab9467ef9
           GIT Date: 2021-03-02 16:15:11 +0000</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR:
/ccc/work/cont003/rndm/rndm/petsc/arch-linux2-c-opt-ompi/tests/mat/tests/runex242_3/../ex242
          on a arch-linux2-c-opt-ompi named irene4047 by jolivetp Wed
          Mar  3 08:21:20 2021</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: Configure options --download-hpddm
          --download-hpddm-commit=origin/main --download-hypre
          --download-metis --download-mumps --download-parmetis
          --download-ptscotch --download-slepc
          --download-slepc-commit=origin/main --download-tetgen
          --known-mpi-c-double-complex --known-mpi-int64_t
          --known-mpi-long-double --with-avx512-kernels=1
--with-blaslapack-dir=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/lib/intel64
          --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0
          --with-fc=mpifort --with-fortran-bindings=0 --with-make-np=40
--with-mkl_cpardiso-dir=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281
          --with-mkl_cpardiso=1
--with-mkl_pardiso-dir=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl
          --with-mkl_pardiso=1 --with-mpiexec=ccc_mprun --with-openmp=1
--with-packages-download-dir=/ccc/cont003/home/enseeiht/jolivetp/Dude/externalpackages/
--with-scalapack-include=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/include
--with-scalapack-lib="[/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/lib/intel64/libmkl_scalapack_lp64.so,/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so]"
          --with-scalar-type=real --with-x=0 COPTFLAGS="-O3 -fp-model
          fast -mavx2" CXXOPTFLAGS="-O3 -fp-model fast -mavx2"
          FOPTFLAGS="-O3 -fp-model fast -mavx2"
          PETSC_ARCH=arch-linux2-c-opt-ompi</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: #1 User provided function() line 0 in  unknown file</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[2]PETSC
          ERROR: Run with -malloc_debug to check if memory corruption is
          causing the crash.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR:
/ccc/work/cont003/rndm/rndm/petsc/arch-linux2-c-opt-ompi/tests/mat/tests/runex242_3/../ex242
          on a arch-linux2-c-opt-ompi named irene4047 by jolivetp Wed
          Mar  3 08:21:20 2021</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: Configure options --download-hpddm
          --download-hpddm-commit=origin/main --download-hypre
          --download-metis --download-mumps --download-parmetis
          --download-ptscotch --download-slepc
          --download-slepc-commit=origin/main --download-tetgen
          --known-mpi-c-double-complex --known-mpi-int64_t
          --known-mpi-long-double --with-avx512-kernels=1
--with-blaslapack-dir=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/lib/intel64
          --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0
          --with-fc=mpifort --with-fortran-bindings=0 --with-make-np=40
--with-mkl_cpardiso-dir=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281
          --with-mkl_cpardiso=1
--with-mkl_pardiso-dir=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl
          --with-mkl_pardiso=1 --with-mpiexec=ccc_mprun --with-openmp=1
--with-packages-download-dir=/ccc/cont003/home/enseeiht/jolivetp/Dude/externalpackages/
--with-scalapack-include=/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/include
--with-scalapack-lib="[/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/lib/intel64/libmkl_scalapack_lp64.so,/ccc/products/mkl-19.0.5.281/intel--19.0.5.281__openmpi--4.0.1/default/19.0.5.281/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so]"
          --with-scalar-type=real --with-x=0 COPTFLAGS="-O3 -fp-model
          fast -mavx2" CXXOPTFLAGS="-O3 -fp-model fast -mavx2"
          FOPTFLAGS="-O3 -fp-model fast -mavx2"
          PETSC_ARCH=arch-linux2-c-opt-ompi</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: #1 User provided function() line 0 in  unknown file</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>[1]PETSC
          ERROR: Run with -malloc_debug to check if memory corruption is
          causing the crash.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>--------------------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>MPI_ABORT
          was invoked on rank 2 in communicator MPI_COMM_WORLD</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>with
          errorcode 50176059.</div>
        <div class="">#</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>NOTE:
          invoking MPI_ABORT causes Open MPI to kill all MPI processes.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>You
          may or may not see output from other processes, depending on</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>exactly
          when Open MPI kills them.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>--------------------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>--------------------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>MPI_ABORT
          was invoked on rank 1 in communicator MPI_COMM_WORLD</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>with
          errorcode 50176059.</div>
        <div class="">#</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>NOTE:
          invoking MPI_ABORT causes Open MPI to kill all MPI processes.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>You
          may or may not see output from other processes, depending on</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>exactly
          when Open MPI kills them.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>--------------------------------------------------------------------------</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>srun:
          Job step aborted: Waiting up to 302 seconds for job step to
          finish.</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>slurmstepd-irene4047:
          error: *** STEP 1374176.36 ON irene4047 CANCELLED AT
          2021-03-03T08:21:20 ***</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>srun:
          error: irene4047: task 0: Killed</div>
        <div class="">#<span class="Apple-tab-span" style="white-space:pre">  </span>srun:
          error: irene4047: tasks 1-2: Exited with exit code 16</div>
        <div class=""> ok mat_tests-ex242_3 # SKIP Command failed so no
          diff</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_vlap.counts</div>
        <div class=""> ok snes_tutorials-ex17_3d_q3_trig_vlap</div>
        <div class=""> ok diff-snes_tutorials-ex17_3d_q3_trig_vlap</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre.counts</div>
        <div class=""> ok
          snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre</div>
        <div class=""> ok
          diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex49_hypre_nullspace.counts</div>
        <div class=""> ok ksp_ksp_tutorials-ex49_hypre_nullspace</div>
        <div class=""> ok diff-ksp_ksp_tutorials-ex49_hypre_nullspace</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/ts_tutorials-ex18_p1p1_xper_ref.counts</div>
        <div class=""> ok ts_tutorials-ex18_p1p1_xper_ref</div>
        <div class=""> ok diff-ts_tutorials-ex18_p1p1_xper_ref</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/ts_tutorials-ex18_p1p1_xyper_ref.counts</div>
        <div class=""> ok ts_tutorials-ex18_p1p1_xyper_ref</div>
        <div class=""> ok diff-ts_tutorials-ex18_p1p1_xyper_ref</div>
        <div class="">        TEST
arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre.counts</div>
        <div class=""> ok
          snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre</div>
        <div class=""> ok
          diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre</div>
        <div class="">        TEST
          arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex64_1.counts</div>
        <div class=""> ok ksp_ksp_tutorials-ex64_1 # SKIP
          PETSC_HAVE_SUPERLU_DIST requirement not met</div>
        <div class=""><br class="">
          <blockquote type="cite" class="">
            <div class="">On 3 Mar 2021, at 6:21 AM, Eric Chamberland
              <<a href="mailto:Eric.Chamberland@giref.ulaval.ca" class="" moz-do-not-send="true">Eric.Chamberland@giref.ulaval.ca</a>>
              wrote:</div>
            <br class="Apple-interchange-newline">
            <div class="">
              <div class=""><p class="">Just started a discussion on the side:</p><p class=""><a class="moz-txt-link-freetext" href="https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-Link-Line-Advisor-as-external-tool/m-p/1260895#M30974" moz-do-not-send="true">https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-Link-Line-Advisor-as-external-tool/m-p/1260895#M30974</a></p><p class="">Eric<br class="">
                </p>
                <div class="moz-cite-prefix">On 2021-03-02 3:50 p.m.,
                  Pierre Jolivet wrote:<br class="">
                </div>
                <blockquote type="cite" cite="mid:FF313618-3465-474B-B13E-A7FDA1FB074D@joliv.et" class="">
                  Hello Eric,
                  <div class="">
                    <div class=""><span style="white-space: pre-wrap;" class="">src/mat/tests/ex237.c is a recent test with some code paths that should be disabled for “old” MKL versions. It’s tricky to check directly in the source (we do check in BuildSystem) because there is no such thing as PETSC_PKG_MKL_VERSION_LT, but I guess we can change if defined(PETSC_HAVE_MKL) to if defined(PETSC_HAVE_MKL) && defined(PETSC_HAVE_MKL_SPARSE_OPTIMIZE), I’ll make a MR, thanks for reporting this.</span></div>
                    <div class="">
                      <div class=""><span style="white-space: pre-wrap;" class="">
</span></div>
                      <div class="">
                        <div class="">For the other issues, I’m sensing
                          this is a problem with gomp +
                          intel_gnu_thread, but this is pure
                          speculation… sorry.</div>
                        <div class="">I’ll try to reproduce some of
                          these problems if you are not given a more
                          meaningful answer.</div>
                      </div>
                      <div class=""><span style="white-space: pre-wrap;" class="">
</span></div>
                      <div class=""><span style="white-space: pre-wrap;" class="">Thanks,</span></div>
                      <div class=""><span style="white-space: pre-wrap;" class="">Pierre</span></div>
                    </div>
                    <div class="">
                      <div class=""><span style="white-space: pre-wrap;" class="">
</span></div>
                      <div class="">
                        <blockquote type="cite" class="">
                          <div class="">On 2 Mar 2021, at 9:14 PM, Eric
                            Chamberland <<a href="mailto:Eric.Chamberland@giref.ulaval.ca" class="" moz-do-not-send="true">Eric.Chamberland@giref.ulaval.ca</a>>
                            wrote:</div>
                          <br class="Apple-interchange-newline">
                          <div class="">
                            <div class=""><p class="">Hi,</p><p class="">It all started when I wanted
                                to test PETSC/CUDA compatibility for our
                                code.</p><p class="">I had to activate
                                --with-openmp to configure with
                                --with-cuda=1 successfully.</p><p class="">I then saw that
                                PETSC_HAVE_OPENMP  is used at least in
                                MUMPS (and some other places).</p><p class="">So, I configured and tested
                                petsc with openmp activated, without
                                CUDA.<br class="">
                              </p><p class="">The first thing I noticed is that
                                our code's CI pipelines now fail many
                                tests.</p><p class="">After looking deeper, it seems
                                that PETSc itself fails many tests when
                                I activate OpenMP!</p>
                              Here are all the configurations I have
                              results for, after/before activating
                              OpenMP for PETSc:<br class=""><p class="">==============================================================================</p><p class="">==============================================================================</p><p class="">For petsc/master + OpenMPI
                                4.0.4 + MKL 2019.4.243:</p><p class="">With OpenMP=1</p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.03.02.02h00m02s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.03.02.02h00m02s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.03.02.02h00m02s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.03.02.02h00m02s_configure.log</a></p>
                              <pre style="font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; overflow-wrap: break-word; white-space: pre-wrap;" class=""># -------------
#   Summary    
# -------------
# FAILED snes_tutorials-ex12_quad_hpddm_reuse_baij diff-ksp_ksp_tests-ex33_superlu_dist_2 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1 ksp_ksp_tutorials-ex50_tut_2 diff-ksp_ksp_tests-ex33_superlu_dist diff-snes_tutorials-ex56_hypre snes_tutorials-ex17_3d_q3_trig_elas snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij ksp_ksp_tutorials-ex5_superlu_dist_3 ksp_ksp_tutorials-ex5f_superlu_dist snes_tutorials-ex12_tri_parmetis_hpddm_baij diff-snes_tutorials-ex19_tut_3 mat_tests-ex242_3 snes_tutorials-ex17_3d_q3_trig_vlap ksp_ksp_tutorials-ex5f_superlu_dist_3 snes_tutorials-ex19_superlu_dist diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre diff-ksp_ksp_tutorials-ex49_hypre_nullspace ts_tutorials-ex18_p1p1_xper_ref ts_tutorials-ex18_p1p1_xyper_ref snes_tutorials-ex19_superlu_dist_2 ksp_ksp_tutorials-ex5_superlu_dist_2 diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre ksp_ksp_tutorials-ex64_1 ksp_ksp_tutorials-ex5_superlu_dist ksp_ksp_tutorials-ex5f_superlu_dist_2
# success 8275/10003 tests (82.7%)
# <b class="">failed 33/10003</b> tests (0.3%)</pre><p class="">With OpenMP=0</p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.02.26.02h00m16s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.02.26.02h00m16s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.02.26.02h00m16s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.02.26.02h00m16s_configure.log</a><br class="">
                              </p>
                              <pre style="font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; overflow-wrap: break-word; white-space: pre-wrap;" class=""># -------------
#   Summary    
# -------------
# FAILED tao_constrained_tutorials-tomographyADMM_6 snes_tutorials-ex17_3d_q3_trig_elas mat_tests-ex242_3 snes_tutorials-ex17_3d_q3_trig_vlap tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_5
# success 8262/9983 tests (82.8%)
# <b class="">failed 6/9983</b> tests (0.1%)</pre><p class="">==============================================================================</p><p class="">==============================================================================</p><p class="">For OpenMPI 3.1.x/master:<br class="">
                              </p><p class="">With OpenMP=1:</p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.03.01.22h00m01s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.03.01.22h00m01s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.03.01.22h00m01s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.03.01.22h00m01s_configure.log</a></p>
                              <pre style="font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; overflow-wrap: break-word; white-space: pre-wrap;" class=""># -------------
#   Summary    
# -------------
# FAILED mat_tests-ex242_3 mat_tests-ex242_2 diff-mat_tests-ex219f_1 diff-dm_tutorials-ex11f90_1 ksp_ksp_tutorials-ex5_superlu_dist_3 diff-ksp_ksp_tutorials-ex49_hypre_nullspace ksp_ksp_tutorials-ex5f_superlu_dist_3 snes_tutorials-ex17_3d_q3_trig_vlap diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre diff-snes_tutorials-ex19_tut_3 diff-snes_tutorials-ex56_hypre diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_4 tao_constrained_tutorials-tomographyADMM_6 diff-tao_constrained_tutorials-toyf_1
# success 8142/9765 tests (83.4%)
# <b class="">failed 16/9765</b> tests (0.2%)</pre><p class="">With OpenMP=0:</p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.02.28.22h00m02s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.02.28.22h00m02s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.02.28.22h00m02s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.02.28.22h00m02s_configure.log</a><br class="">
                              </p>
                              <pre style="font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; overflow-wrap: break-word; white-space: pre-wrap;" class=""># -------------
#   Summary    
# -------------
# FAILED mat_tests-ex242_3 mat_tests-ex242_2 diff-mat_tests-ex219f_1 diff-dm_tutorials-ex11f90_1 ksp_ksp_tutorials-ex56_2 snes_tutorials-ex17_3d_q3_trig_vlap tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_4 diff-tao_constrained_tutorials-toyf_1
# success 8151/9767 tests (83.5%)
# <b class="">failed 9/9767</b> tests (0.1%)
</pre><p class="">==============================================================================</p><p class="">==============================================================================</p><p class="">For OpenMPI 4.0.x/master:</p><p class="">With OpenMP=1:</p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.03.01.20h00m01s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.03.01.20h00m01s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.03.01.20h00m01s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.03.01.20h00m01s_configure.log</a><br class="">
                              </p>
                              <pre style="font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; overflow-wrap: break-word; white-space: pre-wrap;" class=""># FAILED snes_tutorials-ex17_3d_q3_trig_elas snes_tutorials-ex19_hypre ksp_ksp_tutorials-ex56_2 tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_5 mat_tests-ex242_3 ksp_ksp_tutorials-ex55_hypre ksp_ksp_tutorials-ex5_superlu_dist_2 tao_constrained_tutorials-tomographyADMM_6 snes_tutorials-ex56_hypre snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre ksp_ksp_tutorials-ex5f_superlu_dist_3 ksp_ksp_tutorials-ex34_hyprestruct diff-ksp_ksp_tutorials-ex49_hypre_nullspace snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre ksp_ksp_tutorials-ex5f_superlu_dist ksp_ksp_tutorials-ex5f_superlu_dist_2 ksp_ksp_tutorials-ex5_superlu_dist snes_tutorials-ex19_tut_3 snes_tutorials-ex19_superlu_dist ksp_ksp_tutorials-ex50_tut_2 snes_tutorials-ex17_3d_q3_trig_vlap ksp_ksp_tutorials-ex5_superlu_dist_3 snes_tutorials-ex19_superlu_dist_2 tao_constrained_tutorials-tomographyADMM_4 ts_tutorials-ex26_2
# success 8125/9753 tests (83.3%)
# <b class="">failed 26/9753</b> tests (0.3%)</pre><p class="">With OpenMP=0</p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.02.28.20h00m04s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.02.28.20h00m04s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.02.28.20h00m04s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.02.28.20h00m04s_configure.log</a><br class="">
                              </p>
                              <pre style="font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; overflow-wrap: break-word; white-space: pre-wrap;" class=""># FAILED mat_tests-ex242_3
# success 8174/9777 tests (83.6%)
# <b class="">failed 1/9777</b> tests (0.0%)

</pre><p class="">==============================================================================</p><p class="">==============================================================================</p><p class="">Is that known and normal?</p><p class="">In all cases, I am using MKL
                                and I suspect the failures may come from there...
                                :/</p><p class="">I also saw a second problem:
                                "make test" fails to compile PETSc
                                examples on older versions of MKL. That
                                is less important for me, since I just
                                upgraded to oneAPI to avoid it, but
                                you may want to know:<br class="">
                              </p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/dernier_ompi/2021.03.02.02h16m01s_make_test.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/dernier_ompi/2021.03.02.02h16m01s_make_test.log</a></p><p class=""><a class="moz-txt-link-freetext" href="https://giref.ulaval.ca/~cmpgiref/dernier_ompi/2021.03.02.02h16m01s_configure.log" moz-do-not-send="true">https://giref.ulaval.ca/~cmpgiref/dernier_ompi/2021.03.02.02h16m01s_configure.log</a><br class="">
                              </p><p class=""> Thanks,</p><p class="">Eric<br class="">
                              </p>
                              <pre class="moz-signature" cols="72">-- 
Eric Chamberland, ing., M. Ing
Professionnel de recherche
GIREF/Université Laval
(418) 656-2131 poste 41 22 42</pre>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                      <br class="">
                    </div>
                  </div>
                </blockquote>
              </div>
            </div>
          </blockquote>
        </div>
        <br class="">
      </div>
    </blockquote>
  </div>

</div></blockquote></div><br class=""></body></html>