<div dir="ltr">Thanks Satish, I tried the procedure you suggested and I get the same behavior, so I guess that MKL is not the problem in this case (I agree with you that the link line has to be cleaned up, though... my makefile is a little chaotic with all the libraries that I use).<div><br></div><div>And thanks Barry and Matthew! I'll ask on the Intel compiler forum, since I also think that this problem is related to the compiler, and if I make any progress I'll let you know! In the end, I guess I'll drop acceleration through OpenMP threads...</div><div><br></div><div>Thanks all!</div><div><br></div><div>Adrian.<br><div class="gmail_extra"><br><div class="gmail_quote">2018-03-02 17:11 GMT+01:00 Satish Balay <span dir="ltr"><<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">When using MKL, PETSc attempts to default to sequential MKL.<br>
<br>
Perhaps this pulls in a *conflicting* dependency against -liomp5, and<br>
one has to use threaded MKL in this case, i.e., not use<br>
-lmkl_sequential.<br>
<br>
You appear to have multiple MKL libraries linked in. It's not clear<br>
what they are for, or whether there are any conflicts there.<br>
<span class=""><br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>mkl/lib/intel64<br>
> -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpetsc -lmkl_intel_lp64<br>
> -lmkl_intel_thread -lmkl_core -lmkl_lapack95_lp64 -liomp5 -lpthread -lm<br>
<br>
</span>> -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread<br>
<br>
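A crude textual screen for this kind of clash could look like the sketch below. The function name check_mkl_threading is hypothetical (it is not part of PETSc or MKL), and it only looks for the two flag names quoted above in a link line passed as one string; it says nothing about which library the linker actually resolves symbols from.<br>
<br>
```shell
#!/bin/sh
# Hypothetical helper (not part of PETSc or MKL): report whether a link
# line, passed as a single string, names both MKL threading layers.
check_mkl_threading() {
    threaded=no
    sequential=no
    # Rely on word splitting to walk the flags one by one.
    for flag in $1; do
        case "$flag" in
            -lmkl_intel_thread) threaded=yes ;;
            -lmkl_sequential) sequential=yes ;;
        esac
    done
    if [ "$threaded" = yes ]; then
        if [ "$sequential" = yes ]; then
            echo "conflict: threaded and sequential MKL layers are both linked"
            return 1
        fi
    fi
    echo "ok: at most one MKL threading layer is linked"
    return 0
}

# Flags taken from the two link-line fragments quoted above.
check_mkl_threading "-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" || true
```
<br>
A non-zero return status marks the conflicting case, so the check could be dropped into a makefile rule before the final link step.<br>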
To test this, I suggest rebuilding PETSc with<br>
--download-fblaslapack [and no MKL or related packages], and then running<br>
the test case you have [with OpenMP].<br>
<br>
Then add back one MKL package at a time.<br>
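<br>
The add-back step can be organized as a list of cumulative flag sets to append to the baseline link command. This is only a sketch: mkl_steps is a hypothetical helper, the order of the libraries is an assumption, and the function just prints the flag sets (it does not link or run anything).<br>
<br>
```shell
#!/bin/sh
# Hypothetical helper: print one cumulative set of -l flags per step,
# so each step adds exactly one more library to the previous set.
mkl_steps() {
    acc=""
    for lib in "$@"; do
        # Append the next library, separated by a space after the first one.
        acc="${acc:+$acc }-l$lib"
        printf '%s\n' "$acc"
    done
}

# Library names taken from the link line quoted above; the order is an assumption.
mkl_steps mkl_intel_lp64 mkl_intel_thread mkl_core \
          mkl_scalapack_lp64 mkl_blacs_intelmpi_lp64 mkl_lapack95_lp64
```
<br>
Each printed line is a candidate set of flags to relink and rerun the test case with; the first line at which the error reappears points at the library that was just added.<br>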
<span class="HOEnZb"><font color="#888888"><br>
Satish<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
On Fri, 2 Mar 2018, Adrián Amor wrote:<br>
<br>
> Hi all,<br>
><br>
> For the last few months I have been working with PETSc in a FEM program<br>
> written in Fortran, so far sequential. Now I want to parallelize it with<br>
> OpenMP and I have run into some problems. I have built a mock-up program<br>
> to try to isolate the error.<br>
><br>
> 1. I have compiled PETSc with these options:<br>
> ./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort<br>
> --with-blas-lapack-dir=/opt/<wbr>intel/mkl/lib/intel64/ --with-debugging=1<br>
> --with-scalar-type=complex --with-threadcomm --with-pthreadclasses<br>
> --with-openmp<br>
> --with-openmp-include=/opt/<wbr>intel/compilers_and_libraries_<wbr>2016.1.150/linux/compiler/lib/<wbr>intel64_lin<br>
> --with-openmp-lib=/opt/intel/<wbr>compilers_and_libraries_2016.<wbr>1.150/linux/compiler/lib/<wbr>intel64_lin/libiomp5.a<br>
> PETSC_ARCH=linux-intel-dbg PETSC-AVOID-MPIF-H=1<br>
><br>
> (I have also tried removing --with-threadcomm and --with-pthreadclasses,<br>
> and using libiomp5.so.)<br>
><br>
> 2. The program to be executed is composed of two files, one is<br>
> hellocount.F90:<br>
> MODULE hello_count<br>
>   use omp_lib<br>
>   IMPLICIT none<br>
><br>
>   CONTAINS<br>
>   subroutine hello_print ()<br>
>      integer :: nthreads,mythread<br>
><br>
>    !pragma hello-who-omp-f<br>
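>    ! each thread needs its own copy of these variables to avoid a data race<br>
>    !$omp parallel private(nthreads, mythread)<br>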
>      nthreads = omp_get_num_threads()<br>
>      mythread = omp_get_thread_num()<br>
>      write(*,'("Hello from",i3," out of",i3)') mythread,nthreads<br>
>    !$omp end parallel<br>
>    !pragma end<br>
>    end subroutine hello_print<br>
> END MODULE hello_count<br>
><br>
> and the other one is hellocount_main.F90:<br>
> Program Hello<br>
><br>
>    USE hello_count<br>
><br>
>    call hello_print<br>
><br>
>    STOP<br>
><br>
> end Program Hello<br>
><br>
> 3. To compile these two files I use:<br>
> rm -rf _obj<br>
> mkdir _obj<br>
><br>
> ifort -E -I/home/aamor/petsc/include<br>
> -I/home/aamor/petsc/linux-<wbr>intel-dbg/include -c hellocount.F90<br>
> >_obj/hellocount.f90<br>
> ifort -E -I/home/aamor/petsc/include<br>
> -I/home/aamor/petsc/linux-<wbr>intel-dbg/include -c hellocount_main.F90<br>
> >_obj/hellocount_main.f90<br>
><br>
> mpiifort -CB -g -warn all -O0 -shared-intel -check:none -qopenmp -module<br>
> _obj -I./_obj -I/home/aamor/MUMPS_5.1.2/<wbr>include<br>
>  -I/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>mkl/include<br>
> -I/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>mkl/include/intel64/lp64/<br>
> -I/home/aamor/petsc/include -I/home/aamor/petsc/linux-<wbr>intel-dbg/include -o<br>
> _obj/hellocount.o -c _obj/hellocount.f90<br>
> mpiifort -CB -g -warn all -O0 -shared-intel -check:none -qopenmp -module<br>
> _obj -I./_obj -I/home/aamor/MUMPS_5.1.2/<wbr>include<br>
>  -I/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>mkl/include<br>
> -I/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>mkl/include/intel64/lp64/<br>
> -I/home/aamor/petsc/include -I/home/aamor/petsc/linux-<wbr>intel-dbg/include -o<br>
> _obj/hellocount_main.o -c _obj/hellocount_main.f90<br>
><br>
> mpiifort -CB -g -warn all -O0 -shared-intel -check:none -qopenmp -module<br>
> _obj -I./_obj -o exec/HELLO _obj/hellocount.o _obj/hellocount_main.o<br>
> /home/aamor/lib_tmp/libarpack_<wbr>LinuxIntel15.a<br>
> /home/aamor/MUMPS_5.1.2/lib/<wbr>libzmumps.a<br>
> /home/aamor/MUMPS_5.1.2/lib/<wbr>libmumps_common.a<br>
> /home/aamor/MUMPS_5.1.2/lib/<wbr>libpord.a<br>
> /home/aamor/parmetis-4.0.3/<wbr>lib/libparmetis.a<br>
> /home/aamor/parmetis-4.0.3/<wbr>lib/libmetis.a<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>mkl/lib/intel64<br>
> -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpetsc -lmkl_intel_lp64<br>
> -lmkl_intel_thread -lmkl_core -lmkl_lapack95_lp64 -liomp5 -lpthread -lm<br>
> -L/home/aamor/lib_tmp -lgidpost -lz /home/aamor/lua-5.3.3/src/<wbr>liblua.a<br>
> /home/aamor/ESEAS-master/<wbr>libeseas.a<br>
> -Wl,-rpath,/home/aamor/petsc/<wbr>linux-intel-dbg/lib<br>
> -L/home/aamor/petsc/linux-<wbr>intel-dbg/lib<br>
> -Wl,-rpath,/opt/intel/mkl/lib/<wbr>intel64 -L/opt/intel/mkl/lib/intel64<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016/<wbr>linux/mkl/lib/intel64<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016/linux/mkl/lib/<wbr>intel64<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016.<wbr>1.150/linux/compiler/lib/<wbr>intel64_lin<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>compiler/lib/intel64_lin<br>
> -Wl,-rpath,/usr/lib/gcc/x86_<wbr>64-redhat-linux/4.4.7<br>
> -L/usr/lib/gcc/x86_64-redhat-<wbr>linux/4.4.7<br>
> -Wl,-rpath,/opt/intel/mpi-rt/<wbr>5.1/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/mpi-rt/<wbr>5.1/intel64/lib -lmkl_intel_lp64<br>
> -lmkl_sequential -lmkl_core -lpthread -lX11 -lssl -lcrypto -lifport<br>
> -lifcore_pic -lmpicxx -ldl<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib -lmpifort<br>
> -lmpi -lmpigi -lrt -lpthread<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016/<wbr>linux/mkl/lib/intel64<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016/linux/mkl/lib/<wbr>intel64<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016.<wbr>1.150/linux/compiler/lib/<wbr>intel64_lin<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>compiler/lib/intel64_lin<br>
> -Wl,-rpath,/usr/lib/gcc/x86_<wbr>64-redhat-linux/4.4.7<br>
> -L/usr/lib/gcc/x86_64-redhat-<wbr>linux/4.4.7<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016/<wbr>linux/mkl/lib/intel64<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016/linux/mkl/lib/<wbr>intel64<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -Wl,-rpath,/opt/intel/mpi-rt/<wbr>5.1/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/mpi-rt/<wbr>5.1/intel64/lib -limf -lsvml -lirng -lm -lipgo<br>
> -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt<br>
> -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -L/opt/intel/impi/5.1.2.150/intel64/lib<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016/<wbr>linux/mkl/lib/intel64<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016/linux/mkl/lib/<wbr>intel64<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016.<wbr>1.150/linux/compiler/lib/<wbr>intel64_lin<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016.1.150/linux/<wbr>compiler/lib/intel64_lin<br>
> -Wl,-rpath,/usr/lib/gcc/x86_<wbr>64-redhat-linux/4.4.7<br>
> -L/usr/lib/gcc/x86_64-redhat-<wbr>linux/4.4.7<br>
> -Wl,-rpath,/opt/intel/<wbr>compilers_and_libraries_2016/<wbr>linux/mkl/lib/intel64<br>
> -L/opt/intel/compilers_and_<wbr>libraries_2016/linux/mkl/lib/<wbr>intel64 -ldl<br>
><br>
> exec/HELLO<br>
><br>
> 4. Then I have seen that:<br>
> 4.1. If I set OMP_NUM_THREADS=2 and remove -lpetsc and -lifcore_pic from<br>
> the last step, I get:<br>
> Hello from  0 out of  2<br>
> Hello from  1 out of  2<br>
> 4.2. But if I add -lpetsc and -lifcore_pic (because I want to use PETSc) I get<br>
> this error:<br>
> Hello from  0 out of  2<br>
> forrtl: severe (40): recursive I/O operation, unit -1, file unknown<br>
> Image              PC                Routine            Line        Source<br>
> HELLO              000000000041665C  Unknown               Unknown  Unknown<br>
> HELLO              00000000004083C8  Unknown               Unknown  Unknown<br>
> libiomp5.so        00007F9C603566A3  Unknown               Unknown  Unknown<br>
> libiomp5.so        00007F9C60325007  Unknown               Unknown  Unknown<br>
> libiomp5.so        00007F9C603246F5  Unknown               Unknown  Unknown<br>
> libiomp5.so        00007F9C603569C3  Unknown               Unknown  Unknown<br>
> libpthread.so.0    0000003CE76079D1  Unknown               Unknown  Unknown<br>
> libc.so.6          0000003CE6AE88FD  Unknown               Unknown  Unknown<br>
> If I set OMP_NUM_THREADS to 8, I get:<br>
> forrtl: severe (40): recursive I/O operation, unit -1, file unknown<br>
> forrtl: severe (40): recursive I/O operation, unit -1, file unknown<br>
> forrtl: severe (40): recursive I/O operation, unit -1, file unknown<br>
><br>
> I am sorry if this is a trivial problem, since I guess that lots of people<br>
> use PETSc with OpenMP in Fortran, but I have really done my best to figure<br>
> out where the error is. Can you help me?<br>
><br>
> Thanks a lot!<br>
><br>
> Adrian.<br>
><br>
</div></div></blockquote></div><br></div></div></div>