[petsc-users] Using PETSC with an openMP program
Adrián Amor
aamor at pa.uc3m.es
Mon Mar 12 06:32:25 CDT 2018
Satish,
I checked with the Intel support team and they told me that "Fortran does
not allow what it calls "recursive I/O" (except for internal files) - once
you start an I/O operation on a unit no other operation on that unit may
begin".
So the !$OMP CRITICAL directive is necessary around the write. As for why I
get the recursive I/O error when linking with PETSc and not without it, my
guess is that the number of linked libraries is too big and maybe
the I/O operation becomes slower... but everything is solved! Thanks for
looking into the problem! My code is now parallelized with OpenMP
and uses PETSc without further problems.
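
For reference, a minimal sketch of that fix applied to the hellocount test
case quoted below, with the module and main program merged into one file for
brevity. The private clause is an addition that was not discussed in the
thread: without it, nthreads and mythread are shared by all threads, which
would be consistent with the repeated thread numbers in the gfortran output
quoted below.

```fortran
module hello_count
  use omp_lib
  implicit none
contains
  subroutine hello_print()
    integer :: nthreads, mythread

    ! private(...) gives every thread its own copy of both variables
    ! (assumption: this clause is not part of the original test case).
    !$omp parallel private(nthreads, mythread)
    nthreads = omp_get_num_threads()
    mythread = omp_get_thread_num()

    ! Only one thread at a time may execute the write, so no second I/O
    ! operation can begin on the unit while another is still in progress.
    !$omp critical
    write(*,'("Hello from",i3," out of",i3)') mythread, nthreads
    !$omp end critical
    !$omp end parallel
  end subroutine hello_print
end module hello_count

program hello
  use hello_count
  implicit none
  call hello_print()
end program hello
```

It builds the same way as before, for example with ifort -qopenmp or
gfortran -fopenmp. The lines can still come out in any thread order, so
piping the output through sort -n remains useful for comparing runs.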
Thanks!
Adrian.
2018-03-02 20:25 GMT+01:00 Satish Balay <balay at mcs.anl.gov>:
> I just tried your test code with gfortran [without petsc] - and I
> don't understand it. Does gfortran not support this openmp usage?
>
> [tried gfortran 4.8.4 and 7.3.1]
>
> balay at es^/sandbox/balay/omp $ gfortran -fopenmp -c hellocount
> hellocount.F90 hellocount_main.F90
> balay at es^/sandbox/balay/omp $ gfortran -fopenmp -c hellocount.F90
> balay at es^/sandbox/balay/omp $ gfortran -fopenmp hellocount_main.F90
> hellocount.o
> balay at es^/sandbox/balay/omp $ ./a.out
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 11 out of 32
> Hello from 14 out of 32
> Hello from 14 out of 32
> Hello from 14 out of 32
> Hello from 14 out of 32
> Hello from 14 out of 32
> Hello from 14 out of 32
> Hello from 14 out of 32
>
> ifort compiled test appears to behave correctly
>
> balay at es^/sandbox/balay/omp $ ifort -qopenmp -c hellocount.F90
> balay at es^/sandbox/balay/omp $ ifort -qopenmp hellocount_main.F90
> hellocount.o
> balay at es^/sandbox/balay/omp $ ./a.out |sort -n
> Hello from 0 out of 32
> Hello from 10 out of 32
> Hello from 11 out of 32
> Hello from 12 out of 32
> Hello from 13 out of 32
> Hello from 14 out of 32
> Hello from 15 out of 32
> Hello from 16 out of 32
> Hello from 17 out of 32
> Hello from 18 out of 32
> Hello from 19 out of 32
> Hello from 1 out of 32
> Hello from 20 out of 32
> Hello from 21 out of 32
> Hello from 22 out of 32
> Hello from 23 out of 32
> Hello from 24 out of 32
> Hello from 25 out of 32
> Hello from 26 out of 32
> Hello from 27 out of 32
> Hello from 28 out of 32
> Hello from 29 out of 32
> Hello from 2 out of 32
> Hello from 30 out of 32
> Hello from 31 out of 32
> Hello from 3 out of 32
> Hello from 4 out of 32
> Hello from 5 out of 32
> Hello from 6 out of 32
> Hello from 7 out of 32
> Hello from 8 out of 32
> Hello from 9 out of 32
> balay at es^/sandbox/balay/omp
>
> Now I build petsc with:
>
> ./configure --with-cc=icc --with-mpi=0 --with-openmp --with-fc=0
> --with-cxx=0 PETSC_ARCH=arch-omp
>
> i.e
> balay at es^/sandbox/balay/omp $ ldd /sandbox/balay/petsc/arch-omp/lib/libpetsc.so
> linux-vdso.so.1 => (0x00007fff8bfb2000)
> liblapack.so.3 => /usr/lib/liblapack.so.3 (0x00007f513fbbf000)
> libblas.so.3 => /usr/lib/libblas.so.3 (0x00007f513e3b6000)
> libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6
> (0x00007f513e081000)
> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
> (0x00007f513de63000)
> libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
> (0x00007f513dc5f000)
> libimf.so => /soft/com/packages/intel/16/u3/lib/intel64/libimf.so
> (0x00007f513d761000)
> libsvml.so => /soft/com/packages/intel/16/u3/lib/intel64/libsvml.so
> (0x00007f513c855000)
> libirng.so => /soft/com/packages/intel/16/u3/lib/intel64/libirng.so
> (0x00007f513c4e3000)
> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f513c1dd000)
> libiomp5.so => /soft/com/packages/intel/16/u3/lib/intel64/libiomp5.so
> (0x00007f513be99000)
> libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
> (0x00007f513bc83000)
> libintlc.so.5 => /soft/com/packages/intel/16/u3/lib/intel64/libintlc.so.5
> (0x00007f513ba17000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f513b64e000)
> libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3
> (0x00007f513b334000)
> libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1
> (0x00007f513b115000)
> /lib64/ld-linux-x86-64.so.2 (0x00007f5142b40000)
> libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0
> (0x00007f513aed9000)
> libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6
> (0x00007f513acd5000)
> libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6
> (0x00007f513aacf000)
>
>
> And - then link in petsc with your test - and that works fine for me.
>
> balay at es^/sandbox/balay/omp $ rm -f *.o *.mod
> balay at es^/sandbox/balay/omp $ ifort -qopenmp -c hellocount.F90
> balay at es^/sandbox/balay/omp $ ifort -qopenmp hellocount_main.F90
> hellocount.o -Wl,-rpath,/sandbox/balay/petsc/arch-omp/lib
> -L/sandbox/balay/petsc/arch-omp/lib -lpetsc -liomp5
> balay at es^/sandbox/balay/omp $ ./a.out |sort -n
> Hello from 0 out of 32
> Hello from 10 out of 32
> Hello from 11 out of 32
> Hello from 12 out of 32
> Hello from 13 out of 32
> Hello from 14 out of 32
> Hello from 15 out of 32
> Hello from 16 out of 32
> Hello from 17 out of 32
> Hello from 18 out of 32
> Hello from 19 out of 32
> Hello from 1 out of 32
> Hello from 20 out of 32
> Hello from 21 out of 32
> Hello from 22 out of 32
> Hello from 23 out of 32
> Hello from 24 out of 32
> Hello from 25 out of 32
> Hello from 26 out of 32
> Hello from 27 out of 32
> Hello from 28 out of 32
> Hello from 29 out of 32
> Hello from 2 out of 32
> Hello from 30 out of 32
> Hello from 31 out of 32
> Hello from 3 out of 32
> Hello from 4 out of 32
> Hello from 5 out of 32
> Hello from 6 out of 32
> Hello from 7 out of 32
> Hello from 8 out of 32
> Hello from 9 out of 32
>
> Satish
>
>
> On Fri, 2 Mar 2018, Adrián Amor wrote:
>
> > Thanks Satish, I tried the procedure you suggested and I get the same
> > performance, so I guess that MKL is not a problem in this case (I agree
> > with you that it has to be improved though... my makefile is a little
> > chaotic with all the libraries that I use).
> >
> > And thanks Barry and Matthew! I'll ask on the Intel compiler forum,
> > since I also think that this is a problem related to the compiler, and if
> > I make any progress I'll let you know! In the end, I guess I'll drop
> > acceleration through OpenMP threads...
> >
> > Thanks all!
> >
> > Adrian.
> >
> > 2018-03-02 17:11 GMT+01:00 Satish Balay <balay at mcs.anl.gov>:
> >
> > > When using MKL - PETSc attempts to default to sequential MKL.
> > >
> > > Perhaps this pulls in a *conflicting* dependency against -liomp5 - and
> > > one has to use threaded MKL for this case. i.e not use
> > > -lmkl_sequential
> > >
> > > You appear to have multiple MKL libraries linked in - it's not clear
> > > what they are for - and if there are any conflicts there.
> > >
> > > > -L/opt/intel/compilers_and_libraries_2016.1.150/linux/mkl/lib/intel64
> > > > -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpetsc -lmkl_intel_lp64
> > > > -lmkl_intel_thread -lmkl_core -lmkl_lapack95_lp64 -liomp5 -lpthread -lm
> > >
> > > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread
> > >
> > > To test this out - suggest rebuilding PETSc with
> > > --download-fblaslapack [and no mkl or related packages] - and then run
> > > this test case you have [with openmp]
> > >
> > > And then add back one mkl package at a time..
> > >
> > > Satish
> > >
> > >
> > > On Fri, 2 Mar 2018, Adrián Amor wrote:
> > >
> > > > Hi all,
> > > >
> > > > I have been working for the last few months with PETSc in a FEM program
> > > > written in Fortran, so far sequential. Now I want to parallelize it with
> > > > OpenMP, and I have run into some problems. I have built a mock-up program
> > > > to try to isolate the error.
> > > >
> > > > 1. I have compiled PETSC with these options:
> > > > ./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort
> > > > --with-blas-lapack-dir=/opt/intel/mkl/lib/intel64/ --with-debugging=1
> > > > --with-scalar-type=complex --with-threadcomm --with-pthreadclasses
> > > > --with-openmp
> > > > --with-openmp-include=/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > --with-openmp-lib=/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin/libiomp5.a
> > > > PETSC_ARCH=linux-intel-dbg PETSC-AVOID-MPIF-H=1
> > > >
> > > > (I have also tried removing --with-threadcomm --with-pthreadclasses, and
> > > > with libiomp5.so).
> > > >
> > > > 2. The program to be executed is composed of two files, one is
> > > > hellocount.F90:
> > > > MODULE hello_count
> > > > use omp_lib
> > > > IMPLICIT none
> > > >
> > > > CONTAINS
> > > > subroutine hello_print ()
> > > > integer :: nthreads,mythread
> > > >
> > > > !pragma hello-who-omp-f
> > > > !$omp parallel
> > > > nthreads = omp_get_num_threads()
> > > > mythread = omp_get_thread_num()
> > > > write(*,'("Hello from",i3," out of",i3)') mythread,nthreads
> > > > !$omp end parallel
> > > > !pragma end
> > > > end subroutine hello_print
> > > > END MODULE hello_count
> > > >
> > > > and the other one is hellocount_main.F90:
> > > > Program Hello
> > > >
> > > > USE hello_count
> > > >
> > > > call hello_print
> > > >
> > > > STOP
> > > >
> > > > end Program Hello
> > > >
> > > > 3. To compile these two files I use:
> > > > rm -rf _obj
> > > > mkdir _obj
> > > >
> > > > ifort -E -I/home/aamor/petsc/include
> > > > -I/home/aamor/petsc/linux-intel-dbg/include -c hellocount.F90
> > > > >_obj/hellocount.f90
> > > > ifort -E -I/home/aamor/petsc/include
> > > > -I/home/aamor/petsc/linux-intel-dbg/include -c hellocount_main.F90
> > > > >_obj/hellocount_main.f90
> > > >
> > > > mpiifort -CB -g -warn all -O0 -shared-intel -check:none -qopenmp -module
> > > > _obj -I./_obj -I/home/aamor/MUMPS_5.1.2/include
> > > > -I/opt/intel/compilers_and_libraries_2016.1.150/linux/mkl/include
> > > > -I/opt/intel/compilers_and_libraries_2016.1.150/linux/mkl/include/intel64/lp64/
> > > > -I/home/aamor/petsc/include -I/home/aamor/petsc/linux-intel-dbg/include
> > > > -o _obj/hellocount.o -c _obj/hellocount.f90
> > > > mpiifort -CB -g -warn all -O0 -shared-intel -check:none -qopenmp -module
> > > > _obj -I./_obj -I/home/aamor/MUMPS_5.1.2/include
> > > > -I/opt/intel/compilers_and_libraries_2016.1.150/linux/mkl/include
> > > > -I/opt/intel/compilers_and_libraries_2016.1.150/linux/mkl/include/intel64/lp64/
> > > > -I/home/aamor/petsc/include -I/home/aamor/petsc/linux-intel-dbg/include
> > > > -o _obj/hellocount_main.o -c _obj/hellocount_main.f90
> > > >
> > > > mpiifort -CB -g -warn all -O0 -shared-intel -check:none -qopenmp -module
> > > > _obj -I./_obj -o exec/HELLO _obj/hellocount.o _obj/hellocount_main.o
> > > > /home/aamor/lib_tmp/libarpack_LinuxIntel15.a
> > > > /home/aamor/MUMPS_5.1.2/lib/libzmumps.a
> > > > /home/aamor/MUMPS_5.1.2/lib/libmumps_common.a
> > > > /home/aamor/MUMPS_5.1.2/lib/libpord.a
> > > > /home/aamor/parmetis-4.0.3/lib/libparmetis.a
> > > > /home/aamor/parmetis-4.0.3/lib/libmetis.a
> > > > -L/opt/intel/compilers_and_libraries_2016.1.150/linux/mkl/lib/intel64
> > > > -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpetsc -lmkl_intel_lp64
> > > > -lmkl_intel_thread -lmkl_core -lmkl_lapack95_lp64 -liomp5 -lpthread -lm
> > > > -L/home/aamor/lib_tmp -lgidpost -lz /home/aamor/lua-5.3.3/src/liblua.a
> > > > /home/aamor/ESEAS-master/libeseas.a
> > > > -Wl,-rpath,/home/aamor/petsc/linux-intel-dbg/lib
> > > > -L/home/aamor/petsc/linux-intel-dbg/lib
> > > > -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -L/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > -L/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7
> > > > -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7
> > > > -Wl,-rpath,/opt/intel/mpi-rt/5.1/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/mpi-rt/5.1/intel64/lib -lmkl_intel_lp64
> > > > -lmkl_sequential -lmkl_core -lpthread -lX11 -lssl -lcrypto -lifport
> > > > -lifcore_pic -lmpicxx -ldl
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib -lmpifort
> > > > -lmpi -lmpigi -lrt -lpthread
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -L/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > -L/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7
> > > > -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -L/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -Wl,-rpath,/opt/intel/mpi-rt/5.1/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/mpi-rt/5.1/intel64/lib -limf -lsvml -lirng -lm -lipgo
> > > > -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib/debug_mt
> > > > -Wl,-rpath,/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -L/opt/intel/impi/5.1.2.150/intel64/lib
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -L/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > -L/opt/intel/compilers_and_libraries_2016.1.150/linux/compiler/lib/intel64_lin
> > > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7
> > > > -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7
> > > > -Wl,-rpath,/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64
> > > > -L/opt/intel/compilers_and_libraries_2016/linux/mkl/lib/intel64 -ldl
> > > >
> > > > exec/HELLO
> > > >
> > > > 4. Then I have seen that:
> > > > 4.1. If I set OMP_NUM_THREADS=2 and remove -lpetsc and -lifcore_pic from
> > > > the last step, I get:
> > > > Hello from 0 out of 2
> > > > Hello from 1 out of 2
> > > > 4.2. But if I add -lpetsc and -lifcore_pic (because I want to use PETSc), I
> > > > get this error:
> > > > Hello from 0 out of 2
> > > > forrtl: severe (40): recursive I/O operation, unit -1, file unknown
> > > > Image            PC                Routine  Line     Source
> > > > HELLO            000000000041665C  Unknown  Unknown  Unknown
> > > > HELLO            00000000004083C8  Unknown  Unknown  Unknown
> > > > libiomp5.so      00007F9C603566A3  Unknown  Unknown  Unknown
> > > > libiomp5.so      00007F9C60325007  Unknown  Unknown  Unknown
> > > > libiomp5.so      00007F9C603246F5  Unknown  Unknown  Unknown
> > > > libiomp5.so      00007F9C603569C3  Unknown  Unknown  Unknown
> > > > libpthread.so.0  0000003CE76079D1  Unknown  Unknown  Unknown
> > > > libc.so.6        0000003CE6AE88FD  Unknown  Unknown  Unknown
> > > > If I set OMP_NUM_THREADS to 8, I get:
> > > > forrtl: severe (40): recursive I/O operation, unit -1, file unknown
> > > > forrtl: severe (40): recursive I/O operation, unit -1, file unknown
> > > > forrtl: severe (40): recursive I/O operation, unit -1, file unknown
> > > >
> > > > I am sorry if this is a trivial problem, because I guess that lots of
> > > > people use PETSc with OpenMP in Fortran, but I have really done my best to
> > > > figure out where the error is. Can you help me?
> > > >
> > > > Thanks a lot!
> > > >
> > > > Adrian.
> > > >
> > >
> >
>