[petsc-dev] Error on Fugaku

Satish Balay balay at mcs.anl.gov
Wed Apr 14 19:15:10 CDT 2021


I don't understand the kokkos build errors.

For one - kokkos-kernels is taking a long time to build. And on the back end - I'm hitting the time limit [and the build gets terminated].

Trying a build on the front-end - it's running - for a long time..

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
4016901 a04201    20   0   11.0g  10.7g  13876 R  99.3  11.6  66:22.35 ccpcompx

The above compile process has been running for more than an hour. Ok - it's finally done.

When I enable openmp - kokkos cmake completes - but I get errors at build time [BTW: I don't understand the difference between -fopenmp and -Kopenmp]
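
On the -fopenmp vs -Kopenmp question: the flag list quoted further down in this thread is tried in order, and the mpiFCC --help text quoted below suggests the Fujitsu compiler accepts -fopenmp as well, so a first-match probe only lands on -Kopenmp if it is listed ahead of -fopenmp. Here is a rough sketch of such a probe. This is not PETSc's configure code: the function names, the test program, and the candidate order are all made up for illustration.

```python
import os
import subprocess
import tempfile

# Tiny OpenMP program; it only compiles and links if the flag enables OpenMP.
OMP_TEST_SRC = (
    "#include <omp.h>\n"
    "int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }\n"
)


def compiler_accepts(compiler, flag):
    """Return True if `compiler` builds the OpenMP test program with `flag`."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "omp_probe.c")
        exe = os.path.join(tmp, "omp_probe")
        with open(src, "w") as f:
            f.write(OMP_TEST_SRC)
        # Flags such as "-h omp" contain a space, so split them into arguments.
        cmd = [compiler, *flag.split(), src, "-o", exe]
        try:
            result = subprocess.run(cmd, capture_output=True)
        except OSError:
            return False  # compiler not found or not executable
        return result.returncode == 0


def first_working_flag(compiler, candidates, accepts=compiler_accepts):
    """Return the first flag in `candidates` that `accepts` approves, else None."""
    for flag in candidates:
        if accepts(compiler, flag):
            return flag
    return None


# Assumed order: the Fujitsu flag mentioned in this thread first, then a few
# of the flags from the quoted list.
CANDIDATES = ["-Kopenmp", "-fopenmp", "-qsmp=omp", "-h omp", "-mp"]
```

Something like first_working_flag("mpifcc", CANDIDATES) would then report which flag the C wrapper takes first. The `accepts` argument is only there so the selection logic can be exercised without a real compiler.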

login6$ ../petsc.save/arch-linux2-c-opt/lib/petsc/conf/reconfigure-arch-linux2-c-opt.py  --with-p4est-dir=$HOME/p4est-install --with-zlib-dir=$HOME/p4est-install -download-kokkos --download-kokkos-kernelsx --download-kokkos-commit=origin/develop --download-kokkos-kernels-commit=origin/develop --download-cmake=https://github.com/Kitware/CMake/releases/download/v3.20.1/cmake-3.20.1.tar.gz '--download-kokkos-cmake-arguments=-DBUILD_TESTING=OFF -DKokkos_ENABLE_LIBDL=OFF -DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON -DKokkos_ENABLE_OPENMP=ON' --ignoreLinkOutput=1 --with-openmp=1


/a04201/petsc.x/arch-linux2-c-opt/externalpackages/git.kokkos/petsc-build/core/src -I/vol0004/ra010009/a04201/petsc.x/arch-linux2-c-opt/externalpackages/git.kokkos/core/src -I/vol0004/ra010009/a04201/petsc.x/arch-linux2-c-opt/externalpackages/git.kokkos/petsc-build -O -fopenmp -fPIC -fopenmp -O -fopenmp -fPIC -fopenmp -fopenmp -std=c++14 -o CMakeFiles/kokkoscore.dir/OpenMP/Kokkos_OpenMP_Task.cpp.o -c /vol0004/ra010009/a04201/petsc.x/arch-linux2-c-opt/externalpackages/git.kokkos/core/src/OpenMP/Kokkos_OpenMP_Task.cpp
gmake[2]: Leaving directory '/vol0004/ra010009/a04201/petsc.x/arch-linux2-c-opt/externalpackages/git.kokkos/petsc-build'
gmake[1]: Leaving directory '/vol0004/ra010009/a04201/petsc.x/arch-linux2-c-opt/externalpackages/git.kokkos/petsc-build'
/home/ra010009/a04201/asmqkagl0.s: Assembler messages:
/home/ra010009/a04201/asmqkagl0.s:20333: Error: symbol `.LEHB41' is already defined
/home/ra010009/a04201/asmqkagl0.s:21334: Error: symbol `.LEHB41' is already defined
gmake[2]: *** [core/src/CMakeFiles/kokkoscore.dir/build.make:312: core/src/CMakeFiles/kokkoscore.dir/OpenMP/Kokkos_OpenMP_Exec.cpp.o] Error 1

Note: I have a successful configure run with p4est+kokkos+kokkos-kernels, without-openmp
[for this - I had to build p4est on the back-end, then switch over to front-end - and build the rest of the packages]

Now I'm stuck at compiling petsc sources - src/vec/is/sf/impls/basic/kokkos/sfkok.kokkos.cxx

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  56491 a04201    20   0 4944176   4.5g  11216 R  98.4   4.9  12:45.52 ccpcompx

Satish

On Wed, 14 Apr 2021, Mark Adams wrote:

> PETSc seems to be skipping the compiler that it should use:
> 
> -- Check for working CXX compiler:
> /opt/FJSVxtclanga/tcsds-1.2.31/bin/mpiFCC - skipped
> 
> I see -fopenmp was added. This might be wrong. I use -Kopenmp.
> 
> --help says:
> 
>        -fopenmp
> 
>        The -fopenmp option specifies to enable Specification of OpenMP
>        Application Program Interface.
> 
>        When the -fopenmp option is specified, -mt is set.
> 
>        The -fopenmp option is needed if an object program compiled with the
>        -fopenmp option exists in the command line as input files.
> 
> Any ideas?
> 
> 
> On Wed, Apr 14, 2021 at 5:17 PM Satish Balay <balay at mcs.anl.gov> wrote:
> 
> > The following is working for me on the compute node [it has completed the kokkos
> > (core) build - and is now building kokkos-kernels]
> >
> >  ~/petsc.save/arch-arm/lib/petsc/conf/reconfigure-arch-arm.py
> > --download-p4est --download-zlib --download-kokkos
> > --download-kokkos-kernels --download-kokkos-commit=origin/develop
> > --download-cmake=
> > https://github.com/Kitware/CMake/releases/download/v3.20.1/cmake-3.20.1.tar.gz
> >
> > -DKokkos_ENABLE_OPENMP=ON gave me some issues - so I didn't try any of the
> > additional options..
> >
> > Satish
> >
> > On Wed, 14 Apr 2021, Mark Adams wrote:
> >
> > > Satish, I get this error.
> > >
> > > I wonder if this syntax is wrong:
> > >
> > >     '--download-kokkos-cmake-arguments=-DCMAKE_BUILD_TYPE=Release
> > > -DBUILD_TESTING=OFF -DKokkos_ENABLE_LIBDL=OFF -DKokkos_ENABLE_OPENMP=ON
> > > -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON',
> > >
> > >
> > >
> > > =============================================================================================
> > >                               Configuring KOKKOS with cmake; this may take several minutes
> > > =============================================================================================
> > >
> > > Executing: /usr/bin/cmake ..
> > > -DCMAKE_INSTALL_PREFIX=/home/ra010009/a04199/petsc/arch-arm
> > >
> > -DCMAKE_INSTALL_NAME_DIR:STRING="/home/ra010009/a04199/petsc/arch-arm/lib"
> > > -DCMAKE_INSTALL_LIBDIR:STRING="lib" -DCMAKE_VERBOSE_MAKEFILE=1
> > > -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER="mpifcc"
> > > -DMPI_C_COMPILER="mpifcc" -DCMAKE_AR=/usr/bin/ar
> > > -DCMAKE_RANLIB=/usr/bin/ranlib -DCMAKE_C_FLAGS:STRING="-fPIC
> > -Kfast,openmp
> > > -fopenmp" -DCMAKE_C_FLAGS_DEBUG:STRING="-fPIC -Kfast,openmp -fopenmp"
> > > -DCMAKE_C_FLAGS_RELEASE:STRING="-fPIC -Kfast,openmp -fopenmp"
> > > -DCMAKE_CXX_COMPILER="mpiFCC" -DMPI_CXX_COMPILER="mpiFCC"
> > > -DCMAKE_CXX_FLAGS:STRING="-Kfast,openmp -fopenmp -fPIC -fopenmp"
> > > -DCMAKE_CXX_FLAGS_DEBUG:STRING="-Kfast,openmp -fopenmp -fPIC -fopenmp"
> > > -DCMAKE_CXX_FLAGS_RELEASE:STRING="-Kfast,openmp -fopenmp -fPIC -fopenmp"
> > > -DCMAKE_Fortran_COMPILER="mpifrt" -DMPI_Fortran_COMPILER="mpifrt"
> > > -DCMAKE_Fortran_FLAGS:STRING="-fPIC -O -fopenmp"
> > > -DCMAKE_Fortran_FLAGS_DEBUG:STRING="-fPIC -O -fopenmp"
> > > -DCMAKE_Fortran_FLAGS_RELEASE:STRING="-fPIC -O -fopenmp"
> > > -DCMAKE_EXE_LINKER_FLAGS:STRING=" -fopenmp" -DBUILD_SHARED_LIBS:BOOL=ON
> > > -DUSE_XSDK_DEFAULTS=YES -DXSDK_ENABLE_DEBUG=NO
> > > -DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=ON
> > > -DCMAKE_BUILD_WITH_INSTALL_RPATH:BOOL=ON -DKokkos_ENABLE_MPI=ON
> > > -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_OPENMP=ON
> > > -DCMAKE_CXX_STANDARD="14" -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF
> > > -DKokkos_ENABLE_LIBDL=OFF -DKokkos_ENABLE_OPENMP=ON
> > > -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON
> > > stdout:
> > > -- Setting default Kokkos CXX standard to 14
> > > -- The CXX compiler identification is Fujitsu
> > > -- Check for working CXX compiler:
> > /opt/FJSVxtclanga/tcsds-1.2.31/bin/mpiFCC
> > > -- Check for working CXX compiler:
> > > /opt/FJSVxtclanga/tcsds-1.2.31/bin/mpiFCC -- works
> > > -- Detecting CXX compiler ABI info
> > > -- Detecting CXX compiler ABI info - done
> > > -- The project name is: Kokkos
> > > -- Configuring incomplete, errors occurred!
> > > See also
> > >
> > "/vol0004/ra010009/a04199/petsc/arch-arm/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeOutput.log".
> > >                     Error configuring KOKKOS with cmake Could not execute
> > > "['/usr/bin/cmake ..
> > > -DCMAKE_INSTALL_PREFIX=/home/ra010009/a04199/petsc/arch-arm
> > >
> > -DCMAKE_INSTALL_NAME_DIR:STRING="/home/ra010009/a04199/petsc/arch-arm/lib"
> > > -DCMAKE_INSTALL_LIBDIR:STRING="lib" -DCMAKE_VERBOSE_MAKEFILE=1
> > > -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER="mpifcc"
> > > -DMPI_C_COMPILER="mpifcc" -DCMAKE_AR=/usr/bin/ar
> > > -DCMAKE_RANLIB=/usr/bin/ranlib -DCMAKE_C_FLAGS:STRING="-fPIC
> > -Kfast,openmp
> > > -fopenmp" -DCMAKE_C_FLAGS_DEBUG:STRING="-fPIC -Kfast,openmp -fopenmp"
> > > -DCMAKE_C_FLAGS_RELEASE:STRING="-fPIC -Kfast,openmp -fopenmp"
> > > -DCMAKE_CXX_COMPILER="mpiFCC" -DMPI_CXX_COMPILER="mpiFCC"
> > > -DCMAKE_CXX_FLAGS:STRING="-Kfast,openmp -fopenmp -fPIC -fopenmp"
> > > -DCMAKE_CXX_FLAGS_DEBUG:STRING="-Kfast,openmp -fopenmp -fPIC -fopenmp"
> > > -DCMAKE_CXX_FLAGS_RELEASE:STRING="-Kfast,openmp -fopenmp -fPIC -fopenmp"
> > > -DCMAKE_Fortran_COMPILER="mpifrt" -DMPI_Fortran_COMPILER="mpifrt"
> > > -DCMAKE_Fortran_FLAGS:STRING="-fPIC -O -fopenmp"
> > > -DCMAKE_Fortran_FLAGS_DEBUG:STRING="-fPIC -O -fopenmp"
> > > -DCMAKE_Fortran_FLAGS_RELEASE:STRING="-fPIC -O -fopenmp"
> > > -DCMAKE_EXE_LINKER_FLAGS:STRING=" -fopenmp" -DBUILD_SHARED_LIBS:BOOL=ON
> > > -DUSE_XSDK_DEFAULTS=YES -DXSDK_ENABLE_DEBUG=NO
> > > -DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=ON
> > > -DCMAKE_BUILD_WITH_INSTALL_RPATH:BOOL=ON -DKokkos_ENABLE_MPI=ON
> > > -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_OPENMP=ON
> > > -DCMAKE_CXX_STANDARD="14" -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF
> > > -DKokkos_ENABLE_LIBDL=OFF -DKokkos_ENABLE_OPENMP=ON
> > > -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON']":
> > > -- Setting default Kokkos CXX standard to 14
> > > -- The CXX compiler identification is Fujitsu
> > > -- Check for working CXX compiler:
> > /opt/FJSVxtclanga/tcsds-1.2.31/bin/mpiFCC
> > > -- Check for working CXX compiler:
> > > /opt/FJSVxtclanga/tcsds-1.2.31/bin/mpiFCC -- works
> > > -- Detecting CXX compiler ABI info
> > > -- Detecting CXX compiler ABI info - done
> > > -- The project name is: Kokkos
> > > -- Configuring incomplete, errors occurred!
> > > See also
> > > "/vol0004/ra010009/a04199/petsc/arch-arm/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeOutput.log".
> > > CMake Error at cmake/kokkos_compiler_id.cmake:129 (STRING):
> > >   STRING sub-command REPLACE requires at least four arguments.
> > > Call Stack (most recent call first):
> > >   cmake/kokkos_tribits.cmake:174 (INCLUDE)
> > >   CMakeLists.txt:166 (KOKKOS_SETUP_BUILD_ENVIRONMENT)
> > >
> > > On Wed, Apr 14, 2021 at 3:34 PM Satish Balay <balay at mcs.anl.gov> wrote:
> > >
> > > > Additional kokkos cmake arguments can be passed in via
> > > > --download-kokkos-cmake-arguments=string option.
> > > >
> > > > Satish
> > > >
> > > > On Wed, 14 Apr 2021, Mark Adams wrote:
> > > >
> > > > > Satish,
> > > > >
> > > > > For the fujitsu compiler OMP is -Kopenmp.
> > > > >
> > > > > Sarat (cc'ed) tells me that he built Kokkos with:
> > > > >
> > > > > cmake -DCMAKE_BUILD_TYPE=Release \
> > > > >     -DCMAKE_INSTALL_PREFIX=${KOKKOS_SRC_DIR}/install \
> > > > >     -DBUILD_TESTING=OFF \
> > > > >     -DKokkos_ENABLE_LIBDL=OFF \
> > > > >     -DKokkos_ENABLE_OPENMP=ON \
> > > > >     -DKokkos_ENABLE_SERIAL=ON \
> > > > >     -DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON \
> > > > >     ..
> > > > >
> > > > > How might I make this happen in PETSc?
> > > > >
> > > > > Thanks,
> > > > > Mark
> > > > >
> > > > >
> > > > > On Wed, Apr 14, 2021 at 2:44 PM Satish Balay <balay at mcs.anl.gov>
> > wrote:
> > > > >
> > > > > > On Wed, 14 Apr 2021, Mark Adams wrote:
> > > > > >
> > > > > > > I have this building now.
> > > > > > > Do you know anything about OpenMP?
> > > > > > > I can add --with-openmp
> > > > > > > That should get Kokkos to be made with OpenMP.
> > > > > > > Should PETSc deal with the compilers correctly?
> > > > > >
> > > > > > Well it tries the following compiler options for openmp.
> > > > > >
> > > > > >     oflags = ["-fopenmp", # Gnu
> > > > > >               "-qsmp=omp",# IBM XL C/C++
> > > > > > >               "-h omp",   # Cray. Must come after XL because XL interprets this option as meaning "-soname omp"
> > > > > >               "-mp",      # Portland Group
> > > > > >               "-Qopenmp", # Intel windows
> > > > > >               "-openmp",  # Intel
> > > > > >               "-xopenmp", # Sun
> > > > > >               "+Oopenmp", # HP
> > > > > >               "/openmp"   # Microsoft Visual Studio
> > > > > >               ]
> > > > > >
> > > > > > I don't know what the flag for fugaku compiler is.
> > > > > >
> > > > > > Satish
> > > > > >
> > > > > >
> > > > > > > Thanks,
> > > > > > > Mark
> > > > > > >
> > > > > > > On Wed, Apr 14, 2021 at 1:45 PM Mark Adams <mfadams at lbl.gov>
> > wrote:
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > If you feel inspired you could try Kokkos :||
> > > > > > > > > I am in a parking lot waiting for my daughter but can try this when I get
> > > > > > > > > home,
> > > > > > > > Thanks again,
> > > > > > > > Mark
> > > > > > > >
> > > > > > > > On Wed, Apr 14, 2021 at 1:33 PM Satish Balay <
> > balay at mcs.anl.gov>
> > > > > > wrote:
> > > > > > > >
> > > > > > > >> I think I allocated a single node - and did the build on it.
> > > > > > > >>
> > > > > > > >> Now I'm getting an error - don't know what changed..
> > > > > > > >>
> > > > > > > > >> login6$ pjsub --interact -L "node=1" -L "rscunit=rscunit_ft01" -L "rscgrp=eap-int" -L "elapse=1:00:00" --sparam "wait-time=600"
> > > > > > > >> [ERR.] PJM 0059 pjsub rscgrp=eap-int is disabled.
> > > > > > > >>
> > > > > > > >> Ok - the following worked..
> > > > > > > >>
> > > > > > > > >> login6$ pjsub --interact -L "node=1" -L "rscunit=rscunit_ft01" -L "elapse=1:00:00" --sparam "wait-time=600"
> > > > > > > >> [INFO] PJM 0000 pjsub Job 6301572 submitted.
> > > > > > > >> [INFO] PJM 0081 .connected.
> > > > > > > >> [INFO] PJM 0082 pjsub Interactive job 6301572 started.
> > > > > > > >> [a04201 at j31-3110s petsc]$
> > > > > > > >>
> > > > > > > >> Ok - trying this build now.
> > > > > > > >>
> > > > > > > >> [a04201 at j31-3110s petsc]$ cat
> > > > > > > >> ~/petsc.save/arch-arm/lib/petsc/conf/reconfigure-arch-arm.py
> > > > > > > >> #!/usr/bin/python3
> > > > > > > >> if __name__ == '__main__':
> > > > > > > >>   import sys
> > > > > > > >>   import os
> > > > > > > >>   sys.path.insert(0, os.path.abspath('config'))
> > > > > > > >>   import configure
> > > > > > > >>   configure_options = [
> > > > > > > >>     '--with-blaslapack-lib=-lfjlapack',
> > > > > > > >>     '--with-debugging=0',
> > > > > > > >>     'CC=mpifcc',
> > > > > > > >>     'CXX=mpiFCC',
> > > > > > > >>     'FC=mpifrt',
> > > > > > > >>     'PETSC_ARCH=arch-arm',
> > > > > > > >>   ]
> > > > > > > >>   configure.petsc_configure(configure_options)
> > > > > > > >> [a04201 at j31-3110s petsc]$
> > > > > > > >> ~/petsc.save/arch-arm/lib/petsc/conf/reconfigure-arch-arm.py
> > > > > > > >> --download-p4est --download-zlib
> > > > > > > >> <snip>
> > > > > > > >> p4est:
> > > > > > > >>   Includes: -I/vol0004/ra010009/a04201/petsc/arch-arm/include
> > > > > > > > >>   Library: -Wl,-rpath,/vol0004/ra010009/a04201/petsc/arch-arm/lib -L/vol0004/ra010009/a04201/petsc/arch-arm/lib -lp4est -lsc
> > > > > > > >>
> > > > > > > >> Ok - this worked for me.
> > > > > > > >>
> > > > > > > >> Satish
> > > > > > > >>
> > > > > > > >> On Wed, 14 Apr 2021, Mark Adams wrote:
> > > > > > > >>
> > > > > > > > >> > Do you recall what nodes you use to build on a "compute" node, to avoid
> > > > > > > > >> > cross compilation?
> > > > > > > >> >
> > > > > > > >> > On Wed, Apr 14, 2021 at 12:08 PM Satish Balay <
> > > > balay at mcs.anl.gov>
> > > > > > > >> wrote:
> > > > > > > >> >
> > > > > > > >> > > looks like p4est cannot be cross-compiled.
> > > > > > > >> > >
> > > > > > > >> > > Satish
> > > > > > > >> > >
> > > > > > > >> > > On Wed, 14 Apr 2021, Mark Adams wrote:
> > > > > > > >> > >
> > > > > > > >> > > > I get this error with p4est on Fugaku.
> > > > > > > >> > > > It is a Fortran error. Odd.
> > > > > > > >> > > > Mark
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > >
> >
> >
> 


