<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Yes, I just checked: I only included the changes above the
      comment.</p>
    <p>Will test tomorrow, thanks for the help!</p>
    <p>Regards,</p>
    <p>Roland<br>
    </p>
    <div class="moz-cite-prefix">On 20.12.2021 at 21:46, Barry
      Smith wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:9E99DFBE-BB73-481B-BBC5-517222080A15@petsc.dev">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <div class="">Hmm, the fix should now supply
        -DCMAKE_CUDA_FLAGS to this cmake command (see line 48 at <a
          href="https://gitlab.com/petsc/petsc/-/merge_requests/4635/diffs"
          class="moz-txt-link-freetext" moz-do-not-send="true">https://gitlab.com/petsc/petsc/-/merge_requests/4635/diffs</a>).
        I do not see that flag being set in configure.log, so I
        am guessing you didn't get the complete fix.</div>
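      <div class="">One quick way to check this is to search
        configure.log for the option. The following is only a minimal
        sketch, assuming a POSIX shell; has_cmake_flag is a hypothetical
        helper name, not part of PETSc, and the path to configure.log
        is whatever applies to your build:</div>
      <pre class="">
# Hypothetical helper (not part of PETSc): succeed if the log file $1
# contains a cmake option of the form -D$2=... or -D$2:TYPE=...
has_cmake_flag() {
  grep -q -e "-D${2}[:=]" "$1"
}

# Example use; adjust the path to your own configure.log.
if has_cmake_flag configure.log CMAKE_CUDA_FLAGS; then
  echo "CMAKE_CUDA_FLAGS was passed to cmake"
else
  echo "CMAKE_CUDA_FLAGS was not found in the cmake command"
fi
</pre>
      <div class="">If the option is not found, the cmake command for
        superlu_dist was most likely generated without the complete
        change from the merge request.</div>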
      <div class=""><br class="">
      </div>
      <div class=""><br class="">
      </div>
      <div class="">Executing: /usr/bin/cmake ..
        -DCMAKE_INSTALL_PREFIX=/opt/petsc
        -DCMAKE_INSTALL_NAME_DIR:STRING="/opt/petsc/lib"
        -DCMAKE_INSTALL_LIBDIR:STRING="lib" -DCMAKE_VERBOSE_MAKEFILE=1
        -DCMAKE_BUILD_TYPE=Release
        -DCMAKE_C_COMPILER="/opt/intel/oneapi/mpi/2021.5.0/bin/mpicc"
        -DMPI_C_COMPILER="/opt/intel/oneapi/mpi/2021.5.0/bin/mpicc"
        -DCMAKE_AR=/usr/bin/ar -DCMAKE_RANLIB=/usr/bin/ranlib
        -DCMAKE_C_FLAGS:STRING="-mavx2 -march=native -O3 -fPIC -fopenmp"
        -DCMAKE_C_FLAGS_DEBUG:STRING="-mavx2 -march=native -O3 -fPIC
        -fopenmp" -DCMAKE_C_FLAGS_RELEASE:STRING="-mavx2 -march=native
        -O3 -fPIC -fopenmp"
        -DCMAKE_CXX_COMPILER="/opt/intel/oneapi/mpi/2021.5.0/bin/mpicxx"
        -DMPI_CXX_COMPILER="/opt/intel/oneapi/mpi/2021.5.0/bin/mpicxx"
        -DCMAKE_CXX_FLAGS:STRING="-mavx2 -march=native -O3 -fopenmp
        -fPIC -std=gnu++17 -fopenmp"
        -DCMAKE_CXX_FLAGS_DEBUG:STRING="-mavx2 -march=native -O3
        -fopenmp -fPIC -std=gnu++17 -fopenmp"
        -DCMAKE_CXX_FLAGS_RELEASE:STRING="-mavx2 -march=native -O3
        -fopenmp -fPIC -std=gnu++17 -fopenmp"
        -DCMAKE_Fortran_COMPILER="/opt/intel/oneapi/mpi/2021.5.0/bin/mpifc"
-DMPI_Fortran_COMPILER="/opt/intel/oneapi/mpi/2021.5.0/bin/mpifc"
        -DCMAKE_Fortran_FLAGS:STRING="-mavx2 -march=native -O3 -fPIC
        -fopenmp -fallow-argument-mismatch"
        -DCMAKE_Fortran_FLAGS_DEBUG:STRING="-mavx2 -march=native -O3
        -fPIC -fopenmp -fallow-argument-mismatch"
        -DCMAKE_Fortran_FLAGS_RELEASE:STRING="-mavx2 -march=native -O3
        -fPIC -fopenmp -fallow-argument-mismatch"
        -DCMAKE_EXE_LINKER_FLAGS:STRING=" -fopenmp -fopenmp"
        -DBUILD_SHARED_LIBS:BOOL=ON -DTPL_ENABLE_CUDALIB=TRUE
        -DTPL_CUDA_LIBRARIES="-Wl,-rpath,/usr/local/cuda-11.5/lib64
        -L/usr/local/cuda-11.5/lib64 -lcudart -lcufft -lcublas
        -lcusparse -lcusolver -lcurand
        -L/usr/local/cuda-11.5/lib64/stubs -lcuda"
        -DCUDA_ARCH_FLAGS="-I/usr/local/cuda-11.5/include -arch=sm_61
        -DDEBUGlevel=0 -DPRNTlevel=0" -DUSE_XSDK_DEFAULTS=YES
        -DTPL_BLAS_LIBRARIES="-lopenblas
        -Wl,-rpath,/usr/local/cuda-11.5/lib64
        -L/usr/local/cuda-11.5/lib64 -lcudart -lcufft -lcublas
        -lcusparse -lcusolver -lcurand
        -L/usr/local/cuda-11.5/lib64/stubs -lcuda -lm -lstdc++ -ldl
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib/release
        -L/opt/intel/oneapi/mpi/2021.5.0/lib/release
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib
        -L/opt/intel/oneapi/mpi/2021.5.0/lib -lmpifort -lmpi -lrt
        -lpthread -lgfortran -lm
        -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/11
        -L/usr/lib64/gcc/x86_64-suse-linux/11
        -Wl,-rpath,/opt/intel/oneapi/vpl/2022.0.0/lib
        -L/opt/intel/oneapi/vpl/2022.0.0/lib
        -Wl,-rpath,/opt/intel/oneapi/tbb/2021.5.0/lib/intel64/gcc4.8
        -L/opt/intel/oneapi/tbb/2021.5.0/lib/intel64/gcc4.8
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/libfabric/lib
        -L/opt/intel/oneapi/mpi/2021.5.0/libfabric/lib
        -Wl,-rpath,/opt/intel/oneapi/mkl/2022.0.1/lib/intel64
        -L/opt/intel/oneapi/mkl/2022.0.1/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/ipp/2021.5.1/lib/intel64
        -L/opt/intel/oneapi/ipp/2021.5.1/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/ippcp/2021.5.0/lib/intel64
        -L/opt/intel/oneapi/ippcp/2021.5.0/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/dnnl/2022.0.1/cpu_dpcpp_gpu_dpcpp/lib
        -L/opt/intel/oneapi/dnnl/2022.0.1/cpu_dpcpp_gpu_dpcpp/lib
        -Wl,-rpath,/opt/intel/oneapi/dal/2021.5.1/lib/intel64
        -L/opt/intel/oneapi/dal/2021.5.1/lib/intel64
-Wl,-rpath,/opt/intel/oneapi/compiler/2022.0.1/linux/compiler/lib/intel64_lin
-L/opt/intel/oneapi/compiler/2022.0.1/linux/compiler/lib/intel64_lin
        -Wl,-rpath,/opt/intel/oneapi/compiler/2022.0.1/linux/lib
        -L/opt/intel/oneapi/compiler/2022.0.1/linux/lib
        -Wl,-rpath,/opt/intel/oneapi/clck/2021.5.0/lib/intel64
        -L/opt/intel/oneapi/clck/2021.5.0/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/ccl/2021.5.0/lib/cpu_gpu_dpcpp
        -L/opt/intel/oneapi/ccl/2021.5.0/lib/cpu_gpu_dpcpp
        -Wl,-rpath,/usr/x86_64-suse-linux/lib
        -L/usr/x86_64-suse-linux/lib
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib/release
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib -lgfortran -lm
        -lgcc_s -lquadmath" -DTPL_LAPACK_LIBRARIES="-lopenblas
        -Wl,-rpath,/usr/local/cuda-11.5/lib64
        -L/usr/local/cuda-11.5/lib64 -lcudart -lcufft -lcublas
        -lcusparse -lcusolver -lcurand
        -L/usr/local/cuda-11.5/lib64/stubs -lcuda -lm -lstdc++ -ldl
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib/release
        -L/opt/intel/oneapi/mpi/2021.5.0/lib/release
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib
        -L/opt/intel/oneapi/mpi/2021.5.0/lib -lmpifort -lmpi -lrt
        -lpthread -lgfortran -lm
        -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/11
        -L/usr/lib64/gcc/x86_64-suse-linux/11
        -Wl,-rpath,/opt/intel/oneapi/vpl/2022.0.0/lib
        -L/opt/intel/oneapi/vpl/2022.0.0/lib
        -Wl,-rpath,/opt/intel/oneapi/tbb/2021.5.0/lib/intel64/gcc4.8
        -L/opt/intel/oneapi/tbb/2021.5.0/lib/intel64/gcc4.8
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/libfabric/lib
        -L/opt/intel/oneapi/mpi/2021.5.0/libfabric/lib
        -Wl,-rpath,/opt/intel/oneapi/mkl/2022.0.1/lib/intel64
        -L/opt/intel/oneapi/mkl/2022.0.1/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/ipp/2021.5.1/lib/intel64
        -L/opt/intel/oneapi/ipp/2021.5.1/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/ippcp/2021.5.0/lib/intel64
        -L/opt/intel/oneapi/ippcp/2021.5.0/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/dnnl/2022.0.1/cpu_dpcpp_gpu_dpcpp/lib
        -L/opt/intel/oneapi/dnnl/2022.0.1/cpu_dpcpp_gpu_dpcpp/lib
        -Wl,-rpath,/opt/intel/oneapi/dal/2021.5.1/lib/intel64
        -L/opt/intel/oneapi/dal/2021.5.1/lib/intel64
-Wl,-rpath,/opt/intel/oneapi/compiler/2022.0.1/linux/compiler/lib/intel64_lin
-L/opt/intel/oneapi/compiler/2022.0.1/linux/compiler/lib/intel64_lin
        -Wl,-rpath,/opt/intel/oneapi/compiler/2022.0.1/linux/lib
        -L/opt/intel/oneapi/compiler/2022.0.1/linux/lib
        -Wl,-rpath,/opt/intel/oneapi/clck/2021.5.0/lib/intel64
        -L/opt/intel/oneapi/clck/2021.5.0/lib/intel64
        -Wl,-rpath,/opt/intel/oneapi/ccl/2021.5.0/lib/cpu_gpu_dpcpp
        -L/opt/intel/oneapi/ccl/2021.5.0/lib/cpu_gpu_dpcpp
        -Wl,-rpath,/usr/x86_64-suse-linux/lib
        -L/usr/x86_64-suse-linux/lib
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib/release
        -Wl,-rpath,/opt/intel/oneapi/mpi/2021.5.0/lib -lgfortran -lm
        -lgcc_s -lquadmath" -Denable_parmetislib=FALSE
        -DTPL_ENABLE_PARMETISLIB=FALSE -DXSDK_ENABLE_Fortran=ON
        -Denable_tests=0 -Denable_examples=0
        -DMPI_C_COMPILE_FLAGS:STRING="" -DMPI_C_INCLUDE_PATH:STRING=""
        -DMPI_C_HEADER_DIR:STRING="" -DMPI_C_LIBRARIES:STRING=""</div>
      <div class=""><br class="">
      </div>
      <div class=""><br class="">
      </div>
      <div class=""><br class="">
      </div>
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">On Dec 20, 2021, at 2:59 PM, Roland Richter &lt;<a
              href="mailto:roland.richter@ntnu.no"
              class="moz-txt-link-freetext" moz-do-not-send="true">roland.richter@ntnu.no</a>&gt;
            wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <div class="">
              <p class="">I introduced the changes from that patch
                directly, without checking out. Is that insufficient?</p>
              <p class="">Regards,</p>
              <p class="">Roland<br class="">
              </p>
              <div class="moz-cite-prefix">On 20.12.2021 at 20:38,
                Barry Smith wrote:<br class="">
              </div>
              <blockquote type="cite"
                cite="mid:2B1B2836-5107-430A-BDB8-E6165EFE6C65@petsc.dev"
                class="">
                <div class=""><br class="">
                </div>
                  Are you sure you have the correct PETSc branch? From
                configure.log it has
                <div class=""><br class="">
                </div>
                <div class="">
                  <div class="">            Defined "VERSION_GIT" to
                    ""v3.16.2-466-g959e1fce86""</div>
                  <div class="">            Defined "VERSION_DATE_GIT"
                    to ""2021-12-18 11:17:24 -0600""</div>
                  <div class="">            Defined "VERSION_BRANCH_GIT"
                    to ""master""</div>
                  <div class=""><br class="">
                  </div>
                  <div class="">The branch should be
                    balay/slu-without-omp-3.</div>
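                  <div class="">A minimal sketch of how the checked-out
                    branch could be verified before rerunning configure,
                    assuming git is installed; on_branch is a hypothetical
                    helper name, and the remote name "origin" in the
                    comments is an assumption about your clone:</div>
                  <pre class="">
# Hypothetical helper: succeed only if the git checkout in directory $1
# currently has branch $2 checked out.
on_branch() {
  [ "$(git -C "$1" symbolic-ref --short HEAD)" = "$2" ]
}

# To switch to the merge-request branch first (assumes a remote named "origin"):
#   git fetch origin
#   git checkout balay/slu-without-omp-3
if on_branch . balay/slu-without-omp-3; then
  echo "on balay/slu-without-omp-3"
else
  echo "not on balay/slu-without-omp-3"
fi
</pre>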
                  <div class=""><br class="">
                  </div>
                  <div class=""><br class="">
                  </div>
                  <div class=""><br class="">
                    <blockquote type="cite" class="">
                      <div class="">On Dec 20, 2021, at 10:50 AM, Roland
                        Richter &lt;<a
                          href="mailto:roland.richter@ntnu.no"
                          class="moz-txt-link-freetext"
                          moz-do-not-send="true">roland.richter@ntnu.no</a>&gt;
                        wrote:</div>
                      <br class="Apple-interchange-newline">
                      <div class="">
                        <div class="">
                          <p class="">In that case it fails with</p>
                          <p class=""><i class="">~/Downloads/git-files/petsc/mpich-complex-linux-gcc-demo/externalpackages/git.superlu_dist/SRC/cublas_utils.h:22:10:
                              fatal error: cublas_v2.h: No such file or
                              directory</i></p>
                          <p class="">even though this header is
                            available. I assume some header paths are
                            not set correctly?</p>
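                          <p class="">A minimal sketch for confirming
                            that the header exists and only the include
                            path is missing, assuming a POSIX shell;
                            has_header is a hypothetical helper name, and
                            the directory is taken from the CUDA 11.5
                            paths that appear in the cmake command in
                            configure.log:</p>
                          <pre class="">
# Hypothetical helper: succeed if header $2 exists directly under
# include directory $1.
has_header() {
  [ -f "$1/$2" ]
}

# Assumed CUDA include directory; adjust to your installation.
if has_header /usr/local/cuda-11.5/include cublas_v2.h; then
  echo "cublas_v2.h found; the compile line probably lacks -I for this directory"
else
  echo "cublas_v2.h not found in this directory"
fi
</pre>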
                          <p class="">Thanks,</p>
                          <p class="">regards,</p>
                          <p class="">Roland<br class="">
                          </p>
                          <div class="moz-cite-prefix">On 20.12.21 at
                            16:29, Barry Smith wrote:<br class="">
                          </div>
                          <blockquote type="cite"
                            cite="mid:2B3C240D-CEBB-4B45-A60E-ECD1F092B058@petsc.dev"
                            class="">
                            <div class=""><br class="">
                            </div>
                              Please try the
                            branch balay/slu-without-omp-3. It is in MR <a
href="https://gitlab.com/petsc/petsc/-/merge_requests/4635"
                              class="moz-txt-link-freetext"
                              moz-do-not-send="true">https://gitlab.com/petsc/petsc/-/merge_requests/4635</a>
                            <div class=""><br class="">
                            </div>
                            <div class=""><br class="">
                              <div class=""><br class="">
                                <blockquote type="cite" class="">
                                  <div class="">On Dec 20, 2021, at 8:14
                                    AM, Roland Richter &lt;<a
                                      href="mailto:roland.richter@ntnu.no"
                                      class="moz-txt-link-freetext"
                                      moz-do-not-send="true">roland.richter@ntnu.no</a>&gt;
                                    wrote:</div>
                                  <br class="Apple-interchange-newline">
                                  <div class="">
                                    <div class="">
                                      <p class="">Hi,</p>
                                      <p class="">I tried to combine
                                        CUDA with superlu_dist in PETSc
                                        using the following
                                        configure line:</p>
                                      <p class=""><i class="">./configure
PETSC_ARCH=mpich-complex-linux-gcc-demo
--CC=/opt/intel/oneapi/mpi/2021.5.0/bin/mpicc
                                          --CXX=/opt/intel/oneapi/mpi/2021.5.0/bin/mpicxx
--FC=/opt/intel/oneapi/mpi/2021.5.0/bin/mpifc --CFLAGS="-mavx2
                                          -march=native -O3"
                                          --CXXFLAGS="-mavx2
                                          -march=native -O3"
                                          --FFLAGS="-mavx2 -march=native
                                          -O3"
                                          --CUDAFLAGS=-allow-unsupported-compiler
                                          --CUDA-CXX=g++
                                          --prefix=/opt/petsc
                                          --with-blaslapack=1
                                          --with-mpi=1
                                          --with-scalar-type=complex
                                          --download-suitesparse=1
                                          --with-cuda --with-debugging=0
                                          --with-openmp
                                          --download-superlu_dist
                                          --force</i></p>
                                      <p class="">but the configure step
                                        fails with several errors
                                        related to CUDA and
                                        superlu_dist, the first one
                                        being</p>
                                      <p class=""><i class="">cublas_utils.c:21:37:
                                          error: ‘CUDART_VERSION’
                                          undeclared (first use in this
                                          function); did you mean
                                          ‘CUDA_VERSION’?</i><i class=""><br
                                            class="">
                                        </i><i class="">   21 |    
                                          printf("CUDA version:   v
                                          %d\n",CUDART_VERSION);</i><i
                                          class=""><br class="">
                                        </i><i class="">     
                                          |                                    
                                          ^~~~~~~~~~~~~~</i><i class=""><br
                                            class="">
                                        </i><i class="">     
                                          |                                    
                                          CUDA_VERSION</i></p>
                                      <p class="">Compiling superlu_dist
                                        separately works, though
                                        (including CUDA).</p>
                                      <p class="">Is there a bug
                                        somewhere in the
                                        configure routine? I attached
                                        the full configure log.<br
                                          class="">
                                      </p>
                                      <p class="">Thanks!</p>
                                      <p class="">Regards,</p>
                                      <p class="">Roland<br class="">
                                      </p>
                                    </div>
                                    <span
                                      id="cid:F1E9B5DB-A92E-4C95-83FD-3F462C695FE9"
                                      class="">&lt;configure.log&gt;</span></div>
                                </blockquote>
                              </div>
                              <br class="">
                            </div>
                          </blockquote>
                        </div>
                        <span
                          id="cid:4A36F88D-58F8-40D3-A642-41C361B045C8"
                          class="">&lt;configure.log&gt;</span></div>
                    </blockquote>
                  </div>
                  <br class="">
                </div>
              </blockquote>
            </div>
          </div>
        </blockquote>
      </div>
      <br class="">
    </blockquote>
  </body>
</html>