[petsc-users] Petsc + nvhpc

howen herbert.owen at bsc.es
Thu Nov 13 11:11:04 CST 2025


Dear Junchao,

Thank you for your response, and sorry for taking so long to answer back.
I cannot avoid using the NVIDIA tools: gfortran's OpenACC support is not mature and gives us problems when compiling our code.
What I have done to enable using the latest PETSc is to write my own C code that calls PETSc.
I have little experience with C and it took me some time, but I can now use PETSc 3.24.1  ;)
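
In case it is useful, the wrapper is essentially a thin C layer around the usual KSP calls. A minimal sketch of the idea (not our actual code; the function name and arguments here are made up for illustration):

    /* Sketch of a C wrapper around PETSc's KSP, of the kind our
       Fortran+OpenACC code calls through ISO_C_BINDING. */
    #include <petscksp.h>

    /* Solve A x = b with CG; A, b, x are assembled elsewhere. */
    PetscErrorCode solve_cg(MPI_Comm comm, Mat A, Vec b, Vec x)
    {
      KSP ksp;

      PetscFunctionBeginUser;
      PetscCall(KSPCreate(comm, &ksp));
      PetscCall(KSPSetOperators(ksp, A, A));
      PetscCall(KSPSetType(ksp, KSPCG));
      PetscCall(KSPSetFromOptions(ksp)); /* pick up options such as -vec_type cuda at run time */
      PetscCall(KSPSolve(ksp, b, x));
      PetscCall(KSPDestroy(&ksp));
      PetscFunctionReturn(PETSC_SUCCESS);
    }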

The behaviour remains the same as in my original email:
parallel+GPU gives bad results, while CPU (serial and parallel) and serial GPU all work correctly and give the same result.

I have dug into PETSc a bit, comparing the CPU and GPU versions with 2 MPI ranks.
I see that the difference starts in
src/ksp/ksp/impls/cg/cg.c, line 170:
    PetscCall(KSP_PCApply(ksp, R, Z));  /*    z <- Br                           */
I have printed the vectors R and Z and the norm dp.
R is identical on CPU and GPU, but Z differs.
The correct value of dp (the first time it enters) is 14.3014, while running on the GPU with 2 MPI ranks gives 14.7493.
If you wish, I can send you the prints I introduced in cg.c.
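
Roughly the kind of prints I added around that call (a sketch, not my exact diff; the format strings are just for illustration):

    PetscReal nr, nz;
    PetscCall(VecNorm(R, NORM_2, &nr));
    PetscCall(KSP_PCApply(ksp, R, Z));  /*    z <- Br   */
    PetscCall(VecNorm(Z, NORM_2, &nz));
    PetscCall(PetscPrintf(PetscObjectComm((PetscObject)ksp), "||R|| = %g  ||Z|| = %g\n", (double)nr, (double)nz));
    PetscCall(VecView(Z, PETSC_VIEWER_STDOUT_WORLD)); /* full vector, to diff the CPU and GPU runs */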

The folder with the input files to run the case can be downloaded from https://b2drop.eudat.eu/s/wKRQ4LK7RTKz2iQ

To submit the GPU run I use
mpirun -np 2 --map-by ppr:4:node:PE=20 --report-bindings ./mn5_bind.sh /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_gpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json

For the CPU run:
mpirun -np 2 /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_cpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json

Our code can be downloaded with:
git clone --recursive https://gitlab.com/bsc_sod2d/sod2d_gitlab.git

and the branch I am using with:
git checkout 140-add-petsc

To check out exactly the commit I am using:
git checkout 09a923c9b57e46b14ae54b935845d50272691ace


I am currently using the following modules:
  1) nvidia-hpc-sdk/25.1   2) hdf5/1.14.1-2-nvidia-nvhpcx   3) cmake/3.25.1
I guess/hope similar modules should be available on any supercomputer.

To build the CPU version:
mkdir build_cpu
cd build_cpu

export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241_cpu/hhinstal
export LD_LIBRARY_PATH=$PETSC_INSTALL/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=$PETSC_INSTALL/lib:$LIBRARY_PATH
export C_INCLUDE_PATH=$PETSC_INSTALL/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=$PETSC_INSTALL/include:$CPLUS_INCLUDE_PATH
export PKG_CONFIG_PATH=$PETSC_INSTALL/lib/pkgconfig:$PKG_CONFIG_PATH

cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=OFF -DDEBUG_MODE=OFF ..
make -j 80

I have built PETSc myself as follows:

git clone -b release https://gitlab.com/petsc/petsc.git petsc
cd petsc
git checkout v3.24.1     
module purge
module load nvidia-hpc-sdk/25.1   hdf5/1.14.1-2-nvidia-nvhpcx cmake/3.25.1 
./configure --PETSC_DIR=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/petsc --prefix=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal --with-fortran-bindings=0  --with-fc=0 --with-petsc-arch=linux-x86_64-opt --with-scalar-type=real --with-debugging=yes --with-64-bit-indices=1 --with-precision=single --download-hypre CFLAGS=-I/apps/ACC/HDF5/1.14.1-2/NVIDIA/NVHPCX/include CXXFLAGS= FCFLAGS= --with-shared-libraries=1 --with-mpi=1 --with-blacs-lib=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/lib/intel64/libmkl_blacs_openmpi_lp64.a --with-blacs-include=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/include --with-mpi-dir=/apps/ACC/NVIDIA-HPC-SDK/25.1/Linux_x86_64/25.1/comm_libs/12.6/hpcx/latest/ompi/ --download-ptscotch=yes --download-metis --download-parmetis
make all check
make install

-------------------
For the GPU version, when configuring PETSc I add: --with-cuda

I then change the PETSC_INSTALL export to
export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal
and repeat all other exports

mkdir build_gpu
cd build_gpu
cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=ON -DDEBUG_MODE=OFF ..
make -j 80

As you can see from the submit instructions, the executable is found at sod2d_gitlab/build_gpu/src/app_sod2d/sod2d

I hope I have not forgotten anything and that my instructions are easy to follow. If you run into any issue, do not hesitate to contact me.
The wiki for our code can be found at https://gitlab.com/bsc_sod2d/sod2d_gitlab/-/wikis/home

Best, 

Herbert Owen
Senior Researcher, Dpt. Computer Applications in Science and Engineering
Barcelona Supercomputing Center (BSC-CNS)
Tel: +34 93 413 4038
Skype: herbert.owen

https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en

> On 16 Oct 2025, at 18:30, Junchao Zhang <junchao.zhang at gmail.com> wrote:
> 
> Hi, Herbert,
>    I don't have much experience with OpenACC, and PETSc's CI doesn't have such tests.  Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code?  If you can, then you can use the latest PETSc code and make our debugging easier.
>    Also, could you provide us with a test and instructions to reproduce the problem?
>    
>    Thanks!
> --Junchao Zhang
> 
> 
> On Thu, Oct 16, 2025 at 5:07 AM howen via petsc-users <petsc-users at mcs.anl.gov> wrote:
>> Dear All,
>> 
>> I am interfacing our CFD code (Fortran + OpenACC) to PETSc.
>> Since we use OpenACC, the natural choice for us is NVIDIA's nvhpc compiler. The GNU compiler does not work well, and we do not have access to the Cray compiler.
>> 
>> I already know that the latest version of PETSc does not compile with nvhpc, so I am using version 3.21.
>> I get good results on the CPU, both in serial and parallel (MPI). However, the GPU implementation, which is what we are interested in, only works correctly in serial. In parallel, the results are different, even for a CG solve.
>> 
>> I would like to know if you have experience with the NVIDIA compiler. I am particularly interested in whether you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing would be highly appreciated.
>> 
>> Best,
>> 
>> Herbert Owen
>> Senior Researcher, Dpt. Computer Applications in Science and Engineering
>> Barcelona Supercomputing Center (BSC-CNS)
>> Tel: +34 93 413 4038
>> Skype: herbert.owen
>> 
>> https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en