[petsc-users] NVIDIA HPC SDK and complex data type
Barry Smith
bsmith at petsc.dev
Sat Dec 18 19:51:09 CST 2021
Yes, Junchao deserves a bounty from NVIDIA for this find.
> On Dec 18, 2021, at 8:22 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang <junchao.zhang at gmail.com> wrote:
> I found it is an NVIDIA C/C++ compiler bug. I can reproduce it with
>
> Great find!
>
> Matt
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <complex.h>
>
> typedef double _Complex PetscScalar;
> typedef struct {
>   int row;
>   PetscScalar *valaddr;
> } MatEntry2;
>
> int main(int argc, char** argv)
> {
>   int i=2;
>   MatEntry2 *Jentry2 = (MatEntry2*)malloc(64*sizeof(MatEntry2));
>   PetscScalar a=1, b=1;
>
>   printf("sizeof(MatEntry2)=%zu\n",sizeof(MatEntry2));
>   Jentry2[2].valaddr = (PetscScalar*)malloc(16*sizeof(PetscScalar));
>   *(Jentry2[i].valaddr) = a*b; // Segfault
>
>   free(Jentry2[2].valaddr);
>   free(Jentry2);
>   return 0;
> }
>
> $ nvc -O0 -o test test.c
> $ ./test
> sizeof(MatEntry2)=16
> Segmentation fault (core dumped)
>
> If I change *(Jentry2[i].valaddr) = a*b; to
>
> PetscScalar *p = Jentry2[2].valaddr;
> *p = a*b;
>
> Then the code works fine. Changing -O0 to -O2 also avoids the error for this simple test, but not for PETSc. In PETSc I could apply the same trick, but I am not sure it is worth it; we should instead report the bug to NVIDIA.
>
> Looking at the assembly for the segfaulting line, we can see the problem:
> movslq 52(%rsp), %rcx
> movq 40(%rsp), %rax
> movq 8(%rax,%rcx,8), %rax // Here %rax = Jentry2 and %rcx = i; this wrongly computes &Jentry2[i].valaddr as (%rax + %rcx*8)+8, when it should be (%rax + %rcx*16)+8 since sizeof(MatEntry2) is 16
> vmovsd %xmm1, 8(%rax)
> vmovsd %xmm0, (%rax)
>
> --Junchao Zhang
>
>
> On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <junchao.zhang at gmail.com> wrote:
> Hi, Jon,
> I could reproduce the error exactly. I will have a look.
> Thanks for reporting.
> --Junchao Zhang
>
>
> On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <halverson at princeton.edu> wrote:
> Hello,
>
> We are unable to build PETSc using the NVIDIA HPC SDK and --with-scalar-type=complex. Below is our procedure:
>
> $ module load nvhpc/21.11
> $ module load openmpi/nvhpc-21.11/4.1.2/64
> $ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd petsc
> $ ./configure --with-debugging=1 --with-scalar-type=complex PETSC_ARCH=openmpi-power
> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check
>
> "make check" fails with a segmentation fault when running ex19. The Fortran test ex5f passes.
>
> The procedure above fails on both x86_64 and POWER running RHEL 8. It also fails with nvhpc 20.7.
>
> The procedure above works for "real" instead of "complex".
>
> A "hello world" MPI code using a complex data type works with our nvhpc modules.
>
> The procedure above works when GCC and an Open MPI library built with GCC are used.
>
> The only trouble is the combination of PETSc with nvhpc and complex. Are there any known issues?
>
> The build log for the procedure above is here:
> https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log
>
> Jon
>
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/