[petsc-users] NVIDIA HPC SDK and complex data type

Junchao Zhang junchao.zhang at gmail.com
Thu Feb 24 15:00:59 CST 2022


FYI: I have been notified that the nvc compiler bug was fixed in NVHPC 22.2-0.

--Junchao Zhang

On Mon, Dec 20, 2021 at 8:19 AM Jonathan D. Halverson <
halverson at princeton.edu> wrote:

> Hi Junchao,
>
> Thank you very much for your quick work. The simple build procedure now
> works.
>
> Jon
> ------------------------------
> *From:* Junchao Zhang <junchao.zhang at gmail.com>
> *Sent:* Sunday, December 19, 2021 6:38 PM
> *To:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Cc:* Jonathan D. Halverson <halverson at Princeton.EDU>
> *Subject:* Re: [petsc-users] NVIDIA HPC SDK and complex data type
>
> Since it will take a while for NVIDIA to fix the bug in their NVHPC 21.11
> (December 2021), I added a workaround to the merge request in PETSc:
> https://gitlab.com/petsc/petsc/-/merge_requests/4663
> I tested it, and it works both with debugging (-O0) and without debugging
> (-O or -O2).
>
> --Junchao Zhang
>
>
> On Sat, Dec 18, 2021 at 7:51 PM Barry Smith <bsmith at petsc.dev> wrote:
>
>
>   Yes, Junchao deserves a bounty from NVIDIA for this find.
>
> On Dec 18, 2021, at 8:22 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
> I found it is an NVIDIA C/C++ compiler bug. I can reproduce it with:
>
>
> Great find!
>
>   Matt
>
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <complex.h>
>
> typedef double _Complex PetscScalar;
> typedef struct {
>   int row;
>   PetscScalar *valaddr;
> } MatEntry2;
>
> int main(int argc, char** argv)
> {
>   int i = 2;
>   MatEntry2 *Jentry2 = (MatEntry2*)malloc(64*sizeof(MatEntry2));
>   PetscScalar a = 1, b = 1;
>
>   printf("sizeof(MatEntry2)=%lu\n", sizeof(MatEntry2));
>   Jentry2[2].valaddr = (PetscScalar*)malloc(16*sizeof(PetscScalar));
>   *(Jentry2[i].valaddr) = a*b; // Segfault
>
>   free(Jentry2[2].valaddr);
>   free(Jentry2);
>   return 0;
> }
>
> $ nvc -O0 -o test test.c
> $ ./test
> sizeof(MatEntry2)=16
> Segmentation fault (core dumped)
>
> If I change *(Jentry2[i].valaddr) = a*b; to
>
> PetscScalar *p = Jentry2[2].valaddr;
> *p = a*b;
>
> then the code works fine. Changing -O0 to -O2 also avoids the error for
> this simple test, but not for PETSc. In PETSc I could apply the above
> silly trick, but I am not sure it is worth it. We should instead report it
> to NVIDIA.
>
> Looking at the assembly code for the segfault line, we can find the
> problem:
>
>   movslq  52(%rsp), %rcx        // %rcx = i
>   movq    40(%rsp), %rax        // %rax = Jentry2
>   movq    8(%rax,%rcx,8), %rax  // Wrong: computes (%rax + %rcx*8) + 8;
>                                 // it should be (%rax + %rcx*16) + 8,
>                                 // since sizeof(MatEntry2) is 16
>   vmovsd  %xmm1, 8(%rax)
>   vmovsd  %xmm0, (%rax)
>
> --Junchao Zhang
>
>
> On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
> Hi, Jon,
>   I could reproduce the error exactly.  I will have a look.
>   Thanks for reporting.
> --Junchao Zhang
>
>
> On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <
> halverson at princeton.edu> wrote:
>
> Hello,
>
> We are unable to build PETSc using the NVIDIA HPC SDK and
> --with-scalar-type=complex. Below is our procedure:
>
> $ module load nvhpc/21.11
> $ module load openmpi/nvhpc-21.11/4.1.2/64
> $ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd petsc
> $ ./configure --with-debugging=1 --with-scalar-type=complex
> PETSC_ARCH=openmpi-power
> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check
>
> "make check" fails with a segmentation fault when running ex19. The
> fortran test ex5f passes.
>
> The procedure above fails on x86_64 and POWER, both running RHEL 8. It
> also fails with nvhpc 20.7.
>
> The procedure above works for "real" instead of "complex".
>
> A "hello world" MPI code using a complex data type works with our nvhpc
> modules.
>
> The procedure above works when GCC and an Open MPI library built with GCC
> are used.
>
> The only trouble is the combination of PETSc, nvhpc, and complex. Are
> there any known issues?
>
> The build log for the procedure above is here:
> https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log
>
> Jon
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
>
>

