[petsc-users] NVIDIA HPC SDK and complex data type

Junchao Zhang junchao.zhang at gmail.com
Sat Dec 18 19:02:47 CST 2021


I found it is an NVIDIA C/C++ compiler bug.  I can reproduce it with:

#include <stdlib.h>
#include <stdio.h>
#include <complex.h>

typedef double _Complex PetscScalar;
typedef struct {
  int         row;
  PetscScalar *valaddr;
} MatEntry2;

int main(int argc, char **argv)
{
  int         i = 2;
  MatEntry2  *Jentry2 = (MatEntry2 *)malloc(64 * sizeof(MatEntry2));
  PetscScalar a = 1, b = 1;

  printf("sizeof(MatEntry2)=%lu\n", sizeof(MatEntry2));
  Jentry2[2].valaddr = (PetscScalar *)malloc(16 * sizeof(PetscScalar));
  *(Jentry2[i].valaddr) = a * b; // Segfault: the store through Jentry2[i].valaddr is miscompiled

  free(Jentry2[2].valaddr);
  free(Jentry2);
  return 0;
}

$ nvc -O0 -o test test.c
$ ./test
sizeof(MatEntry2)=16
Segmentation fault (core dumped)

If I change *(Jentry2[i].valaddr) = a*b; to

PetscScalar *p = Jentry2[2].valaddr;
*p = a*b;

then the code works fine.  Changing -O0 to -O2 also avoids the error for
this simple test, but not for PETSc.  In PETSc I could apply the above
silly trick, but I am not sure it is worth it.  We should instead report
the bug to NVIDIA.

Looking at the assembly code for the segfaulting line, we can see the problem:

  movslq  52(%rsp), %rcx        // %rcx = i
  movq    40(%rsp), %rax        // %rax = Jentry2 (the array base)
  movq    8(%rax,%rcx,8), %rax  // Wrong: loads Jentry2[i].valaddr from (%rax + %rcx*8) + 8,
                                // but with sizeof(MatEntry2) = 16 it should be (%rax + %rcx*16) + 8
  vmovsd  %xmm1, 8(%rax)        // stores of the real/imaginary parts of a*b go through
  vmovsd  %xmm0, (%rax)         // the wrongly loaded pointer, hence the segfault
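
As a cross-check on the expected address arithmetic (a small standalone
illustration, not part of the reproducer above), C struct layout puts
Jentry2[i].valaddr at base + i*sizeof(MatEntry2) + offsetof(MatEntry2, valaddr)
= base + i*16 + 8, which is exactly the (%rax + %rcx*16) + 8 addressing the
compiler should have emitted:

#include <stddef.h>
#include <stdio.h>
#include <complex.h>

typedef double _Complex PetscScalar;
typedef struct {
  int         row;
  PetscScalar *valaddr;
} MatEntry2;

int main(void)
{
  MatEntry2 e[4];
  int       i = 2;

  /* valaddr sits at offset 8 and each element is 16 bytes wide */
  printf("offsetof(MatEntry2, valaddr) = %zu, sizeof(MatEntry2) = %zu\n",
         offsetof(MatEntry2, valaddr), sizeof(MatEntry2));

  /* &e[i].valaddr must equal base + i*16 + 8, i.e. the scale-16 addressing */
  printf("address check: %d\n",
         (char *)&e[i].valaddr ==
         (char *)e + i * sizeof(MatEntry2) + offsetof(MatEntry2, valaddr));
  return 0;
}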

--Junchao Zhang


On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <junchao.zhang at gmail.com>
wrote:

> Hi, Jon,
>   I could reproduce the error exactly.  I will have a look.
>   Thanks for reporting.
> --Junchao Zhang
>
>
> On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <
> halverson at princeton.edu> wrote:
>
>> Hello,
>>
>> We are unable to build PETSc using the NVIDIA HPC SDK and
>> --with-scalar-type=complex. Below is our procedure:
>>
>> $ module load nvhpc/21.11
>>
>> $ module load openmpi/nvhpc-21.11/4.1.2/64
>> $ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd petsc
>>
>> $ ./configure --with-debugging=1 --with-scalar-type=complex
>> PETSC_ARCH=openmpi-power
>>
>> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
>>
>> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check
>>
>> "make check" fails with a segmentation fault when running ex19. The
>> Fortran test ex5f passes.
>>
>> The procedure above fails on x86_64 and POWER both running RHEL8. It also
>> fails using nvhpc 20.7.
>>
>> The procedure above works for "real" instead of "complex".
>>
>> A "hello world" MPI code using a complex data type works with our nvhpc
>> modules.
>>
>> The procedure above works when GCC and an Open MPI library built with
>> GCC are used.
>>
>> The only trouble is the combination of PETSc with nvhpc and complex. Any
>> known issues?
>>
>> The build log for the procedure above is here:
>> https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log
>>
>> Jon
>>
>