[petsc-users] NVIDIA HPC SDK and complex data type

Matthew Knepley knepley at gmail.com
Sat Dec 18 19:22:33 CST 2021


On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang <junchao.zhang at gmail.com>
wrote:

> I found it is an NVIDIA C/C++ compiler bug.  I can reproduce it with
>

Great find!

  Matt


> #include <stdlib.h>
> #include <stdio.h>
> #include <complex.h>
>
> typedef double _Complex PetscScalar;
> typedef struct {
>   int         row;
>   PetscScalar *valaddr;
> } MatEntry2;
>
> int main(int argc, char **argv)
> {
>   int         i = 2;
>   MatEntry2  *Jentry2 = (MatEntry2 *)malloc(64 * sizeof(MatEntry2));
>   PetscScalar a = 1, b = 1;
>
>   printf("sizeof(MatEntry2)=%zu\n", sizeof(MatEntry2));
>   Jentry2[2].valaddr = (PetscScalar *)malloc(16 * sizeof(PetscScalar));
>   *(Jentry2[i].valaddr) = a * b; // Segfault
>
>   free(Jentry2[2].valaddr);
>   free(Jentry2);
>   return 0;
> }
>
> $ nvc -O0 -o test test.c
> $ ./test
> sizeof(MatEntry2)=16
> Segmentation fault (core dumped)
>
> If I change *(Jentry2[i].valaddr) = a*b; to
>
> PetscScalar *p = Jentry2[2].valaddr;
> *p = a*b;
>
> Then the code works fine.  Changing -O0 to -O2 also avoids this error for
> this simple test, but not for PETSc.  In PETSc, I could apply the above
> silly trick, but I am not sure it is worth it. We should instead report the
> bug to NVIDIA.
>
> Looking at the assembly code for the segfault line, we can find the problem:
>
>   movslq  52(%rsp), %rcx        // %rcx = i
>   movq    40(%rsp), %rax        // %rax = Jentry2 (base address of the array)
>   movq    8(%rax,%rcx,8), %rax  // wrongly computes Jentry2[i].valaddr as
>                                 // (%rax + %rcx*8) + 8; with a 16-byte struct
>                                 // it should be (%rax + %rcx*16) + 8
>   vmovsd  %xmm1, 8(%rax)
>   vmovsd  %xmm0, (%rax)
>
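> For reference, here is a small layout check (my own illustrative sketch, not
> part of the failing test; it reuses the same typedefs and assumes a typical
> 64-bit target) showing why the scaled index has to be 16 rather than 8:
>
> #include <stdio.h>
> #include <stddef.h>
> #include <complex.h>
>
> typedef double _Complex PetscScalar;
> typedef struct {
>   int         row;
>   PetscScalar *valaddr;
> } MatEntry2;
>
> int main(void)
> {
>   /* Element i starts at base + i*sizeof(MatEntry2) = base + i*16, and
>      valaddr sits 8 bytes into the struct (4-byte int plus 4 bytes padding). */
>   printf("sizeof(MatEntry2) = %zu\n", sizeof(MatEntry2));
>   printf("offsetof(MatEntry2, valaddr) = %zu\n", offsetof(MatEntry2, valaddr));
>   return 0;
> }
>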
> --Junchao Zhang
>
>
> On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> Hi, Jon,
>>   I could reproduce the error exactly.  I will have a look.
>>   Thanks for reporting.
>> --Junchao Zhang
>>
>>
>> On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <
>> halverson at princeton.edu> wrote:
>>
>>> Hello,
>>>
>>> We are unable to build PETSc using the NVIDIA HPC SDK and
>>> --with-scalar-type=complex. Below is our procedure:
>>>
>>> $ module load nvhpc/21.11
>>>
>>> $ module load openmpi/nvhpc-21.11/4.1.2/64
>>> $ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd
>>> petsc
>>>
>>> $ ./configure --with-debugging=1 --with-scalar-type=complex
>>> PETSC_ARCH=openmpi-power
>>>
>>> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
>>>
>>> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power
>>> check
>>>
>>> "make check" fails with a segmentation fault when running ex19. The
>>> fortran test ex5f passes.
>>>
>>> The procedure above fails on both x86_64 and POWER, both running RHEL 8.
>>> It also fails with nvhpc 20.7.
>>>
>>> The procedure above works with --with-scalar-type=real instead of complex.
>>>
>>> A "hello world" MPI code using a complex data type works with our nvhpc
>>> modules.
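>>>
>>> (For context, a minimal test of that kind might look like the sketch below;
>>> this is hypothetical and not the exact file we ran: rank 0 sends a
>>> double _Complex value to rank 1, which prints it. Run with at least two
>>> ranks, e.g. mpirun -n 2.)
>>>
>>> #include <stdio.h>
>>> #include <complex.h>
>>> #include <mpi.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>   int             rank;
>>>   double _Complex z = 0;
>>>
>>>   MPI_Init(&argc, &argv);
>>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>   if (rank == 0) {
>>>     z = 1.0 + 2.0 * I;
>>>     MPI_Send(&z, 1, MPI_C_DOUBLE_COMPLEX, 1, 0, MPI_COMM_WORLD);
>>>   } else if (rank == 1) {
>>>     MPI_Recv(&z, 1, MPI_C_DOUBLE_COMPLEX, 0, 0, MPI_COMM_WORLD,
>>>              MPI_STATUS_IGNORE);
>>>     printf("received %g + %gi\n", creal(z), cimag(z));
>>>   }
>>>   MPI_Finalize();
>>>   return 0;
>>> }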
>>>
>>> The procedure above succeeds when GCC and an Open MPI library built
>>> with GCC are used.
>>>
>>> The only trouble is the combination of PETSc with nvhpc and complex. Any
>>> known issues?
>>>
>>> The build log for the procedure above is here:
>>> https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log
>>>
>>> Jon
>>>
>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

