[petsc-users] NVIDIA HPC SDK and complex data type

Junchao Zhang junchao.zhang at gmail.com
Sun Dec 19 17:38:16 CST 2021


Since it will take a while for NVIDIA to fix the bug in their NVHPC 21.11
(December 2021) release, I added a workaround to the MR in petsc,
https://gitlab.com/petsc/petsc/-/merge_requests/4663
I tested it, and it works both with debugging (-O0) and without debugging
(-O or -O2).
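
The workaround is along the lines of the temporary-pointer trick discussed
further down in this thread; as a rough sketch of the idea (not the exact
change in the MR), the pointer member is loaded into a local variable before
writing through it:

  PetscScalar *p = Jentry2[i].valaddr;  /* load the pointer first          */
  *p = a*b;                             /* then store through the local    */

instead of *(Jentry2[i].valaddr) = a*b;, which is the pattern that triggers
the bad address computation in nvc.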

--Junchao Zhang


On Sat, Dec 18, 2021 at 7:51 PM Barry Smith <bsmith at petsc.dev> wrote:

>
>   Yes, Junchao deserves a bounty from NVIDIA for this find.
>
> On Dec 18, 2021, at 8:22 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> I found it is an NVIDIA C/C++ compiler bug.  I can reproduce it with
>>
>
> Great find!
>
>   Matt
>
>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <complex.h>
>>
>> typedef double _Complex PetscScalar;
>> typedef struct {
>>   int         row;
>>   PetscScalar *valaddr;
>> } MatEntry2;
>>
>> int main(int argc, char** argv)
>> {
>>   int i = 2;
>>   MatEntry2   *Jentry2 = (MatEntry2*)malloc(64*sizeof(MatEntry2));
>>   PetscScalar a = 1, b = 1;
>>
>>   printf("sizeof(MatEntry2)=%zu\n", sizeof(MatEntry2));
>>   Jentry2[2].valaddr = (PetscScalar*)malloc(16*sizeof(PetscScalar));
>>   *(Jentry2[i].valaddr) = a*b; // Segfault
>>
>>   free(Jentry2[2].valaddr);
>>   free(Jentry2);
>>   return 0;
>> }
>>
>> $ nvc -O0 -o test test.c
>> $ ./test
>> sizeof(MatEntry2)=16
>> Segmentation fault (core dumped)
>>
>> If I change *(Jentry2[i].valaddr) = a*b; to
>>
>> PetscScalar *p = Jentry2[2].valaddr;
>> *p = a*b;
>>
>> Then the code works fine.  Changing -O0 to -O2 also avoids the error for
>> this simple test, but not for PETSc.  In PETSc I could apply the silly
>> trick above, but I am not sure it is worth it; we should instead report
>> the bug to NVIDIA.
>>
>> Looking at the assembly code for the segfaulting line, we can see the
>> problem:
>>
>>   movslq  52(%rsp), %rcx
>>   movq  40(%rsp), %rax
>>   movq  8(%rax,%rcx,8), %rax
>>   vmovsd  %xmm1, 8(%rax)
>>   vmovsd  %xmm0, (%rax)
>>
>> Here %rax holds the base address of Jentry2 and %rcx holds i.  The third
>> instruction wrongly computes the address of Jentry2[i].valaddr as
>> (%rax + %rcx*8) + 8; since sizeof(MatEntry2) is 16, it should be
>> (%rax + %rcx*16) + 8.
>>
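>> For concreteness, here is a minimal standalone check of the expected byte
>> offsets (an illustrative sketch using the same types as the reproducer
>> above):
>>
>> #include <stdio.h>
>> #include <stddef.h>
>>
>> typedef double _Complex PetscScalar;
>> typedef struct {
>>   int         row;
>>   PetscScalar *valaddr;
>> } MatEntry2;
>>
>> int main(void)
>> {
>>   size_t i = 2;
>>   /* Correct offset of Jentry2[i].valaddr from the array base:
>>      i*sizeof(MatEntry2) + offsetof(MatEntry2, valaddr) = 2*16 + 8 = 40 */
>>   size_t correct = i*sizeof(MatEntry2) + offsetof(MatEntry2, valaddr);
>>   /* Offset the emitted code computes (element stride 8 instead of 16):
>>      i*8 + 8 = 24, which lands inside Jentry2[1] */
>>   size_t wrong = i*8 + 8;
>>   printf("correct offset = %zu, emitted offset = %zu\n", correct, wrong);
>>   return 0;
>> }
>>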
>> --Junchao Zhang
>>
>>
>> On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <junchao.zhang at gmail.com>
>> wrote:
>>
>>> Hi, Jon,
>>>   I could reproduce the error exactly.  I will have a look.
>>>   Thanks for reporting.
>>> --Junchao Zhang
>>>
>>>
>>> On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <
>>> halverson at princeton.edu> wrote:
>>>
>>>> Hello,
>>>>
>>>> We are unable to build PETSc using the NVIDIA HPC SDK and
>>>> --with-scalar-type=complex. Below is our procedure:
>>>>
>>>> $ module load nvhpc/21.11
>>>> $ module load openmpi/nvhpc-21.11/4.1.2/64
>>>> $ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd petsc
>>>> $ ./configure --with-debugging=1 --with-scalar-type=complex PETSC_ARCH=openmpi-power
>>>> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
>>>> $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check
>>>>
>>>> "make check" fails with a segmentation fault when running ex19. The
>>>> fortran test ex5f passes.
>>>>
>>>> The procedure above fails on x86_64 and POWER, both running RHEL8. It
>>>> also fails with nvhpc 20.7.
>>>>
>>>> The procedure above works for "real" instead of "complex".
>>>>
>>>> A "hello world" MPI code using a complex data type works with our nvhpc
>>>> modules.
>>>>
>>>> The procedure above works when GCC and an Open MPI library built with
>>>> GCC are used.
>>>>
>>>> The only trouble is the combination of PETSc with nvhpc and complex.
>>>> Any known issues?
>>>>
>>>> The build log for the procedure above is here:
>>>>
>>>> https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log
>>>>
>>>> Jon
>>>>
>>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
>
>

