[petsc-users] NVIDIA HPC SDK and complex data type

Jonathan D. Halverson halverson at Princeton.EDU
Mon Dec 20 08:19:47 CST 2021


Hi Junchao,

Thank you very much for your quick work. The simple build procedure now works.

Jon
________________________________
From: Junchao Zhang <junchao.zhang at gmail.com>
Sent: Sunday, December 19, 2021 6:38 PM
To: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Cc: Jonathan D. Halverson <halverson at Princeton.EDU>
Subject: Re: [petsc-users] NVIDIA HPC SDK and complex data type

Since it will take a while for NVIDIA to fix the bug in NVHPC 21.11 (December 2021), I added a workaround to PETSc in this MR: https://gitlab.com/petsc/petsc/-/merge_requests/4663
I tested it, and it works both with debugging (-O0) and without debugging (-O or -O2).

--Junchao Zhang


On Sat, Dec 18, 2021 at 7:51 PM Barry Smith <bsmith at petsc.dev> wrote:

  Yes, Junchao deserves a bounty from NVIDIA for this find.

On Dec 18, 2021, at 8:22 PM, Matthew Knepley <knepley at gmail.com> wrote:

On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang <junchao.zhang at gmail.com> wrote:
I found that it is an NVIDIA C/C++ compiler bug.  I can reproduce it with

Great find!

  Matt

#include <stdlib.h>
#include <stdio.h>
#include <complex.h>

typedef double _Complex PetscScalar;
typedef struct {
  int row;
  PetscScalar *valaddr;
} MatEntry2;

int main(int argc, char** argv)
{
  int i=2;
  MatEntry2 *Jentry2 = (MatEntry2*)malloc(64*sizeof(MatEntry2));
  PetscScalar a=1, b=1;

  printf("sizeof(MatEntry2)=%zu\n",sizeof(MatEntry2));
  Jentry2[2].valaddr = (PetscScalar*)malloc(16*sizeof(PetscScalar));
  *(Jentry2[i].valaddr) = a*b; // Segfault

  free(Jentry2[2].valaddr);
  free(Jentry2);
  return 0;
}

$ nvc -O0 -o test test.c
$ ./test
sizeof(MatEntry2)=16
Segmentation fault (core dumped)

If I change *(Jentry2[i].valaddr) = a*b; to

PetscScalar *p = Jentry2[2].valaddr;
*p = a*b;

Then the code works fine.  Compiling with -O2 instead of -O0 also avoids the error for this simple test, but not for PETSc.  In PETSc, I could apply the above silly trick, but I am not sure it is worth it. We should instead report it to NVIDIA.
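
For reference, the temporary-pointer trick above can be sketched as a small standalone function. This is only an illustration of the change described in this message, not the code from the PETSc merge request; the function name store_via_temporary is hypothetical, and the literal index 2 is kept as in the snippet above because the thread does not say whether indexing with the variable i in the temporary load also avoids the bug.

```c
#include <complex.h>
#include <stdlib.h>

typedef double _Complex PetscScalar;
typedef struct {
  int row;
  PetscScalar *valaddr;
} MatEntry2;

/* Hypothetical helper: stores a*b through a temporary pointer instead of
   writing through *(Jentry2[i].valaddr) directly.
   Returns the real part of the stored value, or -1.0 if allocation fails. */
double store_via_temporary(void)
{
  PetscScalar a = 1, b = 1;
  double result = -1.0;

  MatEntry2 *Jentry2 = (MatEntry2*)malloc(64 * sizeof(MatEntry2));
  if (!Jentry2) return result;

  Jentry2[2].valaddr = (PetscScalar*)malloc(16 * sizeof(PetscScalar));
  if (Jentry2[2].valaddr) {
    /* Load the member pointer into a local first, then store through it. */
    PetscScalar *p = Jentry2[2].valaddr;
    *p = a * b;
    result = (double)*p; /* complex-to-real conversion keeps the real part */
    free(Jentry2[2].valaddr);
  }
  free(Jentry2);
  return result;
}
```

Calling store_via_temporary() from main and printing the result should show 1 with a correct compiler; the point of the sketch is only that the store no longer goes through the combined index-and-dereference expression.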

Looking at the assembly code for the segfaulting line, we can see the problem:
  movslq  52(%rsp), %rcx
  movq  40(%rsp), %rax
  movq  8(%rax,%rcx,8), %rax   //  Here %rax = Jentry2 (the array base) and %rcx = i.  The instruction wrongly calculates the address of Jentry2[i].valaddr as (%rax + %rcx*8)+8; it should instead be (%rax + %rcx*16)+8, since sizeof(MatEntry2) is 16
  vmovsd  %xmm1, 8(%rax)
  vmovsd  %xmm0, (%rax)
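
The address arithmetic can be checked with a short sketch. It assumes a typical LP64 ABI (8-byte pointers with 8-byte alignment), which matches the sizeof(MatEntry2)=16 printed by the test; the function names expected_offset and emitted_offset are hypothetical and exist only for this illustration.

```c
#include <complex.h>
#include <stddef.h>

typedef double _Complex PetscScalar;
typedef struct {
  int row;
  PetscScalar *valaddr;
} MatEntry2;

/* Byte offset of Jentry2[i].valaddr from the array base, as the language
   requires: i * sizeof(element) + offset of the member. */
static size_t expected_offset(size_t i)
{
  return i * sizeof(MatEntry2) + offsetof(MatEntry2, valaddr);
}

/* Byte offset implied by the emitted operand 8(%rax,%rcx,8):
   base + i*8 + 8, i.e. the index is scaled by 8 instead of 16. */
static size_t emitted_offset(size_t i)
{
  return i * 8 + 8;
}
```

Under that assumption, for i = 2 the correct offset is 2*16 + 8 = 40 bytes, while the emitted instruction uses 2*8 + 8 = 24 bytes. Offset 24 is the valaddr slot of Jentry2[1], which the test never initializes, so the following vmovsd stores go through an indeterminate pointer.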

--Junchao Zhang


On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang <junchao.zhang at gmail.com> wrote:
Hi, Jon,
  I could reproduce the error exactly.  I will have a look.
  Thanks for reporting.
--Junchao Zhang


On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson <halverson at princeton.edu> wrote:
Hello,

We are unable to build PETSc using the NVIDIA HPC SDK and --with-scalar-type=complex. Below is our procedure:

$ module load nvhpc/21.11
$ module load openmpi/nvhpc-21.11/4.1.2/64
$ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd petsc
$ ./configure --with-debugging=1 --with-scalar-type=complex PETSC_ARCH=openmpi-power
$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check

"make check" fails with a segmentation fault when running ex19. The Fortran test ex5f passes.

The procedure above fails on x86_64 and POWER both running RHEL8. It also fails using nvhpc 20.7.

The procedure above works for "real" instead of "complex".

A "hello world" MPI code using a complex data type works with our nvhpc modules.

The procedure above works when GCC and an Open MPI library built with GCC are used.

The only trouble is the combination of PETSc with nvhpc and complex. Any known issues?

The build log for the procedure above is here:
https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log

Jon


--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
