[petsc-dev] [petsc-maint] Installing with CUDA on a cluster

Satish Balay balay at mcs.anl.gov
Sat Mar 10 10:33:26 CST 2018


Karl,

[forwarding this discussion to petsc-dev]

* My fixes should alleviate some of the CUSP installation issues. I
  don't know enough about the CUSP interface wrt its useful features vs
  other burdens - and whether it's good to drop it or not. [If needed - we
  can add more version dependencies in configure]

* Wrt CUDA - currently my test is with CUDA-7.5. I can try migrating a
  couple of tests to CUDA-9.1 [on frog]. But what about older
  releases? Any reason we should drop them? I.e. any reason to bump the
  following values?

    self.CUDAMinVersion   = '5000' # Minimal cuda version is 5.0
    self.CUSPMinVersion  = '400' # Minimal cusp version is 0.4

  We do change it for complex builds [we don't have a test for this case]

    if self.defaultScalarType.lower() == 'complex': self.CUDAMinVersion = '7050'
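
  [For context - these strings follow the CUDA_VERSION-style encoding
  (so '7050' means 7.5), and the check roughly boils down to an integer
  comparison against the version detected at configure time. A minimal
  standalone sketch of that gate - hypothetical names, not the actual
  config/packages/cuda.py code:

    # hypothetical sketch of the min-version gate - not the actual configure code
    CUDAMinVersion = '5000'   # CUDA 5.0 in CUDA_VERSION encoding

    def checkCUDAVersion(found):
      # 'found' uses the same encoding, e.g. '7050' for CUDA 7.5
      if int(found) < int(CUDAMinVersion):
        raise RuntimeError('CUDA '+found+' is older than the minimum supported '+CUDAMinVersion)
  ]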

* Our test GPU is an M2090 - with compute capability 2.0.
  CUDA-7.5 works on it. CUDA-8 gives deprecation warnings. CUDA-9 does
  not work? So what do we do for such old hardware? Do we keep
  CUDA-7.5 as the minimum supported version for an extended time? [At
  some point we could switch to a minimum of CUDA-8 - if we can get
  rid of the warnings]
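
  [If we do want to move that box to CUDA-8: if I remember right, CUDA-8's
  nvcc has a -Wno-deprecated-gpu-targets option that silences the
  deprecation warnings for old architectures - so an untested sketch
  would be to add

    -arch=sm_20 -Wno-deprecated-gpu-targets

  to the nvcc flags]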

* BTW: Wrt --with-cuda-arch, I'm hoping we can get rid of it in favor
  of CUDAFLAGS [with defaults similar to the CFLAGS defaults] - but it's
  not clear if I can easily untangle the dependencies we have [wrt CPP
  - and others]. [A sketch of what that could look like for users is
  below.]

  Or can we get rid of this default altogether [currently
  -arch=sm_20] - and expect nvcc to have sane defaults? Then we can
  probably eliminate all this complicated code. [If CUDA-7.5 and higher
  do this properly - we could use that as the minimum supported version?]
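
  For illustration - if the CUDAFLAGS route works out, the user-facing
  invocation could look something like the line below [a sketch of the
  proposed behavior, not something configure supports today], with the
  arch flag going straight to nvcc the same way CFLAGS goes to the C
  compiler:

    ./configure --with-cuda=1 CUDAFLAGS='-arch=sm_60'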

Satish

On Sat, 10 Mar 2018, Karl Rupp wrote:

> Hi all,
> 
> a couple of notes here, particularly for Manuel:
> 
>  * CUSP is repeatedly causing such installation problems, hence we will soon
> drop it as a vector backend and instead only provide a native
> CUBLAS/CUSPARSE-based backend.
> 
>  * you can use this native CUDA backend already. Just configure with only
> --with-cuda=1 --with-cuda-arch=sm_60 (sm_30 should also work and is compatible
> with the Tesla K20 GPUs you may find on other clusters).
> 
>  * The multigrid preconditioner from CUSP is selected via
>    -pc_type sa_cusp
>    Make sure you also use -vec_type cusp -mat_type aijcusp
>    If you don't need the multigrid preconditioner from CUSP, please just
>    reconfigure and use the native CUDA backend with
>    -vec_type cuda -mat_type aijcusparse
> 
>  * Right now only one of {native CUDA, CUSP, ViennaCL} can be activated at
> configure time. This will be fixed later this month.
> 
> If you're looking for a GPU-accelerated multigrid preconditioner: I just heard
> yesterday that NVIDIA's AMGX is now open source. I'll provide a wrapper within
> PETSc soon.
> 
> As Matt already said: Don't expect much more than a modest speedup over your
> existing CPU-based code - provided that your setup is GPU-friendly and your
> problem size is appropriate.
> 
> Best regards,
> Karli
> 
> 
> 
> 
> On 03/10/2018 03:38 AM, Satish Balay wrote:
> > I've updated configure so that --download-cusp gets the
> > correct/compatible cusp version - for cuda 7,8 vs 9
> > 
> > The changes are in branch balay/cuda-cusp-cleanup - and merged to next.
> > 
> > Satish
> > 
> > On Wed, 7 Mar 2018, Satish Balay wrote:
> > 
> >> --download-cusp hardly ever gets used - so it's likely broken.
> >>
> >> It needs to be updated to somehow use the correct cusp version based
> >> on the cuda version that's being used.
> >>
> >> [and since we can't easily check for cusp compatibility - we should
> >> probably remove checkCUSPVersion() code]
> >>
> >> When using CUDA-9 - you can try these options:
> >>
> >> --download-cusp=1 --download-cusp-commit=116b090
> >>
> 
> 


