[petsc-dev] Should v->valid_GPU_array be a bitmask?

Sun Oct 13 12:27:38 CDT 2019

  Yikes, forget about bit flags and names. 

  Does this behavior make sense? EVERY CUDA vector allocates memory on both the GPU and the CPU? Or do I misunderstand the code?

   This seems fundamentally wrong and is different than before. What about the dozens of work vectors on the GPU (for example for Krylov methods)? There is no reason for them to have memory allocated on the CPU.  In the long run pretty much all the matrices and vectors will only reside on the GPU so this seems like a step backwards. Does libaxb do this? 

   Barry
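
A minimal sketch of the alternative implied above, in which the host buffer is allocated only when someone first asks for it. Everything here is hypothetical: `LazyVec` and its functions are illustrative names, not PETSc's `Vec` or any PETSc API, and plain `calloc` stands in for device allocation so the sketch runs without a GPU (a real implementation would use the CUDA runtime, e.g. `cudaMalloc`).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vector with separate device and host buffers; not PETSc's Vec. */
typedef struct {
  size_t  n;
  double *device; /* stand-in for GPU memory (assumption: calloc replaces cudaMalloc) */
  double *host;   /* stays NULL until host access is first requested */
} LazyVec;

/* Allocate and zero only the "device" buffer. Returns 0 on success. */
static inline int lazyvec_create(LazyVec *v, size_t n)
{
  assert(v != NULL);
  v->n      = n;
  v->host   = NULL;
  v->device = (double *)calloc(n ? n : 1, sizeof(double));
  return v->device ? 0 : 1;
}

/* Allocate the host buffer on first use; returns NULL if allocation fails. */
static inline double *lazyvec_host(LazyVec *v)
{
  assert(v != NULL);
  if (!v->host) v->host = (double *)calloc(v->n ? v->n : 1, sizeof(double));
  return v->host;
}

/* Release whichever buffers were actually allocated. */
static inline void lazyvec_destroy(LazyVec *v)
{
  assert(v != NULL);
  free(v->device);
  free(v->host);
  v->device = NULL;
  v->host   = NULL;
  v->n      = 0;
}
```

With this layout a work vector that is only ever touched on the device never pays for a host allocation; the cost appears only for vectors whose host array is actually requested.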

> On Oct 1, 2019, at 10:24 PM, Zhang, Junchao via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
> 
> Stefano recently modified the following code:
> 
> 
> PetscErrorCode VecCreate_SeqCUDA(Vec V)
> {
>   PetscErrorCode ierr;
>
>   PetscFunctionBegin;
>   ierr = PetscLayoutSetUp(V->map);CHKERRQ(ierr);
>   ierr = VecCUDAAllocateCheck(V);CHKERRQ(ierr);
>   ierr = VecCreate_SeqCUDA_Private(V,((Vec_CUDA*)V->spptr)->GPUarray_allocated);CHKERRQ(ierr);
>   ierr = VecCUDAAllocateCheckHost(V);CHKERRQ(ierr);
>   ierr = VecSet(V,0.0);CHKERRQ(ierr);
>   ierr = VecSet_Seq(V,0.0);CHKERRQ(ierr);
>   V->valid_GPU_array = PETSC_OFFLOAD_BOTH;
>   PetscFunctionReturn(0);
> }
>
> That means if one creates an SEQCUDA vector V and then immediately tests if (V->valid_GPU_array == PETSC_OFFLOAD_GPU), the test will fail. That is counterintuitive. I think we should have
>
> enum {PETSC_OFFLOAD_UNALLOCATED=0x0,PETSC_OFFLOAD_GPU=0x1,PETSC_OFFLOAD_CPU=0x2,PETSC_OFFLOAD_BOTH=0x3}
>
> and then use if (V->valid_GPU_array & PETSC_OFFLOAD_GPU). What do you think?
> 
> 
> 
> --Junchao Zhang
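
The bitmask proposal in the quoted message can be sketched in standalone C as follows. The `OFFLOAD_*` names and helper functions are hypothetical stand-ins that mirror the values proposed above; they are not the actual PETSc definitions. The point is that an equality test against the GPU flag fails when both copies are valid, while a bitwise AND succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proposed enum; values mirror the proposal. */
typedef enum {
  OFFLOAD_UNALLOCATED = 0x0,
  OFFLOAD_GPU         = 0x1,
  OFFLOAD_CPU         = 0x2,
  OFFLOAD_BOTH        = 0x3 /* equals OFFLOAD_GPU | OFFLOAD_CPU */
} OffloadMask;

/* True whenever the GPU copy is valid, whether or not the CPU copy is too. */
static inline bool offload_valid_on_gpu(OffloadMask m)
{
  return (m & OFFLOAD_GPU) != 0;
}

/* True whenever the CPU copy is valid, whether or not the GPU copy is too. */
static inline bool offload_valid_on_cpu(OffloadMask m)
{
  return (m & OFFLOAD_CPU) != 0;
}

/* Mark one more side as valid, e.g. after copying data to it. */
static inline OffloadMask offload_add(OffloadMask m, OffloadMask side)
{
  /* Reject bits outside the two defined flags. */
  assert(((unsigned)side & ~(unsigned)OFFLOAD_BOTH) == 0);
  return (OffloadMask)((unsigned)m | (unsigned)side);
}
```

For a freshly created vector whose state is `OFFLOAD_BOTH`, `state == OFFLOAD_GPU` is false but `offload_valid_on_gpu(state)` is true, which is the behavior the proposal is after.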