[petsc-dev] Should v->valid_GPU_array be a bitmask?

Smith, Barry F. bsmith at mcs.anl.gov
Sun Oct 13 19:50:19 CDT 2019


  I'd like to see both of them allocated on demand.
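
  As a rough illustration of what "allocated on demand" could look like, here is a minimal sketch in plain C. It is not PETSc code: LazyVec and the lazyvec_* functions are hypothetical names, and the device buffer is simulated with calloc so the sketch runs without CUDA (a real version would presumably call cudaMalloc there). Neither buffer exists until the first access that needs it.

```c
#include <stdlib.h>

/* Hypothetical toy vector, not PETSc's Vec. */
typedef struct {
  size_t  n;
  double *host;   /* NULL until the first host access */
  double *device; /* NULL until the first device access (simulated with host memory) */
} LazyVec;

/* Record the size only; allocate nothing. */
static void lazyvec_init(LazyVec *v, size_t n)
{
  v->n      = n;
  v->host   = NULL;
  v->device = NULL;
}

/* Allocate the host buffer on first use, then keep returning the same pointer. */
static double *lazyvec_get_host(LazyVec *v)
{
  if (!v->host) v->host = calloc(v->n, sizeof(double));
  return v->host;
}

/* Same pattern for the device buffer. calloc stands in for a device
   allocation so that the sketch compiles and runs without CUDA. */
static double *lazyvec_get_device(LazyVec *v)
{
  if (!v->device) v->device = calloc(v->n, sizeof(double));
  return v->device;
}

/* free(NULL) is a no-op, so buffers that were never requested cost nothing here. */
static void lazyvec_destroy(LazyVec *v)
{
  free(v->host);
  free(v->device);
  v->host   = NULL;
  v->device = NULL;
}
```

  With this pattern a work vector that is only ever touched through lazyvec_get_device never gets a host buffer at all.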



> On Oct 13, 2019, at 6:56 PM, Zhang, Junchao <jczhang at mcs.anl.gov> wrote:
> 
> I had an MR (already merged to master) that changed the name to v->offloadmask.
> But the behavior is not changed. VecCreate_SeqCUDA still allocates on both CPU and GPU. I believe we should allocate on CPU on-demand for VecCUDA.
> 
> --Junchao Zhang
> 
> 
> On Sun, Oct 13, 2019 at 12:27 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> 
>   Yikes, forget about bit flags and names. 
> 
>   Does this behavior make sense? EVERY CUDA vector allocates memory on both GPU and CPU? Or do I misunderstand the code?
> 
>    This seems fundamentally wrong and is different than before. What about the dozens of work vectors on the GPU (for example for Krylov methods)? There is no reason for them to have memory allocated on the CPU.  In the long run pretty much all the matrices and vectors will only reside on the GPU so this seems like a step backwards. Does libaxb do this? 
> 
> 
>    Barry
> 
> 
> 
> 
> 
> > On Oct 1, 2019, at 10:24 PM, Zhang, Junchao via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
> > 
> > Stefano recently modified the following code:
> > 
> > 
> > PetscErrorCode VecCreate_SeqCUDA(Vec V)
> > {
> >   PetscErrorCode ierr;
> > 
> >   PetscFunctionBegin;
> >   ierr = PetscLayoutSetUp(V->map);CHKERRQ(ierr);
> >   ierr = VecCUDAAllocateCheck(V);CHKERRQ(ierr);
> >   ierr = VecCreate_SeqCUDA_Private(V,((Vec_CUDA*)V->spptr)->GPUarray_allocated);CHKERRQ(ierr);
> >   ierr = VecCUDAAllocateCheckHost(V);CHKERRQ(ierr);
> >   ierr = VecSet(V,0.0);CHKERRQ(ierr);
> >   ierr = VecSet_Seq(V,0.0);CHKERRQ(ierr);
> >   V->valid_GPU_array = PETSC_OFFLOAD_BOTH;
> >   PetscFunctionReturn(0);
> > }
> > 
> > 
> > 
> > 
> > That means if one creates a SEQCUDA vector V and then immediately tests if (V->valid_GPU_array == PETSC_OFFLOAD_GPU), the test will fail. That is counterintuitive. I think we should have
> > 
> > enum {PETSC_OFFLOAD_UNALLOCATED=0x0,PETSC_OFFLOAD_GPU=0x1,PETSC_OFFLOAD_CPU=0x2,PETSC_OFFLOAD_BOTH=0x3}
> > 
> > and then use if (V->valid_GPU_array & PETSC_OFFLOAD_GPU). What do you think?
> > 
> > --Junchao Zhang
> 
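
The difference between the equality test and the bitmask test proposed in the quoted message can be sketched in standalone C. The enum values are the ones suggested in the thread; the OFFLOAD_* names, the OffloadMask type, and the helper functions are hypothetical stand-ins, not PETSc's API.

```c
/* Hypothetical stand-in for the proposed enum; values as suggested in the thread. */
typedef enum {
  OFFLOAD_UNALLOCATED = 0x0,
  OFFLOAD_GPU         = 0x1,
  OFFLOAD_CPU         = 0x2,
  OFFLOAD_BOTH        = 0x3  /* OFFLOAD_GPU | OFFLOAD_CPU */
} OffloadMask;

/* Equality test: true only when the GPU copy is the sole valid copy. */
static int is_gpu_only(OffloadMask m)
{
  return m == OFFLOAD_GPU;
}

/* Bitmask test: true whenever the GPU copy is valid,
   regardless of the state of the CPU copy. */
static int has_valid_gpu(OffloadMask m)
{
  return (m & OFFLOAD_GPU) != 0;
}

/* The same test for the CPU bit. */
static int has_valid_cpu(OffloadMask m)
{
  return (m & OFFLOAD_CPU) != 0;
}
```

For a freshly created vector whose state is OFFLOAD_BOTH, is_gpu_only returns 0 while has_valid_gpu returns 1, which is the behavior the bitmask proposal is after.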


