[petsc-users] MemCpy (HtoD and DtoH) in Krylov solver

Mark Adams mfadams at lbl.gov
Wed Jul 17 06:36:55 CDT 2019


Also, MPI communication is done from the host, so every parallel mat-vec will do a
"CopySome" call to pull the needed values off the device, do the MPI communication,
and then, the next time you do GPU work, copy from the host to pick up the update.
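
Schematically, one parallel mat-vec follows the pattern sketched below. This is only
an illustration of that host-staged exchange, not PETSc's actual VecScatter code, and
the names, buffer layout, and single neighbor are made up to keep it short:

#include <mpi.h>
#include <cuda_runtime.h>

/* Illustration only: buffer names and sizes are invented for this sketch. */
static void matvec_halo_exchange(const double *d_x,  /* device: local part of x     */
                                 double *d_ghosts,   /* device: incoming ghost vals */
                                 double *h_sendbuf,  /* host staging buffers        */
                                 double *h_recvbuf,
                                 int n_send, int n_recv,
                                 int send_to, int recv_from, MPI_Comm comm)
{
  MPI_Request req[2];

  /* 1. "CopySome": pull the boundary entries of x off the device.
        (PETSc gathers only the indices it needs; a contiguous copy keeps
        the sketch short.) */
  cudaMemcpy(h_sendbuf, d_x, n_send * sizeof(double), cudaMemcpyDeviceToHost);

  /* 2. The MPI communication happens entirely on the host. */
  MPI_Irecv(h_recvbuf, n_recv, MPI_DOUBLE, recv_from, 0, comm, &req[0]);
  MPI_Isend(h_sendbuf, n_send, MPI_DOUBLE, send_to,   0, comm, &req[1]);
  MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

  /* 3. Push the received ghost values back to the device so the off-process
        part of the mat-vec can run on the GPU. */
  cudaMemcpy(d_ghosts, h_recvbuf, n_recv * sizeof(double), cudaMemcpyHostToDevice);
}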

On Tue, Jul 16, 2019 at 10:22 PM Matthew Knepley via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> On Tue, Jul 16, 2019 at 9:07 PM Xiangdong via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
>> Hello everyone,
>>
>> I am new to PETSc on GPUs and have a simple question.
>>
>> When I tried to solve Ax=b, where A is a MATAIJCUSPARSE matrix and b and x are
>> VECSEQCUDA vectors, with GMRES (or GCR) and PCNONE, I found that during each Krylov
>> iteration there is one MemCpy(HtoD) call and one MemCpy(DtoH) call. Does
>> that mean the Krylov solve is not 100% on the GPU and still needs some
>> work from the CPU? What are these MemCpys for during each iteration?
>>
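>> Roughly, the setup looks like this (a minimal sketch, with error checking
>> omitted and a 1-D Laplacian hard-coded just so there is something to solve):
>>
>> #include <petscksp.h>
>>
>> int main(int argc, char **argv)
>> {
>>   Mat      A;
>>   Vec      x, b;
>>   KSP      ksp;
>>   PC       pc;
>>   PetscInt i, n = 100;
>>
>>   PetscInitialize(&argc, &argv, NULL, NULL);
>>
>>   MatCreate(PETSC_COMM_SELF, &A);
>>   MatSetSizes(A, n, n, n, n);
>>   MatSetType(A, MATAIJCUSPARSE);   /* matrix data lives on the GPU  */
>>   MatSetUp(A);
>>   for (i = 0; i < n; i++) {
>>     if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
>>     if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
>>     MatSetValue(A, i, i, 2.0, INSERT_VALUES);
>>   }
>>   MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
>>   MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
>>
>>   VecCreate(PETSC_COMM_SELF, &b);
>>   VecSetSizes(b, n, n);
>>   VecSetType(b, VECSEQCUDA);       /* vector data lives on the GPU  */
>>   VecDuplicate(b, &x);
>>   VecSet(b, 1.0);
>>
>>   KSPCreate(PETSC_COMM_SELF, &ksp);
>>   KSPSetOperators(ksp, A, A);
>>   KSPSetType(ksp, KSPGMRES);       /* or KSPGCR                     */
>>   KSPGetPC(ksp, &pc);
>>   PCSetType(pc, PCNONE);
>>   KSPSolve(ksp, b, x);
>>
>>   KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); MatDestroy(&A);
>>   PetscFinalize();
>>   return 0;
>> }
>>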
>
> We have GPU experts on the list, but there is definitely communication,
> because we do not do the orthogonalization on the GPU,
> just the BLAS ops. This is a very small amount of data, so it just
> contributes latency, and I would guess that it is less than the kernel
> launch latency.
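>
> (Schematically, each reduction in that orthogonalization is something like
>
>   PetscScalar dot;            /* lives in host memory                       */
>   VecDot(v, w, &dot);         /* the scalar result comes back DtoH          */
>   VecAXPY(w, -dot, v);        /* the coefficient is fed back from the host  */
>
> where v and w stand in for the GPU Krylov vectors, so only a scalar or two
> crosses the bus per reduction.)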
>
>   Thanks,
>
>     Matt
>
>
>> Thank you.
>>
>> Best,
>> Xiangdong
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>