[petsc-dev] [GPU] Performance of ex19

Barry Smith bsmith at mcs.anl.gov
Tue Aug 31 14:38:37 CDT 2010


On Aug 31, 2010, at 3:36 PM, Matthew Knepley wrote:

> On Tue, Aug 31, 2010 at 7:17 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
> On Aug 31, 2010, at 3:14 PM, Keita Teranishi wrote:
> 
>> Does this PETSc use timers from CUDA?
> 
>    No, didn't know there were timers in CUDA. 
> 
> Yes, I use them when I really want to know how well I am utilizing the board, vs. how
> much improvement overall I can expect in the code. When compared with PETSc timers,
> they can give us an idea of the transfer overhead, which I do in my GPU FEM code.

   We have essentially no transfer in this example. It takes zero percent of the time.

   Barry

> 
>    Matt
>  
>    We actually want to use the real-world (wall-clock) timers because each method is actually a call on the CPU, so real-world time is what matters.
> 
>    Barry
> 
>>  
>> ================================
>>  Keita Teranishi
>>  Scientific Library Group
>>  Cray, Inc.
>>  keita at cray.com
>> ================================
>>  
>> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Barry Smith
>> Sent: Tuesday, August 31, 2010 2:03 PM
>> To: For users of the development version of PETSc
>> Subject: Re: [petsc-dev] [GPU] Performance of ex19
>>  
>>  
>>   Your MatMult is now slower. Are your results reproducible? If you run 5 times, how similar are they?
>>  
>>    Barry
>>  
>> On Aug 31, 2010, at 2:57 PM, Keita Teranishi wrote:
>> 
>> 
>> VecDot                 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecMDot             2024 1.0 1.1560e+00 1.0 2.54e+09 1.0 0.0e+00 0.0e+00 0.0e+00 18 29  0  0  0  32 29  0  0  0  2201
>> VecNorm             2096 1.0 3.5999e-01 1.0 1.68e+08 1.0 0.0e+00 0.0e+00 0.0e+00  6  2  0  0  0  10  2  0  0  0   466
>> VecScale            2092 1.0 2.1599e-01 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00  3  1  0  0  0   6  1  0  0  0   387
>> VecCopy             2072 1.0 5.5997e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   2  0  0  0  0     0
>> VecSet                70 1.0 8.0004e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecAXPY              108 1.0 2.7999e-02 1.0 8.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0   309
>> VecWAXPY              68 1.0 7.9999e-03 1.0 2.72e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   340
>> VecMAXPY            2092 1.0 5.8399e-01 1.0 2.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00  9 31  0  0  0  16 31  0  0  0  4634
>> VecScatterBegin        5 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecReduceArith         2 1.0 3.9999e-03 1.0 1.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    40
>> VecReduceComm          1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecCUDACopyTo         10 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecCUDACopyFrom        5 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> SNESSolve              1 1.0 3.6119e+00 1.0 8.87e+09 1.0 0.0e+00 0.0e+00 0.0e+00 56100  0  0  0 100100  0  0  0  2456
>> SNESLineSearch         2 1.0 4.0002e-03 1.0 5.49e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1374
>> SNESFunctionEval       3 1.0 4.0002e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   630
>> SNESJacobianEval       2 1.0 3.1199e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00  5  0  0  0  0   9  0  0  0  0   123
>> KSPGMRESOrthog      2024 1.0 1.7120e+00 1.0 5.09e+09 1.0 0.0e+00 0.0e+00 0.0e+00 26 57  0  0  0  47 57  0  0  0  2972
>> KSPSetup               2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> KSPSolve               2 1.0 3.2919e+00 1.0 8.83e+09 1.0 0.0e+00 0.0e+00 0.0e+00 51 99  0  0  0  91 99  0  0  0  2681
>> PCSetUp                2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> PCApply             2024 1.0 4.7998e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
>> MatMult             2092 1.0 8.9998e-01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 37  0  0  0  25 37  0  0  0  3689
>> MatAssemblyBegin       2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatAssemblyEnd         2 1.0 1.2000e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatZeroEntries         2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatFDColorApply        2 1.0 3.1199e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00  5  0  0  0  0   9  0  0  0  0   123
>> MatFDColorFunc        42 1.0 7.9999e-03 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4410
>>  
>>  
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
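
As a rough cross-check of the figures discussed in this thread, the sketch below recomputes two quantities from the log rows quoted above: the flop rate of MatMult and SNESSolve, and the share of SNESSolve time spent in the two VecCUDACopy events. It assumes the usual PETSc log column layout (event, count, ratio, max time in seconds, ratio, max flops, ..., Mflop/s in the last column). The dictionary and the helper names `mflops` and `share_of` are illustrative only and are not part of PETSc. The time and flop values are the rounded ones printed in the log, so recomputed rates may differ from the last column by a few Mflop/s on other rows.

```python
# Values copied from the log rows quoted in this thread.
# Assumed column layout: event, count, ratio, max time (s), ratio, max flops, ...
rows = {
    "MatMult":         {"time_s": 8.9998e-01, "flops": 3.32e+09},
    "SNESSolve":       {"time_s": 3.6119e+00, "flops": 8.87e+09},
    "VecCUDACopyTo":   {"time_s": 3.9999e-03, "flops": 0.0},
    "VecCUDACopyFrom": {"time_s": 4.0002e-03, "flops": 0.0},
}


def mflops(row):
    """Flop rate in Mflop/s; 0.0 when no time was recorded for the event."""
    if row["time_s"] <= 0.0:
        return 0.0
    return row["flops"] / row["time_s"] / 1e6


def share_of(event, total, table=rows):
    """Time of `event` as a percentage of the time of `total`."""
    return 100.0 * table[event]["time_s"] / table[total]["time_s"]


matmult_rate = mflops(rows["MatMult"])
snes_rate = mflops(rows["SNESSolve"])

# Host/device copies, relative to the whole nonlinear solve.
copy_share = (share_of("VecCUDACopyTo", "SNESSolve")
              + share_of("VecCUDACopyFrom", "SNESSolve"))

print(f"MatMult:   {matmult_rate:.0f} Mflop/s")
print(f"SNESSolve: {snes_rate:.0f} Mflop/s")
print(f"CUDA copies: {copy_share:.2f}% of SNESSolve time")
```

The copy share comes out well under one percent of the solve, which is consistent with the remark above that transfers take essentially none of the time in this example.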
