[petsc-dev] [GPU] Performance on Fermi
Barry Smith
bsmith at mcs.anl.gov
Fri Aug 27 15:37:25 CDT 2010
##########################################################
#                                                        #
#                          WARNING!!!                    #
#                                                        #
#   This code was compiled with a debugging option,      #
#   To get timing results run ./configure                #
#   using --with-debugging=no, the performance will      #
#   be generally two or three times faster.              #
#                                                        #
##########################################################
You need to build the code with ./configure --with-debugging=0 to make a fair comparison. This will speed up the CPU version.
Barry
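
As a rough illustration of why the debug build matters for this comparison, the sketch below takes the KSPSolve times quoted in this thread (1.6 sec on Fermi, 17 sec on one CPU core) and applies the banner's "two or three times faster" estimate to the CPU time only. Holding the GPU time fixed is an assumption made here for illustration, not something measured in the thread, and the variable and function names are hypothetical.

```python
# KSPSolve times quoted in this thread, both from a debug build of PETSc.
gpu_kspsolve_s = 1.6   # Fermi
cpu_kspsolve_s = 17.0  # 1 core Istanbul


def speedup(cpu_s, gpu_s):
    """Return how many times faster the GPU run is than the CPU run."""
    return cpu_s / gpu_s


observed = speedup(cpu_kspsolve_s, gpu_kspsolve_s)

# ASSUMPTION: the banner's "two or three times faster" applies to the CPU
# time only, and the GPU time stays at 1.6 sec in an optimized build.
projected = {
    factor: speedup(cpu_kspsolve_s / factor, gpu_kspsolve_s)
    for factor in (2, 3)
}

print(f"observed speedup (debug build): {observed:.1f}x")
for factor, value in projected.items():
    print(f"CPU {factor}x faster -> projected speedup: {value:.1f}x")
```

Under those assumptions the apparent 10.6x speedup would drop to somewhere around 3.5x to 5.3x. Only a rerun with an optimized build gives the real figure.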
On Aug 27, 2010, at 2:22 PM, Keita Teranishi wrote:
> Barry,
>
> The CPU time has one more digit than I reported: KSPSolve takes 1.6 sec on Fermi and 17 sec on 1 CPU core.
>
> Thanks,
> ================================
> Keita Teranishi
> Scientific Library Group
> Cray, Inc.
> keita at cray.com
> ================================
>
>
> -----Original Message-----
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Keita Teranishi
> Sent: Friday, August 27, 2010 2:20 PM
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] [GPU] Performance on Fermi
>
> Barry,
>
> Yes. It improves the performance dramatically, but the execution time for KSPSolve stays the same.
>
> MatMult 5.2 Gflops
>
> Thanks,
>
> ================================
> Keita Teranishi
> Scientific Library Group
> Cray, Inc.
> keita at cray.com
> ================================
>
>
> -----Original Message-----
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Barry Smith
> Sent: Friday, August 27, 2010 2:15 PM
> To: For users of the development version of PETSc
> Subject: [petsc-dev] [GPU] Performance on Fermi
>
>
> PETSc-dev folks,
>
> Please prepend all messages to petsc-dev that involve GPUs with [GPU] so they can be easily filtered.
>
> Keita,
>
> To run src/ksp/ksp/examples/tutorials/ex2.c with CUDA you need the flag -vec_type cuda.
>
> Note also that this example is fine for simple ONE processor tests, but it should not be used for parallel testing because it does not partition the problem properly for parallel performance.
>
> Barry
>
> On Aug 27, 2010, at 2:04 PM, Keita Teranishi wrote:
>
>> Hi,
>>
>> I ran ex2.c with a matrix from a 512x512 grid.
>> I set CG and Jacobi for the solver and preconditioner.
>> GCC-4.4.4 and CUDA-3.1 are used to compile the code.
>> BLAS and LAPACK are not optimized.
>>
>> MatMult
>> Fermi: 1142 MFlops
>> 1 core Istanbul: 420 MFlops
>>
>> KSPSolve:
>> Fermi: 1.5 Sec
>> 1 core Istanbul: 1.7 Sec
>>
>>
>> ================================
>> Keita Teranishi
>> Scientific Library Group
>> Cray, Inc.
>> keita at cray.com
>> ================================
>>
>>
>> -----Original Message-----
>> From: petsc-dev-bounces at mcs.anl.gov
>> [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Satish Balay
>> Sent: Friday, August 27, 2010 1:49 PM
>> To: For users of the development version of PETSc
>> Subject: Re: [petsc-dev] Problem with petsc-dev
>>
>> On Fri, 27 Aug 2010, Satish Balay wrote:
>>
>>> There was a problem with tarball creation for the past few days. Will
>>> try to respin manually today - and update you.
>>
>> the petsc-dev tarball is now updated on the website..
>>
>> Satish
>