hypre preconditioners
Klaij, Christiaan
C.Klaij at marin.nl
Wed Jul 15 03:58:36 CDT 2009
Barry,
Thanks for your reply! Below is the output of KSPView and -log_summary for the three cases. Indeed, PCSetUp takes much more time with the hypre preconditioners.
Chris
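For quick reference, the key timings from the three -log_summary outputs below (seconds, maximum over the two processes) are:

                     Jacobi    Euclid    BoomerAMG
  Total run time      603.7     696.1       708.0
  KSPSolve            165.2     259.3       271.9
  PCSetUp               4.4     183.4       166.3
  PCApply              14.6      36.4        73.7
  MatConvert             -       12.7        13.0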
-----------------------------
--- Jacobi preconditioner ---
-----------------------------
KSP Object:
type: cg
maximum iterations=500
tolerances: relative=0.05, absolute=1e-50, divergence=10000
left preconditioning
PC Object:
type: jacobi
linear system matrix = precond matrix:
Matrix Object:
type=mpiaij, rows=256576, cols=256576
total: nonzeros=1769552, allocated nonzeros=1769552
not using I-node (on process 0) routines
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:22:04 2009
Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
Max Max/Min Avg Total
Time (sec): 6.037e+02 1.00000 6.037e+02
Objects: 9.270e+02 1.00000 9.270e+02
Flops: 5.671e+10 1.00065 5.669e+10 1.134e+11
Flops/sec: 9.393e+07 1.00065 9.390e+07 1.878e+08
MPI Messages: 1.780e+04 1.00000 1.780e+04 3.561e+04
MPI Message Lengths: 5.239e+08 1.00000 2.943e+04 1.048e+09
MPI Reductions: 2.651e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 6.0374e+02 100.0% 1.1338e+11 100.0% 3.561e+04 100.0% 2.943e+04 100.0% 5.302e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 31370 1.0 1.2887e+01 1.0 6.28e+08 1.0 0.0e+00 0.0e+00 3.1e+04 2 14 0 0 59 2 14 0 0 59 1249
VecNorm 16235 1.0 2.3343e+00 1.0 1.79e+09 1.0 0.0e+00 0.0e+00 1.6e+04 0 7 0 0 31 0 7 0 0 31 3569
VecCopy 1600 1.0 9.4822e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 3732 1.0 8.7824e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 32836 1.0 1.9510e+01 1.0 4.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 15 0 0 0 3 15 0 0 0 864
VecAYPX 16701 1.0 7.4898e+00 1.0 5.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 8 0 0 0 1 8 0 0 0 1144
VecAssemblyBegin 1200 1.0 3.3916e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 7 0 0 0 0 7 0
VecAssemblyEnd 1200 1.0 1.6778e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 18301 1.0 1.4524e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 4 0 0 0 2 4 0 0 0 323
VecScatterBegin 17801 1.0 5.8999e-01 1.0 0.00e+00 0.0 3.6e+04 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0
VecScatterEnd 17801 1.0 3.3189e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetup 600 1.0 6.7541e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 600 1.0 1.6520e+02 1.0 3.43e+08 1.0 3.6e+04 2.9e+04 4.8e+04 27100100100 90 27100100100 90 686
PCSetUp 600 1.0 4.4189e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
PCApply 18301 1.0 1.4579e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 1.0e+00 2 4 0 0 0 2 4 0 0 0 322
MatMult 16235 1.0 9.3444e+01 1.0 2.86e+08 1.0 3.2e+04 2.9e+04 0.0e+00 15 47 91 91 0 15 47 91 91 0 570
MatMultTranspose 1566 1.0 8.8825e+00 1.0 3.12e+08 1.0 3.1e+03 2.9e+04 0.0e+00 1 5 9 9 0 1 5 9 9 0 624
MatAssemblyBegin 600 1.0 6.0139e-01 25.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 2 0 0 0 0 2 0
MatAssemblyEnd 600 1.0 2.5127e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 1 0 0 0 0 1 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
--- Event Stage 0: Main Stage
Index Set 4 4 30272 0
Vec 913 902 926180816 0
Vec Scatter 2 0 0 0
Krylov Solver 1 0 0 0
Preconditioner 1 0 0 0
Matrix 6 0 0 0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 8.10623e-07
Average time for zero size MPI_Send(): 2.0504e-05
-----------------------------------
--- Hypre Euclid preconditioner ---
-----------------------------------
KSP Object:
type: cg
maximum iterations=500
tolerances: relative=0.05, absolute=1e-50, divergence=10000
left preconditioning
PC Object:
type: hypre
HYPRE Euclid preconditioning
HYPRE Euclid: number of levels 1
linear system matrix = precond matrix:
Matrix Object:
type=mpiaij, rows=256576, cols=256576
total: nonzeros=1769552, allocated nonzeros=1769552
not using I-node (on process 0) routines
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:10:05 2009
Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
Max Max/Min Avg Total
Time (sec): 6.961e+02 1.00000 6.961e+02
Objects: 1.227e+03 1.00000 1.227e+03
Flops: 1.340e+10 1.00073 1.340e+10 2.679e+10
Flops/sec: 1.925e+07 1.00073 1.924e+07 3.848e+07
MPI Messages: 4.748e+03 1.00000 4.748e+03 9.496e+03
MPI Message Lengths: 1.397e+08 1.00000 2.943e+04 2.794e+08
MPI Reductions: 7.192e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 6.9614e+02 100.0% 2.6790e+10 100.0% 9.496e+03 100.0% 2.943e+04 100.0% 1.438e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 5410 1.0 1.1865e+01 4.5 5.26e+08 4.5 0.0e+00 0.0e+00 5.4e+03 1 10 0 0 38 1 10 0 0 38 234
VecNorm 3255 1.0 7.8095e-01 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 3.3e+03 0 6 0 0 23 0 6 0 0 23 2139
VecCopy 1600 1.0 9.5096e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 4746 1.0 8.9868e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 6801 1.0 4.8778e+00 1.0 3.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 13 0 0 0 1 13 0 0 0 715
VecAYPX 3646 1.0 2.2348e+00 1.0 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 837
VecAssemblyBegin 1200 1.0 2.7152e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 25 0 0 0 0 25 0
VecAssemblyEnd 1200 1.0 1.7414e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 3982 1.0 4.0871e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 250
VecScatterBegin 4746 1.0 1.8000e-01 1.0 0.00e+00 0.0 9.5e+03 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0
VecScatterEnd 4746 1.0 4.6870e+00 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetup 600 1.0 6.8991e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 600 1.0 2.5931e+02 1.0 5.17e+07 1.0 9.5e+03 2.9e+04 9.0e+03 37100100100 62 37100100100 62 103
PCSetUp 600 1.0 1.8337e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 26 0 0 0 1 26 0 0 0 1 0
PCApply 5246 1.0 3.6440e+01 1.3 1.88e+07 1.3 0.0e+00 0.0e+00 1.0e+02 5 4 0 0 1 5 4 0 0 1 28
MatMult 3255 1.0 2.3031e+01 1.2 2.85e+08 1.2 6.5e+03 2.9e+04 0.0e+00 3 40 69 69 0 3 40 69 69 0 464
MatMultTranspose 1491 1.0 8.4907e+00 1.0 3.11e+08 1.0 3.0e+03 2.9e+04 0.0e+00 1 20 31 31 0 1 20 31 31 0 621
MatConvert 100 1.0 1.2686e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatAssemblyBegin 600 1.0 2.3702e+00 42.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 8 0 0 0 0 8 0
MatAssemblyEnd 600 1.0 2.5303e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 4 0 0 0 0 4 0
MatGetRow 12828800 1.0 5.2074e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatGetRowIJ 200 1.0 1.6284e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
--- Event Stage 0: Main Stage
Index Set 4 4 30272 0
Vec 1213 1202 1234223216 0
Vec Scatter 2 0 0 0
Krylov Solver 1 0 0 0
Preconditioner 1 0 0 0
Matrix 6 0 0 0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 3.8147e-07
Average time for zero size MPI_Send(): 1.39475e-05
--------------------------------------
--- Hypre BoomerAMG preconditioner ---
--------------------------------------
KSP Object:
type: cg
maximum iterations=500
tolerances: relative=0.05, absolute=1e-50, divergence=10000
left preconditioning
PC Object:
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Matrix Object:
type=mpiaij, rows=256576, cols=256576
total: nonzeros=1769552, allocated nonzeros=1769552
not using I-node (on process 0) routines
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 09:53:07 2009
Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
Max Max/Min Avg Total
Time (sec): 7.080e+02 1.00000 7.080e+02
Objects: 1.227e+03 1.00000 1.227e+03
Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10
Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07
MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03
MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08
MPI Reductions: 5.800e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e+03 100.0% 2.942e+04 100.0% 1.160e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001
VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781
VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674
VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774
VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0
VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252
VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0
VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77
PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0
PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14
MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557
MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626
MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatAssemblyBegin 600 1.0 2.4579e+00 96.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0
MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0
MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
--- Event Stage 0: Main Stage
Index Set 4 4 30272 0
Vec 1213 1202 1234223216 0
Vec Scatter 2 0 0 0
Krylov Solver 1 0 0 0
Preconditioner 1 0 0 0
Matrix 6 0 0 0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 8.10623e-07
Average time for zero size MPI_Send(): 1.95503e-05
OptionTable: -log_summary
-----Original Message-----
Date: Tue, 14 Jul 2009 10:42:58 -0500
From: Barry Smith <bsmith at mcs.anl.gov>
Subject: Re: hypre preconditioners
To: PETSc users list <petsc-users at mcs.anl.gov>
First run the three cases with -log_summary (also -ksp_view to see the
exact solver options that are being used) and send those files. This
will tell us where the time is being spent; without this information
any comments are pure speculation. (For example, the "copy" time to the
hypre format is trivial compared to the time to build a hypre
preconditioner and is not the problem.)
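For example, assuming an MPI launcher named mpiexec (the actual fresco invocation is not shown in this thread), one run per preconditioner along the lines of

  mpiexec -n 2 ./fresco -ksp_view -log_summary

produces output like the summaries included earlier in this message.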
What you report is not uncommon; the setup and per-iteration cost
of the hypre preconditioners will be much larger than that of the
simpler Jacobi preconditioner.
Barry
On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote:
>
> I'm solving the steady incompressible Navier-Stokes equations
> (discretized with FV on unstructured grids) using the SIMPLE
> Pressure Correction method. I'm using Picard linearization, solving
> the momentum equations with BICG and the pressure equation with CG.
> Currently, for parallel runs, I'm using JACOBI as a preconditioner.
> My grids typically have a few million cells and I
> a preconditioner. My grids typically have a few million cells and I
> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux
> cluster). A significant portion of the CPU time goes into solving
> the pressure equation. To reach the relative tolerance I need, CG
> with JACOBI takes about 100 iterations per outer loop for these
> problems.
>
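The pressure solve described above can be reproduced in outline with the sketch below: CG with the tolerances shown in the KSPView output earlier, Jacobi as the default preconditioner, and the choice overridable at run time. This is only an illustration, not the fresco code: the 1-D Laplacian stand-in, the size n = 1000, and the name pressure_demo are invented for the example, and the calls follow a recent PETSc release rather than the 2.3.3 API used in this thread (where, e.g., KSPSetOperators and the destroy routines have different signatures).

/* pressure_demo.c -- minimal sketch of a CG pressure solve with a
 * runtime-selectable preconditioner.  Run, for example:
 *   mpiexec -n 2 ./pressure_demo -ksp_view -log_view -pc_type jacobi
 *   mpiexec -n 2 ./pressure_demo -pc_type hypre -pc_hypre_type boomeramg
 * (-log_view is the current name for -log_summary.) */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PC       pc;
  PetscInt i, istart, iend, n = 1000;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Assemble a 1-D Laplacian as a stand-in for the pressure matrix */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &istart, &iend));
  for (i = istart; i < iend; i++) {
    if (i > 0)     PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0));

  /* CG with the tolerances shown in the KSPView output above */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPCG));
  PetscCall(KSPSetTolerances(ksp, 0.05, 1e-50, 10000, 500));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCJACOBI));  /* default; -pc_type hypre overrides */
  PetscCall(KSPSetFromOptions(ksp));   /* picks up runtime options */

  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}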
> In order to reduce CPU time, I've compiled PETSc with support for
> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a
> preconditioner for the pressure equation. With default settings,
> both BoomerAMG and Euclid greatly reduce the number of iterations:
> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10.
> However, I do not get any reduction in CPU time. With Euclid, CPU
> time is similar to JACOBI and with BoomerAMG it is approximately
> doubled.
>
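With PETSc's hypre interface, the switch described above is typically made at run time, e.g.

  -pc_type hypre -pc_hypre_type boomeramg
  -pc_type hypre -pc_hypre_type euclid

or, in code, with PCSetType(pc, PCHYPRE) followed by PCHYPRESetType().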
> Is this what one can expect? Are BoomerAMG and Euclid meant for much
> larger problems? I understand Hypre uses a different matrix storage
> format; is CPU time 'lost in translation' between PETSc and Hypre
> for these small problems? Are there maybe any settings I should
> change?
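On the last question: the BoomerAMG parameters reported by KSPView earlier in this message (strong threshold 0.25, Falgout coarsening, at most 25 levels, symmetric-SOR/Jacobi relaxation) can all be changed from the command line. The option names below are those of the PETSc BoomerAMG interface in recent releases; running with -help lists the exact set for a given install:

  -pc_hypre_boomeramg_strong_threshold <0.25>
  -pc_hypre_boomeramg_coarsen_type <Falgout>
  -pc_hypre_boomeramg_max_levels <25>
  -pc_hypre_boomeramg_relax_type_all <symmetric-SOR/Jacobi>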
>
> Chris
>
> dr. ir. Christiaan Klaij
> CFD Researcher
> Research & Development
> MARIN
> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
> c.klaij at marin.nl
> T +31 317 49 39 11 / +31 317 49 33 44
> F +31 317 49 32 45
> I www.marin.nl