hypre preconditioners
Klaij, Christiaan
C.Klaij at marin.nl
Thu Jul 16 01:47:27 CDT 2009
Lisandro,
Thanks for your response! The velocity problem is segregated (I use BICG with Jacobi for the 3 linear systems) but these need (many) fewer iterations than the pressure problem. The pressure matrix changes at each solve. Also, I did try ML and, like you say, it needs about twice as many iterations as BoomerAMG. Overall, BoomerAMG is a bit faster than ML for my cases.
Chris
-----Original Message-----
Date: Wed, 15 Jul 2009 13:23:19 -0300
From: Lisandro Dalcin <dalcinl at gmail.com>
Subject: Re: hypre preconditioners
To: PETSc users list <petsc-users at mcs.anl.gov>
Message-ID:
<e7ba66e40907150923naa1acf2he4f79d4e173b29c8 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Did you try Block-Jacobi for the velocity problem? If the matrix of
your pressure problem changes in each solve (is this your case?) could
you try to use ML? In my limited experience, ML leads to lower setup
times, but higher iteration counts (let's say twice); perhaps it will be
faster than BoomerAMG for your use case.
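
The trade-off described here can be put into a small cost model. What follows is only a sketch: it assumes the preconditioner is rebuilt before every solve (as it has to be when the matrix changes each time) and that every Krylov iteration costs the same. All timings are hypothetical placeholders, not measurements from this thread, and the function and option names are made up for illustration.

```python
def solve_time(setup_time, iterations, time_per_iteration):
    """Modelled wall time of one solve when the preconditioner is rebuilt first."""
    return setup_time + iterations * time_per_iteration


def cheaper_option(options):
    """Return the name of the option with the lowest modelled solve time."""
    return min(options, key=lambda name: solve_time(*options[name]))


# Hypothetical numbers in seconds: (setup_time, iterations, time_per_iteration).
# "amg_a" has an expensive setup but needs few iterations; "amg_b" has a cheap
# setup but needs twice as many iterations at the same cost per iteration.
options = {
    "amg_a": (1.5, 2, 0.25),
    "amg_b": (0.5, 4, 0.25),
}

for name, params in options.items():
    print(name, solve_time(*params))
print("cheaper:", cheaper_option(options))
```

With these placeholder values the cheap-setup option wins although it iterates twice as often. Once an iteration costs more than 0.5 s in this model the ranking flips, so the comparison has to be measured for each case.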
On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan<C.Klaij at marin.nl> wrote:
> Barry,
>
> Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners.
>
> Chris
>
> -----------------------------
> --- Jacobi preconditioner ---
> -----------------------------
>
> KSP Object:
>  type: cg
>  maximum iterations=500
>  tolerances:  relative=0.05, absolute=1e-50, divergence=10000
>  left preconditioning
> PC Object:
>  type: jacobi
>  linear system matrix = precond matrix:
>  Matrix Object:
>    type=mpiaij, rows=256576, cols=256576
>    total: nonzeros=1769552, allocated nonzeros=1769552
>      not using I-node (on process 0) routines
>
> ************************************************************************************************************************
> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
> ************************************************************************************************************************
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:22:04 2009
> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
>
>                         Max       Max/Min        Avg      Total
> Time (sec):           6.037e+02      1.00000   6.037e+02
> Objects:              9.270e+02      1.00000   9.270e+02
> Flops:                5.671e+10      1.00065   5.669e+10  1.134e+11
> Flops/sec:            9.393e+07      1.00065   9.390e+07  1.878e+08
> MPI Messages:         1.780e+04      1.00000   1.780e+04  3.561e+04
> MPI Message Lengths:  5.239e+08      1.00000   2.943e+04  1.048e+09
> MPI Reductions:       2.651e+04      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                            e.g., VecAXPY() for real vectors of length N --> 2N flops
>                            and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 6.0374e+02 100.0%  1.1338e+11 100.0%  3.561e+04 100.0%  2.943e+04      100.0%  5.302e+04 100.0%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on interpreting output.
> Phase summary info:
>   Count: number of times phase was executed
>   Time and Flops/sec: Max - maximum over all processors
>                       Ratio - ratio of maximum to minimum over all processors
>   Mess: number of messages sent
>   Avg. len: average message length
>   Reduct: number of global reductions
>   Global: entire computation
>   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>      %T - percent time in this phase         %F - percent flops in this phase
>      %M - percent messages in this phase     %L - percent message lengths in this phase
>      %R - percent reductions in this phase
>   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
> ------------------------------------------------------------------------------------------------------------------------
>
>
>      ##########################################################
>      #                                                        #
>      #                          WARNING!!!                    #
>      #                                                        #
>      #   This code was run without the PreLoadBegin()         #
>      #   macros. To get timing results we always recommend    #
>      #   preloading. otherwise timing numbers may be          #
>      #   meaningless.                                         #
>      ##########################################################
>
>
> Event                Count      Time (sec)     Flops/sec                         --- Global ---  --- Stage ---   Total
>                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecDot             31370 1.0 1.2887e+01 1.0 6.28e+08 1.0 0.0e+00 0.0e+00 3.1e+04  2 14  0  0 59   2 14  0  0 59  1249
> VecNorm            16235 1.0 2.3343e+00 1.0 1.79e+09 1.0 0.0e+00 0.0e+00 1.6e+04  0  7  0  0 31   0  7  0  0 31  3569
> VecCopy             1600 1.0 9.4822e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet              3732 1.0 8.7824e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY            32836 1.0 1.9510e+01 1.0 4.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00  3 15  0  0  0   3 15  0  0  0   864
> VecAYPX            16701 1.0 7.4898e+00 1.0 5.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  8  0  0  0   1  8  0  0  0  1144
> VecAssemblyBegin    1200 1.0 3.3916e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03  0  0  0  0  7   0  0  0  0  7     0
> VecAssemblyEnd      1200 1.0 1.6778e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecPointwiseMult   18301 1.0 1.4524e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 0.0e+00  2  4  0  0  0   2  4  0  0  0   323
> VecScatterBegin    17801 1.0 5.8999e-01 1.0 0.00e+00 0.0 3.6e+04 2.9e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> VecScatterEnd      17801 1.0 3.3189e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetup             600 1.0 6.7541e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve             600 1.0 1.6520e+02 1.0 3.43e+08 1.0 3.6e+04 2.9e+04 4.8e+04 27100100100 90  27100100100 90   686
> PCSetUp              600 1.0 4.4189e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> PCApply            18301 1.0 1.4579e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 1.0e+00  2  4  0  0  0   2  4  0  0  0   322
> MatMult            16235 1.0 9.3444e+01 1.0 2.86e+08 1.0 3.2e+04 2.9e+04 0.0e+00 15 47 91 91  0  15 47 91 91  0   570
> MatMultTranspose    1566 1.0 8.8825e+00 1.0 3.12e+08 1.0 3.1e+03 2.9e+04 0.0e+00  1  5  9  9  0   1  5  9  9  0   624
> MatAssemblyBegin     600 1.0 6.0139e-0125.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03  0  0  0  0  2   0  0  0  0  2     0
> MatAssemblyEnd       600 1.0 2.5127e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02  0  0  0  0  1   0  0  0  0  1     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>           Index Set     4              4      30272     0
>                 Vec   913            902  926180816     0
>         Vec Scatter     2              0          0     0
>       Krylov Solver     1              0          0     0
>      Preconditioner     1              0          0     0
>              Matrix     6              0          0     0
> ========================================================================================================================
> Average time to get PetscTime(): 2.14577e-07
> Average time for MPI_Barrier(): 8.10623e-07
> Average time for zero size MPI_Send(): 2.0504e-05
>
>
>
> -----------------------------------
> --- Hypre Euclid preconditioner ---
> -----------------------------------
>
> KSP Object:
>  type: cg
>  maximum iterations=500
>  tolerances:  relative=0.05, absolute=1e-50, divergence=10000
>  left preconditioning
> PC Object:
>  type: hypre
>    HYPRE Euclid preconditioning
>    HYPRE Euclid: number of levels 1
>  linear system matrix = precond matrix:
>  Matrix Object:
>    type=mpiaij, rows=256576, cols=256576
>    total: nonzeros=1769552, allocated nonzeros=1769552
>      not using I-node (on process 0) routines
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:10:05 2009
> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
>
>                         Max       Max/Min        Avg      Total
> Time (sec):           6.961e+02      1.00000   6.961e+02
> Objects:              1.227e+03      1.00000   1.227e+03
> Flops:                1.340e+10      1.00073   1.340e+10  2.679e+10
> Flops/sec:            1.925e+07      1.00073   1.924e+07  3.848e+07
> MPI Messages:         4.748e+03      1.00000   4.748e+03  9.496e+03
> MPI Message Lengths:  1.397e+08      1.00000   2.943e+04  2.794e+08
> MPI Reductions:       7.192e+03      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                            e.g., VecAXPY() for real vectors of length N --> 2N flops
>                            and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 6.9614e+02 100.0%  2.6790e+10 100.0%  9.496e+03 100.0%  2.943e+04      100.0%  1.438e+04 100.0%
>
> Event                Count      Time (sec)     Flops/sec                         --- Global ---  --- Stage ---   Total
>                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecDot              5410 1.0 1.1865e+01 4.5 5.26e+08 4.5 0.0e+00 0.0e+00 5.4e+03  1 10  0  0 38   1 10  0  0 38   234
> VecNorm             3255 1.0 7.8095e-01 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 3.3e+03  0  6  0  0 23   0  6  0  0 23  2139
> VecCopy             1600 1.0 9.5096e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet              4746 1.0 8.9868e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY             6801 1.0 4.8778e+00 1.0 3.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 13  0  0  0   1 13  0  0  0   715
> VecAYPX             3646 1.0 2.2348e+00 1.0 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0   837
> VecAssemblyBegin    1200 1.0 2.7152e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03  0  0  0  0 25   0  0  0  0 25     0
> VecAssemblyEnd      1200 1.0 1.7414e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecPointwiseMult    3982 1.0 4.0871e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0   250
> VecScatterBegin     4746 1.0 1.8000e-01 1.0 0.00e+00 0.0 9.5e+03 2.9e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> VecScatterEnd       4746 1.0 4.6870e+00 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetup             600 1.0 6.8991e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve             600 1.0 2.5931e+02 1.0 5.17e+07 1.0 9.5e+03 2.9e+04 9.0e+03 37100100100 62  37100100100 62   103
> PCSetUp              600 1.0 1.8337e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 26  0  0  0  1  26  0  0  0  1     0
> PCApply             5246 1.0 3.6440e+01 1.3 1.88e+07 1.3 0.0e+00 0.0e+00 1.0e+02  5  4  0  0  1   5  4  0  0  1    28
> MatMult             3255 1.0 2.3031e+01 1.2 2.85e+08 1.2 6.5e+03 2.9e+04 0.0e+00  3 40 69 69  0   3 40 69 69  0   464
> MatMultTranspose    1491 1.0 8.4907e+00 1.0 3.11e+08 1.0 3.0e+03 2.9e+04 0.0e+00  1 20 31 31  0   1 20 31 31  0   621
> MatConvert           100 1.0 1.2686e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> MatAssemblyBegin     600 1.0 2.3702e+0042.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03  0  0  0  0  8   0  0  0  0  8     0
> MatAssemblyEnd       600 1.0 2.5303e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02  0  0  0  0  4   0  0  0  0  4     0
> MatGetRow        12828800 1.0 5.2074e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> MatGetRowIJ          200 1.0 1.6284e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>           Index Set     4              4      30272     0
>                 Vec  1213           1202  1234223216     0
>         Vec Scatter     2              0          0     0
>       Krylov Solver     1              0          0     0
>      Preconditioner     1              0          0     0
>              Matrix     6              0          0     0
> ========================================================================================================================
> Average time to get PetscTime(): 2.14577e-07
> Average time for MPI_Barrier(): 3.8147e-07
> Average time for zero size MPI_Send(): 1.39475e-05
>
>
>
>
> --------------------------------------
> --- Hypre BoomerAMG preconditioner ---
> --------------------------------------
>
> KSP Object:
>  type: cg
>  maximum iterations=500
>  tolerances:  relative=0.05, absolute=1e-50, divergence=10000
>  left preconditioning
> PC Object:
>  type: hypre
>    HYPRE BoomerAMG preconditioning
>    HYPRE BoomerAMG: Cycle type V
>    HYPRE BoomerAMG: Maximum number of levels 25
>    HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>    HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
>    HYPRE BoomerAMG: Threshold for strong coupling 0.25
>    HYPRE BoomerAMG: Interpolation truncation factor 0
>    HYPRE BoomerAMG: Interpolation: max elements per row 0
>    HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>    HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>    HYPRE BoomerAMG: Maximum row sums 0.9
>    HYPRE BoomerAMG: Sweeps down         1
>    HYPRE BoomerAMG: Sweeps up           1
>    HYPRE BoomerAMG: Sweeps on coarse    1
>    HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
>    HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
>    HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
>    HYPRE BoomerAMG: Relax weight  (all)      1
>    HYPRE BoomerAMG: Outer relax weight (all) 1
>    HYPRE BoomerAMG: Using CF-relaxation
>    HYPRE BoomerAMG: Measure type        local
>    HYPRE BoomerAMG: Coarsen type        Falgout
>    HYPRE BoomerAMG: Interpolation type  classical
>  linear system matrix = precond matrix:
>  Matrix Object:
>    type=mpiaij, rows=256576, cols=256576
>    total: nonzeros=1769552, allocated nonzeros=1769552
>      not using I-node (on process 0) routines
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 09:53:07 2009
> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
>
>                         Max       Max/Min        Avg      Total
> Time (sec):           7.080e+02      1.00000   7.080e+02
> Objects:              1.227e+03      1.00000   1.227e+03
> Flops:                1.054e+10      1.00076   1.054e+10  2.107e+10
> Flops/sec:            1.489e+07      1.00076   1.488e+07  2.977e+07
> MPI Messages:         3.857e+03      1.00000   3.857e+03  7.714e+03
> MPI Message Lengths:  1.135e+08      1.00000   2.942e+04  2.270e+08
> MPI Reductions:       5.800e+03      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                            e.g., VecAXPY() for real vectors of length N --> 2N flops
>                            and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 7.0799e+02 100.0%  2.1075e+10 100.0%  7.714e+03 100.0%  2.942e+04      100.0%  1.160e+04 100.0%
>
> Event                Count      Time (sec)     Flops/sec                         --- Global ---  --- Stage ---   Total
>                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecDot              3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 0.0e+00 3.6e+03  0  9  0  0 31   0  9  0  0 31  1001
> VecNorm             2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 0.0e+00 2.3e+03  0  6  0  0 20   0  6  0  0 20  1781
> VecCopy             1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet              3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY             4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 12  0  0  0   1 12  0  0  0   674
> VecAYPX             2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0   774
> VecAssemblyBegin    1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03  0  0  0  0 31   0  0  0  0 31     0
> VecAssemblyEnd      1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecPointwiseMult    4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  5  0  0  0   1  5  0  0  0   252
> VecScatterBegin     3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 2.9e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> VecScatterEnd       3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetup             600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve             600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 2.9e+04 6.2e+03 38100100100 53  38100100100 53    77
> PCSetUp              600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 23  0  0  0  2  23  0  0  0  2     0
> PCApply             4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 1.0e+02 10  5  0  0  1  10  5  0  0  1    14
> MatMult             2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 2.9e+04 0.0e+00  2 36 60 60  0   2 36 60 60  0   557
> MatMultTranspose    1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 2.9e+04 0.0e+00  1 26 40 40  0   1 26 40 40  0   626
> MatConvert           100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> MatAssemblyBegin     600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03  0  0  0  0 10   0  0  0  0 10     0
> MatAssemblyEnd       600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02  0  0  0  0  5   0  0  0  0  5     0
> MatGetRow        12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> MatGetRowIJ          200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>           Index Set     4              4      30272     0
>                 Vec  1213           1202  1234223216     0
>         Vec Scatter     2              0          0     0
>       Krylov Solver     1              0          0     0
>      Preconditioner     1              0          0     0
>              Matrix     6              0          0     0
> ========================================================================================================================
> Average time to get PetscTime(): 1.90735e-07
> Average time for MPI_Barrier(): 8.10623e-07
> Average time for zero size MPI_Send(): 1.95503e-05
> OptionTable: -log_summary
>
>
>
>
> -----Original Message-----
> Date: Tue, 14 Jul 2009 10:42:58 -0500
> From: Barry Smith <bsmith at mcs.anl.gov>
> Subject: Re: hypre preconditioners
> To: PETSc users list <petsc-users at mcs.anl.gov>
> Message-ID: <DC1E3E8F-1D2D-4256-A1EE-14BA81EAEC67 at mcs.anl.gov>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
>
>    First run the three cases with -log_summary (also -ksp_view to see
> exact solver options that are being used) and send those files. This
> will tell us where the time is being spent; without this information
> any comments are pure speculation. (For example, the "copy" time to
> hypre format is trivial compared to the time to build a hypre
> preconditioner and not the problem).
>
>
>    What you report is not uncommon; the setup and per iteration cost
> of the hypre preconditioners will be much larger than the simpler
> Jacobi preconditioner.
>
>    Barry
>
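
To put numbers on this, the PCSetUp and PCApply rows of the three -log_summary tables quoted above can be reduced to two ratios. This is only a sketch: the timings are copied from those tables, while the dictionary layout and function names are made up for illustration. The PCApply average is taken over every preconditioner application in the run, not only those of the pressure solve, so it is indicative at best.

```python
# Values copied from the -log_summary output quoted in this thread:
# total run time, PCSetUp time, PCApply time (seconds) and PCApply count.
runs = {
    "jacobi":    {"total": 603.7, "pc_setup": 4.4189, "pc_apply": 14.579, "applies": 18301},
    "euclid":    {"total": 696.1, "pc_setup": 183.37, "pc_apply": 36.440, "applies": 5246},
    "boomeramg": {"total": 708.0, "pc_setup": 166.30, "pc_apply": 73.735, "applies": 4355},
}


def setup_share(run):
    """Fraction of the total run time spent in PCSetUp."""
    return run["pc_setup"] / run["total"]


def mean_apply_time(run):
    """Average seconds per PCApply call over the whole run."""
    return run["pc_apply"] / run["applies"]


for name, run in runs.items():
    print(f"{name:10s} setup share {100 * setup_share(run):5.1f}%  "
          f"mean PCApply {1e3 * mean_apply_time(run):6.2f} ms")
```

The setup shares round to the 1, 26 and 23 listed in the %T column of the PCSetUp rows, and the mean cost per application grows from Jacobi to Euclid to BoomerAMG.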
> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote:
>
>>
>> I'm solving the steady incompressible Navier-Stokes equations
>> (discretized with FV on unstructured grids) using the SIMPLE
>> Pressure Correction method. I'm using Picard linearization and solve
>> the system for the momentum equations with BICG and for the pressure
>> equation with CG. Currently, for parallel runs, I'm using JACOBI as
>> a preconditioner. My grids typically have a few million cells and I
>> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux
>> cluster). A significant portion of the CPU time goes into solving
>> the pressure equation. To reach the relative tolerance I need, CG
>> with JACOBI takes about 100 iterations per outer loop for these
>> problems.
>>
>> In order to reduce CPU time, I've compiled PETSc with support for
>> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a
>> preconditioner for the pressure equation. With default settings,
>> both BoomerAMG and Euclid greatly reduce the number of iterations:
>> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10.
>> However, I do not get any reduction in CPU time. With Euclid, CPU
>> time is similar to JACOBI and with BoomerAMG it is approximately
>> doubled.
>>
>> Is this what one can expect? Are BoomerAMG and Euclid meant for much
>> larger problems? I understand Hypre uses a different matrix storage
>> format, is CPU time 'lost in translation' between PETSc and Hypre
>> for these small problems? Are there maybe any settings I should
>> change?
>>
>> Chris
>>
>> dr. ir. Christiaan Klaij
>> CFD Researcher
>> Research & Development
>> MARIN
>> 2, Haagsteeg
>> c.klaij at marin.nl
>> P.O. Box 28
>> T +31 317 49 39 11
>> 6700 AA  Wageningen
>> F +31 317 49 32 45
>> T  +31 317 49 33 44
>> The Netherlands
>> I  www.marin.nl
>>
>
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594