[petsc-users] Scaling/Preconditioners for Poisson equation
Matthew Knepley
knepley at gmail.com
Mon Sep 29 14:39:56 CDT 2014
On Mon, Sep 29, 2014 at 9:36 AM, Filippo Leonardi <
filippo.leonardi at sam.math.ethz.ch> wrote:
> Thank you.
>
> Actually, I had the feeling that the problem with BJacobi and CG wasn't on my side.
>
> So I'll stick to MG. The problem with MG is that there are a lot of parameters
> to be tuned, so I leave the defaults (except that I select CG as the Krylov
> method). I post only results for 64^3 and 128^3. Tell me if I'm missing some
> useful detail. (I get similar results with BoomerAMG.)
>
1) I assume we are looking at ProjStage?
2) Why are you doing a different number of solves on the different number
of processes?
Matt
> Time per KSPSolve (-ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg):
> 32^3 and 1 proc: 1.01e-01
> 64^3 and 8 procs: 6.56e-01
> 128^3 and 64 procs: 1.05e+00
> Number of PCApply calls (i.e., iterations) per KSPSolve:
> 15
> 39
> 65
>
> With BoomerAMG:
> a stable 8 iterations per KSPSolve, but the time is greater than with PETSc MG
> and still increases:
> 64^3: 3.17e+00
> 128^3: 9.99e+00
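
(For reference, the setup under discussion boils down to roughly the following sketch. It is written against a recent PETSc rather than the 3.3 release used in the logs, so enum names and callback signatures differ slightly, and ComputeRHS/ComputeMatrix are placeholder names for the user-side routines -- ComputeMatrix being the standard 7-point assembly sketched further down the thread. All solver choices come from the runtime options, e.g. -ksp_type cg -pc_type mg -pc_mg_galerkin.)

#include <petscksp.h>
#include <petscdmda.h>

/* Placeholder right-hand side; a real code would put its divergence term here. */
static PetscErrorCode ComputeRHS(KSP ksp, Vec b, void *ctx)
{
  VecSet(b, 1.0);
  return 0;
}

/* Standard 7-point Laplacian assembly (see the sketch further down the thread). */
extern PetscErrorCode ComputeMatrix(KSP, Mat, Mat, void *);

int main(int argc, char **argv)
{
  KSP ksp;
  DM  da;

  PetscInitialize(&argc, &argv, NULL, NULL);
  /* 3D structured grid, 1 dof per node, star stencil; size overridable with -da_grid_x etc. */
  DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
               DMDA_STENCIL_STAR, 128, 128, 128, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
               1, 1, NULL, NULL, NULL, &da);
  DMSetFromOptions(da);
  DMSetUp(da);

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetDM(ksp, da);                          /* lets PCMG build the grid hierarchy from the DMDA */
  KSPSetComputeRHS(ksp, ComputeRHS, NULL);
  KSPSetComputeOperators(ksp, ComputeMatrix, NULL);
  KSPSetFromOptions(ksp);                     /* picks up -ksp_type cg -pc_type mg -pc_mg_galerkin ... */
  KSPSolve(ksp, NULL, NULL);

  KSPDestroy(&ksp);
  DMDestroy(&da);
  PetscFinalize();
  return 0;
}

(Run with the options listed above, plus -da_grid_x/-da_grid_y/-da_grid_z or -da_refine to set the resolution, and -pc_mg_levels to control the hierarchy depth.)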
>
>
> --> For instance with 64^3 (256 iterations):
>
> Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012
>
> Max Max/Min Avg Total
> Time (sec): 1.896e+02 1.00000 1.896e+02
> Objects: 7.220e+03 1.00000 7.220e+03
> Flops: 3.127e+10 1.00000 3.127e+10 2.502e+11
> Flops/sec: 1.649e+08 1.00000 1.649e+08 1.319e+09
> MPI Messages: 9.509e+04 1.00316 9.483e+04 7.586e+05
> MPI Message Lengths: 1.735e+09 1.09967 1.685e+04 1.278e+10
> MPI Reductions: 4.781e+04 1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                           e.g., VecAXPY() for real vectors of length N --> 2N flops
>                           and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 1.3416e-02   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  1:       StepStage: 8.7909e-01   0.5%  1.8119e+09   0.7%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  2:       ConvStage: 1.7172e+01   9.1%  9.2610e+09   3.7%  1.843e+05  24.3%  3.981e+03       23.6%  0.000e+00   0.0%
>  3:       ProjStage: 1.6804e+02  88.6%  2.3813e+11  95.2%  5.703e+05  75.2%  1.232e+04       73.1%  4.627e+04  96.8%
>  4:         IoStage: 1.5814e+00   0.8%  0.0000e+00   0.0%  1.420e+03   0.2%  4.993e+02        3.0%  2.500e+02   0.5%
>  5:       SolvAlloc: 2.5722e-01   0.1%  0.0000e+00   0.0%  2.560e+02   0.0%  1.054e+00        0.0%  3.330e+02   0.7%
>  6:       SolvSolve: 1.6776e+00   0.9%  9.5345e+08   0.4%  2.280e+03   0.3%  4.924e+01        0.3%  9.540e+02   2.0%
>  7:       SolvDeall: 7.4017e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
>
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
>
> --- Event Stage 1: StepStage
>
> VecAXPY 3072 1.0 8.8295e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 1 0 0 0 99100 0 0 0 2052
>
> --- Event Stage 2: ConvStage
>
> VecCopy 4608 1.0 1.6016e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 9 0 0 0 0 0
> VecAXPY 4608 1.0 1.2212e+00 1.2 3.02e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 1 1 0 0 0 6 26 0 0 0 1978
> VecAXPBYCZ 5376 1.0 2.5875e+00 1.1 7.05e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 1 2 0 0 0 15 61 0 0 0 2179
> VecPointwiseMult 4608 1.0 1.4411e+00 1.0 1.51e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 8 13 0 0 0 838
> VecScatterBegin 7680 1.0 3.4130e+00 1.0 0.00e+00 0.0 1.8e+05 1.6e+04
> 0.0e+00 2 0 24 24 0 20 0100100 0 0
> VecScatterEnd 7680 1.0 9.3412e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 5 0 0 0 0 0
>
> --- Event Stage 3: ProjStage
>
> VecMDot 2560 1.0 2.1944e+00 1.1 9.23e+08 1.0 0.0e+00 0.0e+00
> 2.6e+03 1 3 0 0 5 1 3 0 0 6 3364
> VecTDot 19924 1.0 2.7283e+00 1.3 1.31e+09 1.0 0.0e+00 0.0e+00
> 2.0e+04 1 4 0 0 42 1 4 0 0 43 3829
> VecNorm 13034 1.0 1.5385e+00 2.0 8.54e+08 1.0 0.0e+00 0.0e+00
> 1.3e+04 1 3 0 0 27 1 3 0 0 28 4442
> VecScale 13034 1.0 9.0783e-01 1.3 4.27e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 1 0 0 0 0 1 0 0 0 3764
> VecCopy 21972 1.0 3.5136e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
> VecSet 21460 1.0 1.3108e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> VecAXPY 41384 1.0 5.9866e+00 1.1 2.71e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 3 9 0 0 0 3 9 0 0 0 3624
> VecAYPX 30142 1.0 5.3362e+00 1.0 1.64e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 3 5 0 0 0 3 6 0 0 0 2460
> VecMAXPY 2816 1.0 1.8561e+00 1.0 1.09e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 1 3 0 0 0 1 4 0 0 0 4700
> VecScatterBegin 23764 1.0 1.7138e+00 1.1 0.00e+00 0.0 5.7e+05 1.6e+04
> 0.0e+00 1 0 75 73 0 1 0100100 0 0
> VecScatterEnd 23764 1.0 3.1986e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> VecNormalize 2816 1.0 2.9511e-01 1.1 2.77e+08 1.0 0.0e+00 0.0e+00
> 2.8e+03 0 1 0 0 6 0 1 0 0 6 7504
> MatMult 22740 1.0 4.6896e+01 1.0 1.04e+10 1.0 5.5e+05 1.6e+04
> 0.0e+00 25 33 72 70 0 28 35 96 96 0 1780
> MatSOR 23252 1.0 9.5250e+01 1.0 1.04e+10 1.0 0.0e+00 0.0e+00
> 0.0e+00 50 33 0 0 0 56 35 0 0 0 872
> KSPGMRESOrthog 2560 1.0 3.6142e+00 1.1 1.85e+09 1.0 0.0e+00 0.0e+00
> 2.6e+03 2 6 0 0 5 2 6 0 0 6 4085
> KSPSetUp 768 1.0 7.9389e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 5.6e+03 0 0 0 0 12 0 0 0 0 12 0
> KSPSolve 256 1.0 1.6661e+02 1.0 2.97e+10 1.0 5.5e+05 1.6e+04
> 4.6e+04 88 95 72 70 97 99100 96 96100 1427
> PCSetUp 256 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.5e+03 0 0 0 0 3 0 0 0 0 3 0
> PCApply 10218 1.0 1.3642e+02 1.0 2.12e+10 1.0 3.1e+05 1.6e+04
> 1.3e+04 72 68 40 39 27 81 71 54 54 28 1245
>
> --- Event Stage 4: IoStage
>
> VecView 50 1.0 8.8377e-0138.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.0e+02 0 0 0 0 0 29 0 0 0 40 0
> VecCopy 50 1.0 8.9977e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecScatterBegin 30 1.0 1.0644e-02 1.6 0.00e+00 0.0 7.2e+02 1.6e+04
> 0.0e+00 0 0 0 0 0 1 0 51 3 0 0
> VecScatterEnd 30 1.0 2.4857e-01109.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 8 0 0 0 0 0
>
> --- Event Stage 5: SolvAlloc
>
> VecSet 50 1.0 1.9324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 7 0 0 0 0 0
> MatAssemblyBegin 4 1.0 5.0378e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00 0 0 0 0 0 1 0 0 0 2 0
> MatAssemblyEnd 4 1.0 1.5030e-02 1.0 0.00e+00 0.0 9.6e+01 4.1e+03
> 1.6e+01 0 0 0 0 0 6 0 38 49 5 0
>
> --- Event Stage 6: SolvSolve
>
> VecMDot 10 1.0 8.9154e-03 1.1 3.60e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01 0 0 0 0 0 0 3 0 0 1 3234
> VecTDot 80 1.0 1.1104e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00
> 8.0e+01 0 0 0 0 0 1 4 0 0 8 3777
> VecNorm 820 1.0 2.6904e-01 1.6 3.41e+06 1.0 0.0e+00 0.0e+00
> 8.2e+02 0 0 0 0 2 13 3 0 0 86 101
> VecScale 52 1.0 3.6066e-03 1.2 1.70e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 1 0 0 0 3780
> VecCopy 91 1.0 1.4363e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecSet 86 1.0 5.1112e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 169 1.0 2.4659e-02 1.1 1.11e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 9 0 0 0 3593
> VecAYPX 121 1.0 2.2017e-02 1.1 6.59e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 6 0 0 0 2393
> VecMAXPY 11 1.0 7.2782e-03 1.0 4.26e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 4 0 0 0 4682
> VecScatterBegin 95 1.0 7.3617e-03 1.1 0.00e+00 0.0 2.3e+03 1.6e+04
> 0.0e+00 0 0 0 0 0 0 0100100 0 0
> VecScatterEnd 95 1.0 1.3788e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecNormalize 11 1.0 1.2109e-03 1.1 1.08e+06 1.0 0.0e+00 0.0e+00
> 1.1e+01 0 0 0 0 0 0 1 0 0 1 7144
> MatMult 91 1.0 1.9398e-01 1.0 4.17e+07 1.0 2.2e+03 1.6e+04
> 0.0e+00 0 0 0 0 0 11 35 96 96 0 1722
> MatSOR 93 1.0 3.8194e-01 1.0 4.16e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 23 35 0 0 0 870
> KSPGMRESOrthog 10 1.0 1.4540e-02 1.1 7.21e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01 0 0 0 0 0 1 6 0 0 1 3966
> KSPSetUp 3 1.0 5.2021e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.4e+01 0 0 0 0 0 0 0 0 0 3 0
> KSPSolve 1 1.0 6.7911e-01 1.0 1.19e+08 1.0 2.2e+03 1.6e+04
> 1.9e+02 0 0 0 0 0 40100 96 96 19 1399
> PCSetUp 1 1.0 1.9128e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00 0 0 0 0 0 0 0 0 0 1 0
> PCApply 41 1.0 5.5355e-01 1.0 8.47e+07 1.0 1.2e+03 1.6e+04
> 5.1e+01 0 0 0 0 0 33 71 54 54 5 1224
>
> --- Event Stage 7: SolvDeall
>
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> Viewer 1 0 0 0
>
> --- Event Stage 1: StepStage
>
>
> --- Event Stage 2: ConvStage
>
>
> --- Event Stage 3: ProjStage
>
> Vector 5376 5376 1417328640 0
> Krylov Solver 768 768 8298496 0
> Preconditioner 768 768 645120 0
>
> --- Event Stage 4: IoStage
>
> Vector 50 50 13182000 0
> Viewer 50 50 34400 0
>
> --- Event Stage 5: SolvAlloc
>
> Vector 140 6 8848 0
> Vector Scatter 6 0 0 0
> Matrix 6 0 0 0
> Distributed Mesh 2 0 0 0
> Bipartite Graph 4 0 0 0
> Index Set 14 14 372400 0
> IS L to G Mapping 3 0 0 0
> Krylov Solver 2 0 0 0
> Preconditioner 2 0 0 0
>
> --- Event Stage 6: SolvSolve
>
> Vector 22 0 0 0
> Krylov Solver 3 2 2296 0
> Preconditioner 3 2 1760 0
>
> --- Event Stage 7: SolvDeall
>
> Vector 0 149 41419384 0
> Vector Scatter 0 1 1036 0
> Matrix 0 3 4619676 0
> Krylov Solver 0 3 32416 0
> Preconditioner 0 3 2520 0
>
> ========================================================================================================================
> Average time to get PetscTime(): 1.90735e-07
> Average time for MPI_Barrier(): 4.62532e-06
> Average time for zero size MPI_Send(): 1.51992e-06
> #PETSc Option Table entries:
> -ksp_type cg
> -log_summary
> -pc_mg_galerkin
> -pc_type mg
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at:
> Configure options:
>
> --> And with 128^3 (512 iterations):
>
> Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012
>
> Max Max/Min Avg Total
> Time (sec): 5.889e+02 1.00000 5.889e+02
> Objects: 1.413e+04 1.00000 1.413e+04
> Flops: 9.486e+10 1.00000 9.486e+10 6.071e+12
> Flops/sec: 1.611e+08 1.00000 1.611e+08 1.031e+10
> MPI Messages: 5.392e+05 1.00578 5.361e+05 3.431e+07
> MPI Message Lengths: 6.042e+09 1.36798 8.286e+03 2.843e+11
> MPI Reductions: 1.343e+05 1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                           e.g., VecAXPY() for real vectors of length N --> 2N flops
>                           and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 1.1330e-01   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  1:       StepStage: 1.7508e+00   0.3%  2.8991e+10   0.5%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  2:       ConvStage: 3.5534e+01   6.0%  1.4818e+11   2.4%  5.898e+06  17.2%  1.408e+03       17.0%  0.000e+00   0.0%
>  3:       ProjStage: 5.3568e+02  91.0%  5.8820e+12  96.9%  2.833e+07  82.6%  6.765e+03       81.6%  1.319e+05  98.2%
>  4:         IoStage: 1.1365e+01   1.9%  0.0000e+00   0.0%  1.782e+04   0.1%  9.901e+01        1.2%  2.500e+02   0.2%
>  5:       SolvAlloc: 7.1497e-01   0.1%  0.0000e+00   0.0%  5.632e+03   0.0%  1.866e-01        0.0%  3.330e+02   0.2%
>  6:       SolvSolve: 3.7604e+00   0.6%  1.1888e+10   0.2%  5.722e+04   0.2%  1.366e+01        0.2%  1.803e+03   1.3%
>  7:       SolvDeall: 7.6677e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
>
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
>
> --- Event Stage 1: StepStage
>
> VecAXPY 6144 1.0 1.8187e+00 1.1 4.53e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 99100 0 0 0 15941
>
> --- Event Stage 2: ConvStage
>
> VecCopy 9216 1.0 3.2440e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 9 0 0 0 0 0
> VecAXPY 9216 1.0 2.4045e+00 1.1 6.04e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 1 0 0 0 6 26 0 0 0 16076
> VecAXPBYCZ 10752 1.0 5.1656e+00 1.1 1.41e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 1 1 0 0 0 14 61 0 0 0 17460
> VecPointwiseMult 9216 1.0 2.9012e+00 1.0 3.02e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 8 13 0 0 0 6662
> VecScatterBegin 15360 1.0 7.3895e+00 1.3 0.00e+00 0.0 5.9e+06 8.2e+03
> 0.0e+00 1 0 17 17 0 18 0100100 0 0
> VecScatterEnd 15360 1.0 4.4483e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 10 0 0 0 0 0
>
> --- Event Stage 3: ProjStage
>
> VecMDot 5120 1.0 5.2159e+00 1.2 1.85e+09 1.0 0.0e+00 0.0e+00
> 5.1e+03 1 2 0 0 4 1 2 0 0 4 22644
> VecTDot 66106 1.0 1.3662e+01 1.4 4.33e+09 1.0 0.0e+00 0.0e+00
> 6.6e+04 2 5 0 0 49 2 5 0 0 50 20295
> VecNorm 39197 1.0 1.4431e+01 2.8 2.57e+09 1.0 0.0e+00 0.0e+00
> 3.9e+04 2 3 0 0 29 2 3 0 0 30 11392
> VecScale 39197 1.0 2.8002e+00 1.2 1.28e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 1 0 0 0 0 1 0 0 0 29356
> VecCopy 70202 1.0 1.1299e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
> VecSet 69178 1.0 3.9612e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> VecAXPY 135284 1.0 1.9286e+01 1.1 8.87e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 3 9 0 0 0 3 10 0 0 0 29422
> VecAYPX 99671 1.0 1.7862e+01 1.1 5.43e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 3 6 0 0 0 3 6 0 0 0 19464
> VecMAXPY 5632 1.0 3.7555e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00 1 2 0 0 0 1 2 0 0 0 37169
> VecScatterBegin 73786 1.0 6.2463e+00 1.2 0.00e+00 0.0 2.8e+07 8.2e+03
> 0.0e+00 1 0 83 82 0 1 0100100 0 0
> VecScatterEnd 73786 1.0 2.1679e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
> VecNormalize 5632 1.0 9.0864e-01 1.2 5.54e+08 1.0 0.0e+00 0.0e+00
> 5.6e+03 0 1 0 0 4 0 1 0 0 4 38996
> MatMult 71738 1.0 1.5645e+02 1.1 3.29e+10 1.0 2.8e+07 8.2e+03
> 0.0e+00 26 35 80 79 0 28 36 97 97 0 13462
> MatSOR 72762 1.0 2.9900e+02 1.0 3.25e+10 1.0 0.0e+00 0.0e+00
> 0.0e+00 49 34 0 0 0 54 35 0 0 0 6953
> KSPGMRESOrthog 5120 1.0 8.0849e+00 1.1 3.69e+09 1.0 0.0e+00 0.0e+00
> 5.1e+03 1 4 0 0 4 1 4 0 0 4 29218
> KSPSetUp 1536 1.0 2.0613e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.1e+04 0 0 0 0 8 0 0 0 0 9 0
> KSPSolve 512 1.0 5.3248e+02 1.0 9.18e+10 1.0 2.8e+07 8.2e+03
> 1.3e+05 90 97 80 79 98 99100 97 97100 11034
> PCSetUp 512 1.0 5.6760e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 3.1e+03 0 0 0 0 2 0 0 0 0 2 0
> PCApply 33565 1.0 4.2495e+02 1.0 6.36e+10 1.0 1.5e+07 8.2e+03
> 2.6e+04 71 67 43 43 19 78 69 52 52 20 9585
>
> --- Event Stage 4: IoStage
>
> VecView 50 1.0 7.7463e+00240.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.0e+02 1 0 0 0 0 34 0 0 0 40 0
> VecCopy 50 1.0 1.0773e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecScatterBegin 30 1.0 1.1727e-02 2.3 0.00e+00 0.0 1.2e+04 8.2e+03
> 0.0e+00 0 0 0 0 0 0 0 65 3 0 0
> VecScatterEnd 30 1.0 2.2058e+00701.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 10 0 0 0 0 0
>
> --- Event Stage 5: SolvAlloc
>
> VecSet 50 1.0 1.3748e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 14 0 0 0 0 0
> MatAssemblyBegin 4 1.0 3.1760e-0217.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00 0 0 0 0 0 2 0 0 0 2 0
> MatAssemblyEnd 4 1.0 2.1847e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03
> 1.6e+01 0 0 0 0 0 3 0 27 49 5 0
>
> --- Event Stage 6: SolvSolve
>
> VecMDot 10 1.0 1.2067e-02 1.5 3.60e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01 0 0 0 0 0 0 2 0 0 1 19117
> VecTDot 134 1.0 2.6145e-02 1.5 8.78e+06 1.0 0.0e+00 0.0e+00
> 1.3e+02 0 0 0 0 0 1 5 0 0 7 21497
> VecNorm 1615 1.0 1.4866e+00 3.5 5.18e+06 1.0 0.0e+00 0.0e+00
> 1.6e+03 0 0 0 0 1 29 3 0 0 90 223
> VecScale 79 1.0 5.9721e-03 1.2 2.59e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 1 0 0 0 27741
> VecCopy 145 1.0 2.4912e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecSet 140 1.0 7.9901e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 277 1.0 4.0597e-02 1.2 1.82e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 10 0 0 0 28619
> VecAYPX 202 1.0 3.5421e-02 1.1 1.10e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 6 0 0 0 19893
> VecMAXPY 11 1.0 7.7360e-03 1.1 4.26e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 2 0 0 0 35242
> VecScatterBegin 149 1.0 1.4983e-02 1.2 0.00e+00 0.0 5.7e+04 8.2e+03
> 0.0e+00 0 0 0 0 0 0 0100100 0 0
> VecScatterEnd 149 1.0 5.0236e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecNormalize 11 1.0 7.1080e-03 3.9 1.08e+06 1.0 0.0e+00 0.0e+00
> 1.1e+01 0 0 0 0 0 0 1 0 0 1 9736
> MatMult 145 1.0 3.2611e-01 1.1 6.65e+07 1.0 5.6e+04 8.2e+03
> 0.0e+00 0 0 0 0 0 8 36 97 97 0 13055
> MatSOR 147 1.0 6.0702e-01 1.0 6.57e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 16 35 0 0 0 6923
> KSPGMRESOrthog 10 1.0 1.7956e-02 1.3 7.21e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01 0 0 0 0 0 0 4 0 0 1 25694
> KSPSetUp 3 1.0 3.0483e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.4e+01 0 0 0 0 0 1 0 0 0 1 0
> KSPSolve 1 1.0 1.1431e+00 1.0 1.85e+08 1.0 5.6e+04 8.2e+03
> 2.7e+02 0 0 0 0 0 30100 97 97 15 10378
> PCSetUp 1 1.0 1.1488e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCApply 68 1.0 9.1644e-01 1.0 1.28e+08 1.0 3.0e+04 8.2e+03
> 5.1e+01 0 0 0 0 0 24 69 52 52 3 8959
>
> --- Event Stage 7: SolvDeall
>
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> Viewer 1 0 0 0
>
> --- Event Stage 1: StepStage
>
>
> --- Event Stage 2: ConvStage
>
>
> --- Event Stage 3: ProjStage
>
> Vector 10752 10752 2834657280 0
> Krylov Solver 1536 1536 16596992 0
> Preconditioner 1536 1536 1290240 0
>
> --- Event Stage 4: IoStage
>
> Vector 50 50 13182000 0
> Viewer 50 50 34400 0
>
> --- Event Stage 5: SolvAlloc
>
> Vector 140 6 8848 0
> Vector Scatter 6 0 0 0
> Matrix 6 0 0 0
> Distributed Mesh 2 0 0 0
> Bipartite Graph 4 0 0 0
> Index Set 14 14 372400 0
> IS L to G Mapping 3 0 0 0
> Krylov Solver 2 0 0 0
> Preconditioner 2 0 0 0
>
> --- Event Stage 6: SolvSolve
>
> Vector 22 0 0 0
> Krylov Solver 3 2 2296 0
> Preconditioner 3 2 1760 0
>
> --- Event Stage 7: SolvDeall
>
> Vector 0 149 41419384 0
> Vector Scatter 0 1 1036 0
> Matrix 0 3 4619676 0
> Krylov Solver 0 3 32416 0
> Preconditioner 0 3 2520 0
>
> ========================================================================================================================
> Average time to get PetscTime(): 9.53674e-08
> Average time for MPI_Barrier(): 1.13964e-05
> Average time for zero size MPI_Send(): 1.2815e-06
> #PETSc Option Table entries:
> -ksp_type cg
> -log_summary
> -pc_mg_galerkin
> -pc_type mg
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at:
> Configure options:
>
> Best,
> Filippo
>
> On Monday 29 September 2014 08:58:35 Matthew Knepley wrote:
> > On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi <
> > filippo.leonardi at sam.math.ethz.ch> wrote:
> > > Hi,
> > >
> > > I am trying to solve a standard second-order central-differenced Poisson
> > > equation in parallel, in 3D, using a 3D structured DMDA (an extremely
> > > standard Laplacian matrix).
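
(For concreteness, that assembly is the usual 7-point star stencil set through MatSetValuesStencil on the DMDA. A minimal sketch, assuming unit grid spacing, identity rows at the Dirichlet boundary, and the recent-PETSc callback signature for KSPSetComputeOperators -- not Filippo's actual code:)

#include <petscksp.h>
#include <petscdmda.h>

/* 7-point finite-difference Laplacian on a 3D DMDA.  Unit spacing and identity
   boundary rows are assumed only to keep the sketch short; a real discretization
   scales the entries with hx, hy, hz. */
PetscErrorCode ComputeMatrix(KSP ksp, Mat J, Mat jac, void *ctx)
{
  DM            da;
  DMDALocalInfo info;
  PetscInt      i, j, k;

  KSPGetDM(ksp, &da);
  DMDAGetLocalInfo(da, &info);
  for (k = info.zs; k < info.zs + info.zm; k++) {
    for (j = info.ys; j < info.ys + info.ym; j++) {
      for (i = info.xs; i < info.xs + info.xm; i++) {
        MatStencil  row = {0}, col[7] = {{0}};
        PetscScalar v[7];
        PetscInt    n = 0;

        row.i = i; row.j = j; row.k = k;
        if (i == 0 || j == 0 || k == 0 ||
            i == info.mx - 1 || j == info.my - 1 || k == info.mz - 1) {
          col[0] = row; v[0] = 1.0; n = 1;               /* boundary node: identity row */
        } else {
          col[n] = row;                   v[n++] =  6.0; /* diagonal */
          col[n] = row; col[n].i = i - 1; v[n++] = -1.0;
          col[n] = row; col[n].i = i + 1; v[n++] = -1.0;
          col[n] = row; col[n].j = j - 1; v[n++] = -1.0;
          col[n] = row; col[n].j = j + 1; v[n++] = -1.0;
          col[n] = row; col[n].k = k - 1; v[n++] = -1.0;
          col[n] = row; col[n].k = k + 1; v[n++] = -1.0;
        }
        MatSetValuesStencil(jac, 1, &row, n, col, v, INSERT_VALUES);
      }
    }
  }
  MatAssemblyBegin(jac, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(jac, MAT_FINAL_ASSEMBLY);
  return 0;
}

(Strictly speaking, keeping identity boundary rows while interior rows still couple to boundary columns breaks symmetry; a CG-ready implementation would eliminate those couplings into the right-hand side. That step is omitted here for brevity.)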
> > >
> > > I want to get some nice scaling (especially weak), but my results show
> > > that the Krylov method is not performing as expected. The problem (at
> > > least for CG + BJacobi) seems to lie in the number of iterations.
> > >
> > > In particular, the number of iterations for CG (the matrix is SPD) +
> > > BJacobi grows as the mesh is refined (probably due to the condition
> > > number increasing) and as the number of processors is increased
> > > (probably due to the BJacobi preconditioner). For instance, I tried the
> > > following setup:
> > > 1 proc to solve a 32^3 domain => 20 iterations
> > > 8 procs to solve a 64^3 domain => 60 iterations
> > > 64 procs to solve a 128^3 domain => 101 iterations
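
(That growth is roughly what conditioning alone predicts: for the second-order finite-difference Laplacian the condition number scales like h^{-2}, and the CG iteration count like its square root, i.e.

    \kappa(A) = \mathcal{O}\!\left(h^{-2}\right), \qquad
    n_{\mathrm{CG}} = \mathcal{O}\!\left(\sqrt{\kappa(A)}\right) = \mathcal{O}\!\left(h^{-1}\right),

so each halving of h should roughly double the iterations even with a fixed one-level preconditioner; on top of that, BJacobi weakens as the number of blocks grows, which is consistent with the observed 20 -> 60 -> 101.)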
> > >
> > > Is there something pathological about my runs (maybe I am missing
> > > something)? Can somebody provide me with weak-scaling benchmarks for
> > > equivalent problems? (Maybe there is a better preconditioner for this
> > > problem.)
> >
> > BJacobi is not a scalable preconditioner. As you note, the number of
> > iterations grows with the system size. You should always use MG here.
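
(A quick way to watch that growth directly, rather than digging it out of -log_summary, is to report the iteration count after every solve. A minimal sketch; the helper name is made up and the KSP handle is whichever one the pressure solve uses:)

#include <petscksp.h>

/* Call right after KSPSolve() to record how many iterations the solve needed,
   so the growth across mesh sizes / process counts is easy to track. */
static PetscErrorCode ReportIterations(KSP ksp)
{
  PetscInt  its;
  PetscReal rnorm;

  KSPGetIterationNumber(ksp, &its);
  KSPGetResidualNorm(ksp, &rnorm);
  PetscPrintf(PETSC_COMM_WORLD, "KSPSolve: %d iterations, residual norm %g\n",
              (int)its, (double)rnorm);
  return 0;
}

(The runtime options -ksp_monitor and -ksp_converged_reason report the same information without any code changes.)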
> >
> > > I am also aware that multigrid is even better for this problem, but the
> > > **scalability** of my runs seems to be as bad as with CG.
> >
> > MG will weak scale almost perfectly. Send -log_summary for each run if
> > this does not happen.
> >
> > Thanks,
> >
> > Matt
> >
> > > -pc_mg_galerkin
> > > -pc_type mg
> > > (both directly with richardson or as a preconditioner for cg)
> > >
> > > The following is the "-log_summary" of a 128^3 run; notice that I solve
> > > the system multiple times (hence the KSPSolve count is 128). Using CG +
> > > BJacobi.
> > >
> > > Tell me if I missed some detail and sorry for the length of the post.
> > >
> > > Thanks,
> > > Filippo
> > >
> > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012
> > >
> > > Max Max/Min Avg Total
> > >
> > > Time (sec): 9.095e+01 1.00001 9.095e+01
> > > Objects: 1.875e+03 1.00000 1.875e+03
> > > Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12
> > > Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10
> > > MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06
> > > MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10
> > > MPI Reductions: 4.136e+04 1.00000
> > >
> > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
> > >                           e.g., VecAXPY() for real vectors of length N --> 2N flops
> > >                           and VecAXPY() for complex vectors of length N --> 8N flops
> > >
> > > Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
> > >                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
> > >  0:      Main Stage: 1.1468e-01   0.1%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
> > >  1:       StepStage: 4.4170e-01   0.5%  7.2478e+09   0.7%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
> > >  2:       ConvStage: 8.8333e+00   9.7%  3.7044e+10   3.3%  1.475e+06  22.1%  1.809e+03       21.8%  0.000e+00   0.0%
> > >  3:       ProjStage: 7.7169e+01  84.8%  1.0556e+12  95.2%  5.151e+06  77.1%  6.317e+03       76.3%  4.024e+04  97.3%
> > >  4:         IoStage: 2.4789e+00   2.7%  0.0000e+00   0.0%  3.564e+03   0.1%  1.017e+02        1.2%  5.000e+01   0.1%
> > >  5:       SolvAlloc: 7.0947e-01   0.8%  0.0000e+00   0.0%  5.632e+03   0.1%  9.587e-01        0.0%  3.330e+02   0.8%
> > >  6:       SolvSolve: 1.2044e+00   1.3%  9.1679e+09   0.8%  4.454e+04   0.7%  5.464e+01        0.7%  7.320e+02   1.8%
> > >  7:       SolvDeall: 7.5711e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
> > >
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > > See the 'Profiling' chapter of the users' manual for details on interpreting output.
> > > Phase summary info:
> > >    Count: number of times phase was executed
> > >    Time and Flops: Max - maximum over all processors
> > >                    Ratio - ratio of maximum to minimum over all processors
> > >    Mess: number of messages sent
> > >    Avg. len: average message length
> > >    Reduct: number of global reductions
> > >    Global: entire computation
> > >    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
> > >       %T - percent time in this phase         %f - percent flops in this phase
> > >       %M - percent messages in this phase     %L - percent message lengths in this phase
> > >       %R - percent reductions in this phase
> > >    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > > Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
> > >                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > >
> > > --- Event Stage 0: Main Stage
> > >
> > >
> > > --- Event Stage 1: StepStage
> > >
> > > VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 1 0 0 0 99100 0 0 0 15608
> > >
> > > --- Event Stage 2: ConvStage
> > >
> > > VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0
> > > VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758
> > > VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306
> > > VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677
> > > VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06
> 8.2e+03
> > > 0.0e+00 2 0 22 22 0 18 0100100 0 0
> > > VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0
> > >
> > > --- Event Stage 3: ProjStage
> > >
> > > VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00
> 0.0e+00
> > > 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433
> > > VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00
> 0.0e+00
> > > 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759
> > > VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> > > VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490
> > > VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997
> > > VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06
> 8.2e+03
> > > 0.0e+00 1 0 77 76 0 1 0100100 0 0
> > > VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
> > > MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06
> 8.2e+03
> > > 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810
> > > MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182
> > > MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358
> > > MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 1.3e+02 1 0 0 0 0 1 0 0 0 0 0
> > > MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 2.6e+02 0 0 0 0 1 0 0 0 0 1 0
> > > KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 7.7e+02 0 0 0 0 2 0 0 0 0 2 0
> > > KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06
> 8.2e+03
> > > 4.0e+04 84 95 74 73 97 99100 96 96100 13800
> > > PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00
> 0.0e+00
> > > 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619
> > > PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00
> 0.0e+00
> > > 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652
> > > PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844
> > >
> > > --- Event Stage 4: IoStage
> > >
> > > VecView 10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 2.0e+01 1 0 0 0 0 36 0 0 0 40 0
> > > VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > VecScatterBegin 6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03
> 8.2e+03
> > > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0
> > > VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 9 0 0 0 0 0
> > >
> > > --- Event Stage 5: SolvAlloc
> > >
> > > VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 13 0 0 0 0 0
> > > MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 8.0e+00 0 0 0 0 0 3 0 0 0 2 0
> > > MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03
> 2.0e+03
> > > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0
> > >
> > > --- Event Stage 6: SolvSolve
> > >
> > > VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00
> 0.0e+00
> > > 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499
> > > VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00
> 0.0e+00
> > > 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104
> > > VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934
> > > VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142
> > > VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04
> 8.2e+03
> > > 0.0e+00 0 0 1 1 0 1 0100100 0 0
> > > VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 2 0 0 0 0 0
> > > MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04
> 8.2e+03
> > > 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711
> > > MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217
> > > MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259
> > > MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 1.0e+00 0 0 0 0 0 2 0 0 0 0 0
> > > MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00
> 0.0e+00
> > > 6.0e+00 0 0 0 0 0 1 0 0 0 1 0
> > > KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04
> 8.2e+03
> > > 3.5e+02 1 1 1 1 1 58100 97 97 48 13069
> > > PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00
> 0.0e+00
> > > 5.0e+00 0 0 0 0 0 3 0 0 0 1 996
> > > PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00
> 0.0e+00
> > > 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002
> > > PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00
> 0.0e+00
> > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877
> > >
> > > --- Event Stage 7: SolvDeall
> > >
> > >
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > >
> > > Memory usage is given in bytes:
> > >
> > > Object Type Creations Destructions Memory Descendants' Mem.
> > > Reports information only for process 0.
> > >
> > > --- Event Stage 0: Main Stage
> > >
> > > Viewer 1 0 0 0
> > >
> > > --- Event Stage 1: StepStage
> > >
> > >
> > > --- Event Stage 2: ConvStage
> > >
> > >
> > > --- Event Stage 3: ProjStage
> > >
> > > Vector 640 640 101604352 0
> > > Matrix 128 128 410327040 0
> > >
> > > Index Set 384 384 17062912 0
> > >
> > > Krylov Solver 256 256 282624 0
> > >
> > > Preconditioner 256 256 228352 0
> > >
> > > --- Event Stage 4: IoStage
> > >
> > > Vector 10 10 2636400 0
> > > Viewer 10 10 6880 0
> > >
> > > --- Event Stage 5: SolvAlloc
> > >
> > > Vector 140 6 8848 0
> > >
> > > Vector Scatter 6 0 0 0
> > >
> > > Matrix 6 0 0 0
> > >
> > > Distributed Mesh 2 0 0 0
> > >
> > > Bipartite Graph 4 0 0 0
> > >
> > > Index Set 14 14 372400 0
> > >
> > > IS L to G Mapping 3 0 0 0
> > >
> > > Krylov Solver 1 0 0 0
> > >
> > > Preconditioner 1 0 0 0
> > >
> > > --- Event Stage 6: SolvSolve
> > >
> > > Vector 5 0 0 0
> > > Matrix 1 0 0 0
> > >
> > > Index Set 3 0 0 0
> > >
> > > Krylov Solver 2 1 1136 0
> > >
> > > Preconditioner 2 1 824 0
> > >
> > > --- Event Stage 7: SolvDeall
> > >
> > > Vector 0 133 36676728 0
> > >
> > > Vector Scatter 0 1 1036 0
> > >
> > > Matrix 0 4 7038924 0
> > >
> > > Index Set 0 3 133304 0
> > >
> > > Krylov Solver 0 2 2208 0
> > >
> > > Preconditioner 0 2 1784 0
> > >
> > >
> > > ========================================================================================================================
> > > Average time to get PetscTime(): 9.53674e-08
> > > Average time for MPI_Barrier(): 1.12057e-05
> > > Average time for zero size MPI_Send(): 1.3113e-06
> > > #PETSc Option Table entries:
> > > -ksp_type cg
> > > -log_summary
> > > -pc_type bjacobi
> > > #End of PETSc Option Table entries
> > > Compiled without FORTRAN kernels
> > > Compiled with full precision matrices (default)
> > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> > > sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> > > Configure run at:
> > > Configure options:
> > > Application 9457215 resources: utime ~5920s, stime ~58s
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener