[petsc-users] Scaling/Preconditioners for Poisson equation

Matthew Knepley knepley at gmail.com
Mon Sep 29 14:39:56 CDT 2014


On Mon, Sep 29, 2014 at 9:36 AM, Filippo Leonardi <
filippo.leonardi at sam.math.ethz.ch> wrote:

> Thank you.
>
> Actually, I had the feeling that the problem wasn't with my code but with
> BJacobi and CG.
>
> So I'll stick to MG. The problem with MG is that there are a lot of
> parameters to tune, so I leave the defaults (except that I select CG as the
> Krylov method). I post just the results for 64^3 and 128^3. Tell me if I'm
> missing some useful detail. (I get similar results with BoomerAMG.)
>

1) I assume we are looking at ProjStage?

2) Why are you doing a different number of solves on different numbers of
processes?

   Matt
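For anyone picking up this thread: most of the MG parameters mentioned above
are exposed as run-time options, so tuning needs no recompilation. A minimal
sketch of a tuning run, where the binary name `./solver` and the particular
level/smoother choices are illustrative assumptions rather than anything taken
from the logs below:

```shell
# Illustrative starting point for tuning PETSc geometric MG on a DMDA Poisson
# problem.  The option names are standard PETSc run-time options; the values
# (number of levels, smoother choice) are guesses to be adjusted per problem.
mpiexec -n 8 ./solver \
    -ksp_type cg \
    -pc_type mg \
    -pc_mg_galerkin \
    -pc_mg_levels 4 \
    -mg_levels_ksp_type richardson \
    -mg_levels_pc_type sor \
    -ksp_monitor_true_residual \
    -log_summary
```

Watching `-ksp_monitor_true_residual` while varying `-pc_mg_levels` and the
smoother is usually enough to find a configuration with mesh-independent
iteration counts.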


> Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin
> -pc_type mg):
> 32^3 and 1 proc: 1.01e-1
> 64^3 and 8 proc: 6.56e-01
> 128^3 and 64 proc: 1.05e+00
> Number of PCSetup per KSPSolve:
> 15
> 39
> 65
>
> With BoomerAMG:
> a stable 8 iterations per KSPSolve, but the time per iteration is greater
> than with PETSc MG and still increases:
> 64^3:  3.17e+00
> 128^3: 9.99e+00
>
>
> --> For instance with 64^3 (256 iterations):
>
> Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012
>
>                          Max       Max/Min        Avg      Total
> Time (sec):           1.896e+02      1.00000   1.896e+02
> Objects:              7.220e+03      1.00000   7.220e+03
> Flops:                3.127e+10      1.00000   3.127e+10  2.502e+11
> Flops/sec:            1.649e+08      1.00000   1.649e+08  1.319e+09
> MPI Messages:         9.509e+04      1.00316   9.483e+04  7.586e+05
> MPI Message Lengths:  1.735e+09      1.09967   1.685e+04  1.278e+10
> MPI Reductions:       4.781e+04      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N --> 2N flops
>                             and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---
> -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total
> Avg         %Total   counts   %Total
>  0:      Main Stage: 1.3416e-02   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%
> 0.000e+00        0.0%  0.000e+00   0.0%
>  1:       StepStage: 8.7909e-01   0.5%  1.8119e+09   0.7%  0.000e+00   0.0%
> 0.000e+00        0.0%  0.000e+00   0.0%
>  2:       ConvStage: 1.7172e+01   9.1%  9.2610e+09   3.7%  1.843e+05  24.3%
> 3.981e+03       23.6%  0.000e+00   0.0%
>  3:       ProjStage: 1.6804e+02  88.6%  2.3813e+11  95.2%  5.703e+05  75.2%
> 1.232e+04       73.1%  4.627e+04  96.8%
>  4:         IoStage: 1.5814e+00   0.8%  0.0000e+00   0.0%  1.420e+03   0.2%
> 4.993e+02        3.0%  2.500e+02   0.5%
>  5:       SolvAlloc: 2.5722e-01   0.1%  0.0000e+00   0.0%  2.560e+02   0.0%
> 1.054e+00        0.0%  3.330e+02   0.7%
>  6:       SolvSolve: 1.6776e+00   0.9%  9.5345e+08   0.4%  2.280e+03   0.3%
> 4.924e+01        0.3%  9.540e+02   2.0%
>  7:       SolvDeall: 7.4017e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%
> 0.000e+00        0.0%  0.000e+00   0.0%
>
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and
> PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops
> --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
>
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
>
> --- Event Stage 1: StepStage
>
> VecAXPY             3072 1.0 8.8295e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0  99100  0  0  0  2052
>
> --- Event Stage 2: ConvStage
>
> VecCopy             4608 1.0 1.6016e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   9  0  0  0  0     0
> VecAXPY             4608 1.0 1.2212e+00 1.2 3.02e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  1  0  0  0   6 26  0  0  0  1978
> VecAXPBYCZ          5376 1.0 2.5875e+00 1.1 7.05e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  2  0  0  0  15 61  0  0  0  2179
> VecPointwiseMult    4608 1.0 1.4411e+00 1.0 1.51e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   8 13  0  0  0   838
> VecScatterBegin     7680 1.0 3.4130e+00 1.0 0.00e+00 0.0 1.8e+05 1.6e+04
> 0.0e+00  2  0 24 24  0  20  0100100  0     0
> VecScatterEnd       7680 1.0 9.3412e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   5  0  0  0  0     0
>
> --- Event Stage 3: ProjStage
>
> VecMDot             2560 1.0 2.1944e+00 1.1 9.23e+08 1.0 0.0e+00 0.0e+00
> 2.6e+03  1  3  0  0  5   1  3  0  0  6  3364
> VecTDot            19924 1.0 2.7283e+00 1.3 1.31e+09 1.0 0.0e+00 0.0e+00
> 2.0e+04  1  4  0  0 42   1  4  0  0 43  3829
> VecNorm            13034 1.0 1.5385e+00 2.0 8.54e+08 1.0 0.0e+00 0.0e+00
> 1.3e+04  1  3  0  0 27   1  3  0  0 28  4442
> VecScale           13034 1.0 9.0783e-01 1.3 4.27e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0   0  1  0  0  0  3764
> VecCopy            21972 1.0 3.5136e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> VecSet             21460 1.0 1.3108e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecAXPY            41384 1.0 5.9866e+00 1.1 2.71e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  3  9  0  0  0   3  9  0  0  0  3624
> VecAYPX            30142 1.0 5.3362e+00 1.0 1.64e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  3  5  0  0  0   3  6  0  0  0  2460
> VecMAXPY            2816 1.0 1.8561e+00 1.0 1.09e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  3  0  0  0   1  4  0  0  0  4700
> VecScatterBegin    23764 1.0 1.7138e+00 1.1 0.00e+00 0.0 5.7e+05 1.6e+04
> 0.0e+00  1  0 75 73  0   1  0100100  0     0
> VecScatterEnd      23764 1.0 3.1986e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecNormalize        2816 1.0 2.9511e-01 1.1 2.77e+08 1.0 0.0e+00 0.0e+00
> 2.8e+03  0  1  0  0  6   0  1  0  0  6  7504
> MatMult            22740 1.0 4.6896e+01 1.0 1.04e+10 1.0 5.5e+05 1.6e+04
> 0.0e+00 25 33 72 70  0  28 35 96 96  0  1780
> MatSOR             23252 1.0 9.5250e+01 1.0 1.04e+10 1.0 0.0e+00 0.0e+00
> 0.0e+00 50 33  0  0  0  56 35  0  0  0   872
> KSPGMRESOrthog      2560 1.0 3.6142e+00 1.1 1.85e+09 1.0 0.0e+00 0.0e+00
> 2.6e+03  2  6  0  0  5   2  6  0  0  6  4085
> KSPSetUp             768 1.0 7.9389e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 5.6e+03  0  0  0  0 12   0  0  0  0 12     0
> KSPSolve             256 1.0 1.6661e+02 1.0 2.97e+10 1.0 5.5e+05 1.6e+04
> 4.6e+04 88 95 72 70 97  99100 96 96100  1427
> PCSetUp              256 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.5e+03  0  0  0  0  3   0  0  0  0  3     0
> PCApply            10218 1.0 1.3642e+02 1.0 2.12e+10 1.0 3.1e+05 1.6e+04
> 1.3e+04 72 68 40 39 27  81 71 54 54 28  1245
>
> --- Event Stage 4: IoStage
>
> VecView               50 1.0 8.8377e-0138.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.0e+02  0  0  0  0  0  29  0  0  0 40     0
> VecCopy               50 1.0 8.9977e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecScatterBegin       30 1.0 1.0644e-02 1.6 0.00e+00 0.0 7.2e+02 1.6e+04
> 0.0e+00  0  0  0  0  0   1  0 51  3  0     0
> VecScatterEnd         30 1.0 2.4857e-01109.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   8  0  0  0  0     0
>
> --- Event Stage 5: SolvAlloc
>
> VecSet                50 1.0 1.9324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   7  0  0  0  0     0
> MatAssemblyBegin       4 1.0 5.0378e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00  0  0  0  0  0   1  0  0  0  2     0
> MatAssemblyEnd         4 1.0 1.5030e-02 1.0 0.00e+00 0.0 9.6e+01 4.1e+03
> 1.6e+01  0  0  0  0  0   6  0 38 49  5     0
>
> --- Event Stage 6: SolvSolve
>
> VecMDot               10 1.0 8.9154e-03 1.1 3.60e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01  0  0  0  0  0   0  3  0  0  1  3234
> VecTDot               80 1.0 1.1104e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00
> 8.0e+01  0  0  0  0  0   1  4  0  0  8  3777
> VecNorm              820 1.0 2.6904e-01 1.6 3.41e+06 1.0 0.0e+00 0.0e+00
> 8.2e+02  0  0  0  0  2  13  3  0  0 86   101
> VecScale              52 1.0 3.6066e-03 1.2 1.70e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  1  0  0  0  3780
> VecCopy               91 1.0 1.4363e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecSet                86 1.0 5.1112e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY              169 1.0 2.4659e-02 1.1 1.11e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  9  0  0  0  3593
> VecAYPX              121 1.0 2.2017e-02 1.1 6.59e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  6  0  0  0  2393
> VecMAXPY              11 1.0 7.2782e-03 1.0 4.26e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  4  0  0  0  4682
> VecScatterBegin       95 1.0 7.3617e-03 1.1 0.00e+00 0.0 2.3e+03 1.6e+04
> 0.0e+00  0  0  0  0  0   0  0100100  0     0
> VecScatterEnd         95 1.0 1.3788e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecNormalize          11 1.0 1.2109e-03 1.1 1.08e+06 1.0 0.0e+00 0.0e+00
> 1.1e+01  0  0  0  0  0   0  1  0  0  1  7144
> MatMult               91 1.0 1.9398e-01 1.0 4.17e+07 1.0 2.2e+03 1.6e+04
> 0.0e+00  0  0  0  0  0  11 35 96 96  0  1722
> MatSOR                93 1.0 3.8194e-01 1.0 4.16e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  23 35  0  0  0   870
> KSPGMRESOrthog        10 1.0 1.4540e-02 1.1 7.21e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01  0  0  0  0  0   1  6  0  0  1  3966
> KSPSetUp               3 1.0 5.2021e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.4e+01  0  0  0  0  0   0  0  0  0  3     0
> KSPSolve               1 1.0 6.7911e-01 1.0 1.19e+08 1.0 2.2e+03 1.6e+04
> 1.9e+02  0  0  0  0  0  40100 96 96 19  1399
> PCSetUp                1 1.0 1.9128e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00  0  0  0  0  0   0  0  0  0  1     0
> PCApply               41 1.0 5.5355e-01 1.0 8.47e+07 1.0 1.2e+03 1.6e+04
> 5.1e+01  0  0  0  0  0  33 71 54 54  5  1224
>
> --- Event Stage 7: SolvDeall
>
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions     Memory  Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
>               Viewer     1              0            0     0
>
> --- Event Stage 1: StepStage
>
>
> --- Event Stage 2: ConvStage
>
>
> --- Event Stage 3: ProjStage
>
>               Vector  5376           5376   1417328640     0
>        Krylov Solver   768            768      8298496     0
>       Preconditioner   768            768       645120     0
>
> --- Event Stage 4: IoStage
>
>               Vector    50             50     13182000     0
>               Viewer    50             50        34400     0
>
> --- Event Stage 5: SolvAlloc
>
>               Vector   140              6         8848     0
>       Vector Scatter     6              0            0     0
>               Matrix     6              0            0     0
>     Distributed Mesh     2              0            0     0
>      Bipartite Graph     4              0            0     0
>            Index Set    14             14       372400     0
>    IS L to G Mapping     3              0            0     0
>        Krylov Solver     2              0            0     0
>       Preconditioner     2              0            0     0
>
> --- Event Stage 6: SolvSolve
>
>               Vector    22              0            0     0
>        Krylov Solver     3              2         2296     0
>       Preconditioner     3              2         1760     0
>
> --- Event Stage 7: SolvDeall
>
>               Vector     0            149     41419384     0
>       Vector Scatter     0              1         1036     0
>               Matrix     0              3      4619676     0
>        Krylov Solver     0              3        32416     0
>       Preconditioner     0              3         2520     0
>
> ========================================================================================================================
> Average time to get PetscTime(): 1.90735e-07
> Average time for MPI_Barrier(): 4.62532e-06
> Average time for zero size MPI_Send(): 1.51992e-06
> #PETSc Option Table entries:
> -ksp_type cg
> -log_summary
> -pc_mg_galerkin
> -pc_type mg
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at:
> Configure options:
>
> --> And with 128^3 (512 iterations):
>
> Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012
>
>                          Max       Max/Min        Avg      Total
> Time (sec):           5.889e+02      1.00000   5.889e+02
> Objects:              1.413e+04      1.00000   1.413e+04
> Flops:                9.486e+10      1.00000   9.486e+10  6.071e+12
> Flops/sec:            1.611e+08      1.00000   1.611e+08  1.031e+10
> MPI Messages:         5.392e+05      1.00578   5.361e+05  3.431e+07
> MPI Message Lengths:  6.042e+09      1.36798   8.286e+03  2.843e+11
> MPI Reductions:       1.343e+05      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N --> 2N flops
>                             and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---
> -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total
> Avg         %Total   counts   %Total
>  0:      Main Stage: 1.1330e-01   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%
> 0.000e+00        0.0%  0.000e+00   0.0%
>  1:       StepStage: 1.7508e+00   0.3%  2.8991e+10   0.5%  0.000e+00   0.0%
> 0.000e+00        0.0%  0.000e+00   0.0%
>  2:       ConvStage: 3.5534e+01   6.0%  1.4818e+11   2.4%  5.898e+06  17.2%
> 1.408e+03       17.0%  0.000e+00   0.0%
>  3:       ProjStage: 5.3568e+02  91.0%  5.8820e+12  96.9%  2.833e+07  82.6%
> 6.765e+03       81.6%  1.319e+05  98.2%
>  4:         IoStage: 1.1365e+01   1.9%  0.0000e+00   0.0%  1.782e+04   0.1%
> 9.901e+01        1.2%  2.500e+02   0.2%
>  5:       SolvAlloc: 7.1497e-01   0.1%  0.0000e+00   0.0%  5.632e+03   0.0%
> 1.866e-01        0.0%  3.330e+02   0.2%
>  6:       SolvSolve: 3.7604e+00   0.6%  1.1888e+10   0.2%  5.722e+04   0.2%
> 1.366e+01        0.2%  1.803e+03   1.3%
>  7:       SolvDeall: 7.6677e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%
> 0.000e+00        0.0%  0.000e+00   0.0%
>
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and
> PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops
> --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
>
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
>
> --- Event Stage 1: StepStage
>
> VecAXPY             6144 1.0 1.8187e+00 1.1 4.53e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  99100  0  0  0 15941
>
> --- Event Stage 2: ConvStage
>
> VecCopy             9216 1.0 3.2440e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   9  0  0  0  0     0
> VecAXPY             9216 1.0 2.4045e+00 1.1 6.04e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0   6 26  0  0  0 16076
> VecAXPBYCZ         10752 1.0 5.1656e+00 1.1 1.41e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  1  0  0  0  14 61  0  0  0 17460
> VecPointwiseMult    9216 1.0 2.9012e+00 1.0 3.02e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   8 13  0  0  0  6662
> VecScatterBegin    15360 1.0 7.3895e+00 1.3 0.00e+00 0.0 5.9e+06 8.2e+03
> 0.0e+00  1  0 17 17  0  18  0100100  0     0
> VecScatterEnd      15360 1.0 4.4483e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0  10  0  0  0  0     0
>
> --- Event Stage 3: ProjStage
>
> VecMDot             5120 1.0 5.2159e+00 1.2 1.85e+09 1.0 0.0e+00 0.0e+00
> 5.1e+03  1  2  0  0  4   1  2  0  0  4 22644
> VecTDot            66106 1.0 1.3662e+01 1.4 4.33e+09 1.0 0.0e+00 0.0e+00
> 6.6e+04  2  5  0  0 49   2  5  0  0 50 20295
> VecNorm            39197 1.0 1.4431e+01 2.8 2.57e+09 1.0 0.0e+00 0.0e+00
> 3.9e+04  2  3  0  0 29   2  3  0  0 30 11392
> VecScale           39197 1.0 2.8002e+00 1.2 1.28e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0   0  1  0  0  0 29356
> VecCopy            70202 1.0 1.1299e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> VecSet             69178 1.0 3.9612e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecAXPY           135284 1.0 1.9286e+01 1.1 8.87e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  3  9  0  0  0   3 10  0  0  0 29422
> VecAYPX            99671 1.0 1.7862e+01 1.1 5.43e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  3  6  0  0  0   3  6  0  0  0 19464
> VecMAXPY            5632 1.0 3.7555e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  2  0  0  0   1  2  0  0  0 37169
> VecScatterBegin    73786 1.0 6.2463e+00 1.2 0.00e+00 0.0 2.8e+07 8.2e+03
> 0.0e+00  1  0 83 82  0   1  0100100  0     0
> VecScatterEnd      73786 1.0 2.1679e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  2  0  0  0  0   3  0  0  0  0     0
> VecNormalize        5632 1.0 9.0864e-01 1.2 5.54e+08 1.0 0.0e+00 0.0e+00
> 5.6e+03  0  1  0  0  4   0  1  0  0  4 38996
> MatMult            71738 1.0 1.5645e+02 1.1 3.29e+10 1.0 2.8e+07 8.2e+03
> 0.0e+00 26 35 80 79  0  28 36 97 97  0 13462
> MatSOR             72762 1.0 2.9900e+02 1.0 3.25e+10 1.0 0.0e+00 0.0e+00
> 0.0e+00 49 34  0  0  0  54 35  0  0  0  6953
> KSPGMRESOrthog      5120 1.0 8.0849e+00 1.1 3.69e+09 1.0 0.0e+00 0.0e+00
> 5.1e+03  1  4  0  0  4   1  4  0  0  4 29218
> KSPSetUp            1536 1.0 2.0613e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.1e+04  0  0  0  0  8   0  0  0  0  9     0
> KSPSolve             512 1.0 5.3248e+02 1.0 9.18e+10 1.0 2.8e+07 8.2e+03
> 1.3e+05 90 97 80 79 98  99100 97 97100 11034
> PCSetUp              512 1.0 5.6760e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 3.1e+03  0  0  0  0  2   0  0  0  0  2     0
> PCApply            33565 1.0 4.2495e+02 1.0 6.36e+10 1.0 1.5e+07 8.2e+03
> 2.6e+04 71 67 43 43 19  78 69 52 52 20  9585
>
> --- Event Stage 4: IoStage
>
> VecView               50 1.0 7.7463e+00240.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.0e+02  1  0  0  0  0  34  0  0  0 40     0
> VecCopy               50 1.0 1.0773e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecScatterBegin       30 1.0 1.1727e-02 2.3 0.00e+00 0.0 1.2e+04 8.2e+03
> 0.0e+00  0  0  0  0  0   0  0 65  3  0     0
> VecScatterEnd         30 1.0 2.2058e+00701.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  10  0  0  0  0     0
>
> --- Event Stage 5: SolvAlloc
>
> VecSet                50 1.0 1.3748e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  14  0  0  0  0     0
> MatAssemblyBegin       4 1.0 3.1760e-0217.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00  0  0  0  0  0   2  0  0  0  2     0
> MatAssemblyEnd         4 1.0 2.1847e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03
> 1.6e+01  0  0  0  0  0   3  0 27 49  5     0
>
> --- Event Stage 6: SolvSolve
>
> VecMDot               10 1.0 1.2067e-02 1.5 3.60e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01  0  0  0  0  0   0  2  0  0  1 19117
> VecTDot              134 1.0 2.6145e-02 1.5 8.78e+06 1.0 0.0e+00 0.0e+00
> 1.3e+02  0  0  0  0  0   1  5  0  0  7 21497
> VecNorm             1615 1.0 1.4866e+00 3.5 5.18e+06 1.0 0.0e+00 0.0e+00
> 1.6e+03  0  0  0  0  1  29  3  0  0 90   223
> VecScale              79 1.0 5.9721e-03 1.2 2.59e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  1  0  0  0 27741
> VecCopy              145 1.0 2.4912e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecSet               140 1.0 7.9901e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY              277 1.0 4.0597e-02 1.2 1.82e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1 10  0  0  0 28619
> VecAYPX              202 1.0 3.5421e-02 1.1 1.10e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  6  0  0  0 19893
> VecMAXPY              11 1.0 7.7360e-03 1.1 4.26e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  2  0  0  0 35242
> VecScatterBegin      149 1.0 1.4983e-02 1.2 0.00e+00 0.0 5.7e+04 8.2e+03
> 0.0e+00  0  0  0  0  0   0  0100100  0     0
> VecScatterEnd        149 1.0 5.0236e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecNormalize          11 1.0 7.1080e-03 3.9 1.08e+06 1.0 0.0e+00 0.0e+00
> 1.1e+01  0  0  0  0  0   0  1  0  0  1  9736
> MatMult              145 1.0 3.2611e-01 1.1 6.65e+07 1.0 5.6e+04 8.2e+03
> 0.0e+00  0  0  0  0  0   8 36 97 97  0 13055
> MatSOR               147 1.0 6.0702e-01 1.0 6.57e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0  16 35  0  0  0  6923
> KSPGMRESOrthog        10 1.0 1.7956e-02 1.3 7.21e+06 1.0 0.0e+00 0.0e+00
> 1.0e+01  0  0  0  0  0   0  4  0  0  1 25694
> KSPSetUp               3 1.0 3.0483e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.4e+01  0  0  0  0  0   1  0  0  0  1     0
> KSPSolve               1 1.0 1.1431e+00 1.0 1.85e+08 1.0 5.6e+04 8.2e+03
> 2.7e+02  0  0  0  0  0  30100 97 97 15 10378
> PCSetUp                1 1.0 1.1488e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
> 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCApply               68 1.0 9.1644e-01 1.0 1.28e+08 1.0 3.0e+04 8.2e+03
> 5.1e+01  0  0  0  0  0  24 69 52 52  3  8959
>
> --- Event Stage 7: SolvDeall
>
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions     Memory  Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
>               Viewer     1              0            0     0
>
> --- Event Stage 1: StepStage
>
>
> --- Event Stage 2: ConvStage
>
>
> --- Event Stage 3: ProjStage
>
>               Vector 10752          10752   2834657280     0
>        Krylov Solver  1536           1536     16596992     0
>       Preconditioner  1536           1536      1290240     0
>
> --- Event Stage 4: IoStage
>
>               Vector    50             50     13182000     0
>               Viewer    50             50        34400     0
>
> --- Event Stage 5: SolvAlloc
>
>               Vector   140              6         8848     0
>       Vector Scatter     6              0            0     0
>               Matrix     6              0            0     0
>     Distributed Mesh     2              0            0     0
>      Bipartite Graph     4              0            0     0
>            Index Set    14             14       372400     0
>    IS L to G Mapping     3              0            0     0
>        Krylov Solver     2              0            0     0
>       Preconditioner     2              0            0     0
>
> --- Event Stage 6: SolvSolve
>
>               Vector    22              0            0     0
>        Krylov Solver     3              2         2296     0
>       Preconditioner     3              2         1760     0
>
> --- Event Stage 7: SolvDeall
>
>               Vector     0            149     41419384     0
>       Vector Scatter     0              1         1036     0
>               Matrix     0              3      4619676     0
>        Krylov Solver     0              3        32416     0
>       Preconditioner     0              3         2520     0
>
> ========================================================================================================================
> Average time to get PetscTime(): 9.53674e-08
> Average time for MPI_Barrier(): 1.13964e-05
> Average time for zero size MPI_Send(): 1.2815e-06
> #PETSc Option Table entries:
> -ksp_type cg
> -log_summary
> -pc_mg_galerkin
> -pc_type mg
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at:
> Configure options:
>
> Best,
> Filippo
>
> On Monday 29 September 2014 08:58:35 Matthew Knepley wrote:
> > On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi <
> >
> > filippo.leonardi at sam.math.ethz.ch> wrote:
> > > Hi,
> > >
> > > I am trying to solve a standard second-order central-differenced
> > > Poisson equation in parallel, in 3D, using a 3D structured DMDA (an
> > > extremely standard Laplacian matrix).
> > >
> > > I want to get some nice scaling (especially weak), but my results show
> > > that the Krylov method is not performing as expected. The problem (at
> > > least for CG + BJacobi) seems to lie in the number of iterations.
> > >
> > > In particular, the number of iterations grows with CG (the matrix is
> > > SPD) + BJacobi as the mesh is refined (probably due to the condition
> > > number increasing) and as the number of processors is increased
> > > (probably due to the BJacobi preconditioner). For instance I tried the
> > > following setup:
> > > 1 proc to solve a 32^3 domain => 20 iterations
> > > 8 procs to solve a 64^3 domain => 60 iterations
> > > 64 procs to solve a 128^3 domain => 101 iterations
> > >
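A quick sanity check on those iteration counts: for the 7-point Laplacian on
an n^3 grid, the condition number grows like O(n^2), so unpreconditioned CG
theory predicts roughly O(n) iterations, i.e. the count should about double
with each 2x refinement. A minimal sketch of that estimate (my own
back-of-envelope model; the constant 0.625 is an assumption calibrated to the
reported 20 iterations at 32^3, not anything from the logs):

```python
import math

def estimated_cg_iterations(n, c=0.625):
    """Illustrative model: for the 3D discrete Laplacian on an n^3 grid,
    kappa ~ O(n^2), so CG needs roughly c * sqrt(kappa) = c * n iterations.
    The constant c is calibrated to the reported 32^3 run (20 iterations)."""
    kappa = n ** 2  # condition-number scaling of the discrete Laplacian
    return round(c * math.sqrt(kappa))

for n in (32, 64, 128):
    # doubling n should roughly double the iteration count
    print(n, estimated_cg_iterations(n))
```

The observed counts (20, 60, 101) grow even faster than this O(n) estimate,
consistent with BJacobi degrading as the number of subdomains increases.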
> > > Is there something pathological with my runs (maybe I am missing
> > > something)? Is there somebody who can provide me with weak-scaling
> > > benchmarks for equivalent problems? (Maybe there is some better
> > > preconditioner for this problem.)
> >
> > Bjacobi is not a scalable preconditioner. As you note, the number of
> > iterations grows with the system size. You should always use MG here.
> >
> > > I am also aware that multigrid is even better for these problems, but
> > > the **scalability** of my runs seems to be as bad as with CG.
> >
> > MG will weak scale almost perfectly. Send -log_summary for each run if
> > this does not happen.
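A weak-scaling comparison like the one requested here can be driven entirely
from the command line. A hypothetical sweep, where the binary name
`./poisson` is a placeholder for whatever driver produced the logs (the
`-da_grid_*` names are standard DMDA options); the work per process is held
fixed while the grid and process count both grow by 8x:

```shell
# Hypothetical weak-scaling sweep: 32^3 per process throughout.
mpiexec -n 1  ./poisson -da_grid_x 32  -da_grid_y 32  -da_grid_z 32 \
    -ksp_type cg -pc_type mg -pc_mg_galerkin -log_summary
mpiexec -n 8  ./poisson -da_grid_x 64  -da_grid_y 64  -da_grid_z 64 \
    -ksp_type cg -pc_type mg -pc_mg_galerkin -log_summary
mpiexec -n 64 ./poisson -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 \
    -ksp_type cg -pc_type mg -pc_mg_galerkin -log_summary
```

Near-constant KSPSolve time and iteration counts across the three runs would
indicate good weak scaling.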
> >
> >   Thanks,
> >
> >      Matt
> >
> > > -pc_mg_galerkin
> > > -pc_type mg
> > > (both directly with richardson or as a preconditioner to cg)
> > >
> > > The following is the "-log_summary" of a 128^3 run; notice that I
> > > solve the system multiple times (hence KSPSolve is multiplied by 128).
> > > Using CG + BJacobi.
> > >
> > > Tell me if I missed some detail and sorry for the length of the post.
> > >
> > > Thanks,
> > > Filippo
> > >
> > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT
> 2012
> > >
> > >                          Max       Max/Min        Avg      Total
> > >
> > > Time (sec):           9.095e+01      1.00001   9.095e+01
> > > Objects:              1.875e+03      1.00000   1.875e+03
> > > Flops:                1.733e+10      1.00000   1.733e+10  1.109e+12
> > > Flops/sec:            1.905e+08      1.00001   1.905e+08  1.219e+10
> > > MPI Messages:         1.050e+05      1.00594   1.044e+05  6.679e+06
> > > MPI Message Lengths:  1.184e+09      1.37826   8.283e+03  5.532e+10
> > > MPI Reductions:       4.136e+04      1.00000
> > >
> > > Flop counting convention: 1 flop = 1 real number operation of type
> > > (multiply/divide/add/subtract)
> > >                             e.g., VecAXPY() for real vectors of length N --> 2N flops
> > >                             and VecAXPY() for complex vectors of length N --> 8N flops
> > >
> > > Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---
> > > -- Message Lengths --  -- Reductions --
> > >                         Avg     %Total     Avg     %Total   counts   %Total
> > > Avg         %Total   counts   %Total
> > >
> > >  0:      Main Stage: 1.1468e-01   0.1%  0.0000e+00   0.0%  0.000e+00   0.0%
> > > 0.000e+00        0.0%  0.000e+00   0.0%
> > >  1:       StepStage: 4.4170e-01   0.5%  7.2478e+09   0.7%  0.000e+00   0.0%
> > > 0.000e+00        0.0%  0.000e+00   0.0%
> > >  2:       ConvStage: 8.8333e+00   9.7%  3.7044e+10   3.3%  1.475e+06  22.1%
> > > 1.809e+03       21.8%  0.000e+00   0.0%
> > >  3:       ProjStage: 7.7169e+01  84.8%  1.0556e+12  95.2%  5.151e+06  77.1%
> > > 6.317e+03       76.3%  4.024e+04  97.3%
> > >  4:         IoStage: 2.4789e+00   2.7%  0.0000e+00   0.0%  3.564e+03   0.1%
> > > 1.017e+02        1.2%  5.000e+01   0.1%
> > >  5:       SolvAlloc: 7.0947e-01   0.8%  0.0000e+00   0.0%  5.632e+03   0.1%
> > > 9.587e-01        0.0%  3.330e+02   0.8%
> > >  6:       SolvSolve: 1.2044e+00   1.3%  9.1679e+09   0.8%  4.454e+04   0.7%
> > > 5.464e+01        0.7%  7.320e+02   1.8%
> > >  7:       SolvDeall: 7.5711e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%
> > > 0.000e+00        0.0%  0.000e+00   0.0%
> > >
> > >
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > > See the 'Profiling' chapter of the users' manual for details on
> > > interpreting output.
> > >
> > > Phase summary info:
> > >    Count: number of times phase was executed
> > >    Time and Flops: Max - maximum over all processors
> > >                    Ratio - ratio of maximum to minimum over all processors
> > >    Mess: number of messages sent
> > >    Avg. len: average message length
> > >    Reduct: number of global reductions
> > >    Global: entire computation
> > >    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
> > >       %T - percent time in this phase         %f - percent flops in this phase
> > >       %M - percent messages in this phase     %L - percent message lengths in this phase
> > >       %R - percent reductions in this phase
> > >    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
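[Editorial note on the stage table: the named stages above (StepStage, ConvStage, ProjStage, ...) are user-defined logging stages. A minimal sketch of how such a stage is registered and used, assuming a working PETSc installation; the stage name and placement are illustrative, not the poster's actual code:

```c
/* Sketch: registering a named log stage so -log_summary reports it
 * separately. Assumes PETSc is installed; names are illustrative. */
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscLogStage projStage;  /* handle for the illustrative "ProjStage" */

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Register the stage once; the string is what -log_summary prints. */
  PetscLogStageRegister("ProjStage", &projStage);

  PetscLogStagePush(projStage);
  /* ... work to be profiled under this stage, e.g. KSPSolve() ... */
  PetscLogStagePop();

  PetscFinalize();  /* -log_summary output is emitted at finalize */
  return 0;
}
```

Everything between the push and the pop is attributed to that stage's row in the summary.]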
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > > Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
> > >                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> > > ------------------------------------------------------------------------------------------------------------------------
> > >
> > > --- Event Stage 0: Main Stage
> > >
> > >
> > > --- Event Stage 1: StepStage
> > >
> > > VecAXPY             1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0  99100  0  0  0 15608
> > >
> > > --- Event Stage 2: ConvStage
> > >
> > > VecCopy             2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   9  0  0  0  0     0
> > > VecAXPY             2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   6 26  0  0  0 15758
> > > VecAXPBYCZ          2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0  14 61  0  0  0 17306
> > > VecPointwiseMult    2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   8 13  0  0  0  6677
> > > VecScatterBegin     3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 8.2e+03 0.0e+00  2  0 22 22  0  18  0100100  0     0
> > > VecScatterEnd       3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0  10  0  0  0  0     0
> > >
> > > --- Event Stage 3: ProjStage
> > >
> > > VecTDot            25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 0.0e+00 2.6e+04  4 10  0  0 62   5 10  0  0 64 25433
> > > VecNorm            13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 0.0e+00 1.3e+04  2  5  0  0 32   2  5  0  0 32 17759
> > > VecCopy              640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecSet             13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> > > VecAXPY            26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00  4 10  0  0  0   5 10  0  0  0 26490
> > > VecAYPX            12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 0.0e+00 0.0e+00  2  5  0  0  0   2  5  0  0  0 27997
> > > VecScatterBegin    13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 8.2e+03 0.0e+00  1  0 77 76  0   1  0100100  0     0
> > > VecScatterEnd      13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   3  0  0  0  0     0
> > > MatMult            12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 8.2e+03 0.0e+00 35 34 74 73  0  41 36 96 96  0 11810
> > > MatSolve           13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 0.0e+00 33 31  0  0  0  39 33  0  0  0 11182
> > > MatLUFactorNum       128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   2  1  0  0  0  4358
> > > MatILUFactorSym      128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.3e+02  1  0  0  0  0   1  0  0  0  0     0
> > > MatGetRowIJ          128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > MatGetOrdering       128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.6e+02  0  0  0  0  1   0  0  0  0  1     0
> > > KSPSetUp             256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 7.7e+02  0  0  0  0  2   0  0  0  0  2     0
> > > KSPSolve             128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 8.2e+03 4.0e+04 84 95 74 73 97  99100 96 96100 13800
> > > PCSetUp              256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 6.4e+02  2  1  0  0  2   3  1  0  0  2  2619
> > > PCSetUpOnBlocks      128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 3.8e+02  2  1  0  0  1   3  1  0  0  1  2652
> > > PCApply            13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 0.0e+00 34 31  0  0  0  40 33  0  0  0 10844
> > >
> > > --- Event Stage 4: IoStage
> > >
> > > VecView               10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01  1  0  0  0  0  36  0  0  0 40     0
> > > VecCopy               10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecScatterBegin        6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 8.2e+03 0.0e+00  0  0  0  0  0   0  0 65  3  0     0
> > > VecScatterEnd          6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   9  0  0  0  0     0
> > >
> > > --- Event Stage 5: SolvAlloc
> > >
> > > VecSet                50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  13  0  0  0  0     0
> > > MatAssemblyBegin       4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   3  0  0  0  2     0
> > > MatAssemblyEnd         4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 1.6e+01  0  0  0  0  0   3  0 27 49  5     0
> > >
> > > --- Event Stage 6: SolvSolve
> > >
> > > VecTDot              224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 0.0e+00 2.2e+02  0  0  0  0  1   3 10  0  0 31 26499
> > > VecNorm              497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 0.0e+00 5.0e+02  0  0  0  0  1  11  5  0  0 68  3104
> > > VecCopy                8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecSet               114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecAXPY              230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3 11  0  0  0 25934
> > > VecAYPX              111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  5  0  0  0 27142
> > > VecScatterBegin      116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 8.2e+03 0.0e+00  0  0  1  1  0   1  0100100  0     0
> > > VecScatterEnd        116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
> > > MatMult              112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 8.2e+03 0.0e+00  0  0  1  1  0  23 36 97 97  0 11711
> > > MatSolve             113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  22 33  0  0  0 11217
> > > MatLUFactorNum         1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0  4259
> > > MatILUFactorSym        1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   2  0  0  0  0     0
> > > MatGetRowIJ            1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > MatGetOrdering         1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > KSPSetUp               2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   1  0  0  0  1     0
> > > KSPSolve               1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 8.2e+03 3.5e+02  1  1  1  1  1  58100 97 97 48 13069
> > > PCSetUp                2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0   3  0  0  0  1   996
> > > PCSetUpOnBlocks        1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   3  0  0  0  0  1002
> > > PCApply              113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  22 33  0  0  0 10877
> > >
> > > --- Event Stage 7: SolvDeall
> > >
> > >
> > >
> > > ------------------------------------------------------------------------------------------------------------------------
> > >
> > > Memory usage is given in bytes:
> > >
> > > Object Type          Creations   Destructions     Memory  Descendants' Mem.
> > > Reports information only for process 0.
> > >
> > > --- Event Stage 0: Main Stage
> > >
> > >               Viewer     1              0            0     0
> > >
> > > --- Event Stage 1: StepStage
> > >
> > >
> > > --- Event Stage 2: ConvStage
> > >
> > >
> > > --- Event Stage 3: ProjStage
> > >
> > >               Vector   640            640    101604352     0
> > >               Matrix   128            128    410327040     0
> > >
> > >            Index Set   384            384     17062912     0
> > >
> > >        Krylov Solver   256            256       282624     0
> > >
> > >       Preconditioner   256            256       228352     0
> > >
> > > --- Event Stage 4: IoStage
> > >
> > >               Vector    10             10      2636400     0
> > >               Viewer    10             10         6880     0
> > >
> > > --- Event Stage 5: SolvAlloc
> > >
> > >               Vector   140              6         8848     0
> > >
> > >       Vector Scatter     6              0            0     0
> > >
> > >               Matrix     6              0            0     0
> > >
> > >     Distributed Mesh     2              0            0     0
> > >
> > >      Bipartite Graph     4              0            0     0
> > >
> > >            Index Set    14             14       372400     0
> > >
> > >    IS L to G Mapping     3              0            0     0
> > >
> > >        Krylov Solver     1              0            0     0
> > >
> > >       Preconditioner     1              0            0     0
> > >
> > > --- Event Stage 6: SolvSolve
> > >
> > >               Vector     5              0            0     0
> > >               Matrix     1              0            0     0
> > >
> > >            Index Set     3              0            0     0
> > >
> > >        Krylov Solver     2              1         1136     0
> > >
> > >       Preconditioner     2              1          824     0
> > >
> > > --- Event Stage 7: SolvDeall
> > >
> > >               Vector     0            133     36676728     0
> > >
> > >       Vector Scatter     0              1         1036     0
> > >
> > >               Matrix     0              4      7038924     0
> > >
> > >            Index Set     0              3       133304     0
> > >
> > >        Krylov Solver     0              2         2208     0
> > >
> > >       Preconditioner     0              2         1784     0
> > >
> > >
> > > ========================================================================================================================
> > > Average time to get PetscTime(): 9.53674e-08
> > > Average time for MPI_Barrier(): 1.12057e-05
> > > Average time for zero size MPI_Send(): 1.3113e-06
> > > #PETSc Option Table entries:
> > > -ksp_type cg
> > > -log_summary
> > > -pc_type bjacobi
> > > #End of PETSc Option Table entries
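[Editorial note: an option table like the one above corresponds to a launch line of roughly this shape. The binary name `./solver` and process count are placeholders, not taken from the original post:

```shell
# Placeholder binary and process count; the PETSc runtime options
# mirror the option table printed by -log_summary above.
mpiexec -n 8 ./solver -ksp_type cg -pc_type bjacobi -log_summary
```

The same options could equally be placed in a `.petscrc` file or the `PETSC_OPTIONS` environment variable.]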
> > > Compiled without FORTRAN kernels
> > > Compiled with full precision matrices (default)
> > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> > > sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> > > Configure run at:
> > > Configure options:
> > > Application 9457215 resources: utime ~5920s, stime ~58s




-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

