[petsc-users] Enquiry regarding log summary results

TAY wee-beng zonexo at gmail.com
Wed Oct 3 08:08:46 CDT 2012


On 2/10/2012 2:43 PM, Jed Brown wrote:
> On Tue, Oct 2, 2012 at 8:35 AM, TAY wee-beng <zonexo at gmail.com 
> <mailto:zonexo at gmail.com>> wrote:
>
>     Hi,
>
>     I have combined the momentum linear eqns involving x,y,z into 1
>     large matrix. The Poisson eqn is solved using the HYPRE Struct
>     format, so it's not included. I ran the code for 50 timesteps
>     (hence 50 KSPSolve calls) using 96 procs. The log_summary is given
>     below. I have some questions:
>
>     1. After combining the matrix, I should have only 1 PETSc matrix.
>     Why does it say there are 4 matrices, 12 vectors, etc.?
>
>
> They are part of preconditioning. Are you sure you're using Hypre for 
> this? It looks like you are using bjacobi/ilu.
>
>
>     2. I'm looking at the stages that take the longest time. It seems
>     that MatAssemblyBegin, VecNorm, VecAssemblyBegin, and VecScatterEnd
>     have very high ratios, and the ratios of some others are also not
>     good (~1.6 - 2). Are these stages the reason my code is not
>     scaling well? What can I do to improve it?
>
>
> 3/4 of the solve time is evenly balanced between MatMult, MatSolve, 
> MatLUFactorNumeric, and VecNorm+VecDot.
>
> The high VecAssembly time might be due to generating a lot of entries 
> off-process?
>
> In any case, this looks like an _extremely_ slow network, perhaps it's 
> misconfigured?

My cluster is configured with 48 procs per node. I re-ran the case 
using only 48 procs, so there is no need to communicate over a 'slow' 
interconnect. I'm now using GAMG for the Poisson eqn and BCGS for the 
momentum eqns. I have also split the x, y, z components of the momentum 
eqn into 3 separate linear eqns to debug the problem.
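
For reference, the separate per-equation stages in the log below come from wrapping each solve in its own logging stage. A minimal sketch using PETSc's Fortran interface; the ksp_semi_z / b_rhs_semi_z / xx_semi_z names are illustrative placeholders, not the actual variables:

! Wrap each solve in its own stage so -log_summary reports it
! separately (momentum_x, momentum_y, momentum_z below).
PetscLogStage  :: stage_z
PetscErrorCode :: ierr

call PetscLogStageRegister("momentum_z", stage_z, ierr)

call PetscLogStagePush(stage_z, ierr)
call KSPSolve(ksp_semi_z, b_rhs_semi_z, xx_semi_z, ierr)
call PetscLogStagePop(ierr)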

The results show that the "momentum_z" stage takes most of the time. I 
wonder if that is because I partition my grid in the z direction. 
VecScatterEnd and MatMult take a lot of time, and the ratios of 
VecNormalize, VecScatterEnd, VecNorm, and VecAssemblyBegin are also 
poor.

I wonder why so many entries are being generated off-process.

I create my RHS vector using:

call VecCreateMPI(MPI_COMM_WORLD, ijk_xyz_end - ijk_xyz_sta, &
     PETSC_DECIDE, b_rhs_semi_z, ierr)

where ijk_xyz_sta and ijk_xyz_end are obtained from

call MatGetOwnershipRange(A_semi_z, ijk_xyz_sta, ijk_xyz_end, ierr)

I then insert the values into the vector using:

call VecSetValues(b_rhs_semi_z, ijk_xyz_end - ijk_xyz_sta, &
     (/ijk_xyz_sta : ijk_xyz_end - 1/), &
     q_semi_vect_z(ijk_xyz_sta + 1 : ijk_xyz_end), INSERT_VALUES, ierr)

What should I do to correct the problem?

Thanks
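
(As a sanity check, since the indices passed to VecSetValues come straight from MatGetOwnershipRange, each rank should only be inserting locally owned entries, so VecAssemblyBegin should have nothing to communicate. A small diagnostic sketch to confirm that the vector's ownership range actually matches the matrix's on every rank; vsta/vend are illustrative local variables, the rest follows the snippet above:

! If these ranges differ on any rank, VecSetValues is generating
! off-process entries and VecAssemblyBegin must communicate.
PetscInt       :: vsta, vend
PetscErrorCode :: ierr

call VecGetOwnershipRange(b_rhs_semi_z, vsta, vend, ierr)
if (vsta /= ijk_xyz_sta .or. vend /= ijk_xyz_end) then
   print *, 'ownership mismatch: mat', ijk_xyz_sta, ijk_xyz_end, &
            'vec', vsta, vend
end if
)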

>
>     Btw, I insert matrix using:
>
>     do ijk = ijk_xyz_sta + 1, ijk_xyz_end
>
>         II = ijk - 1    ! Fortran shift to 0-based
>
>         call MatSetValues(A_semi_xyz, 1, II, 7, int_semi_xyz(ijk,1:7), &
>              semi_mat_xyz(ijk,1:7), INSERT_VALUES, ierr)
>
>     end do
>
>     where ijk_xyz_sta/ijk_xyz_end are the starting/ending indices
>
>     int_semi_xyz(ijk,1:7) stores the 7 column global indices
>
>     semi_mat_xyz has the corresponding values.
>
>     and I insert vectors using:
>
>     call VecSetValues(b_rhs_semi_xyz, ijk_xyz_end_mz - ijk_xyz_sta_mz, &
>          (/ijk_xyz_sta_mz : ijk_xyz_end_mz - 1/), &
>          q_semi_vect_xyz(ijk_xyz_sta_mz + 1 : ijk_xyz_end_mz), &
>          INSERT_VALUES, ierr)
>
>     Thanks!
>
>
>     Yours sincerely,
>
>     TAY wee-beng
>
>     On 30/9/2012 11:30 PM, Jed Brown wrote:
>>
>>     You can measure the time spent in Hypre via PCApply and PCSetUp,
>>     but you can't get finer grained integrated profiling because it
>>     was not set up that way.
>>
>>     On Sep 30, 2012 3:26 PM, "TAY wee-beng" <zonexo at gmail.com
>>     <mailto:zonexo at gmail.com>> wrote:
>>
>>         On 27/9/2012 1:44 PM, Matthew Knepley wrote:
>>>         On Thu, Sep 27, 2012 at 3:49 AM, TAY wee-beng
>>>         <zonexo at gmail.com <mailto:zonexo at gmail.com>> wrote:
>>>
>>>             Hi,
>>>
>>>             I'm doing a log summary for my 3d cfd code. I have some
>>>             questions:
>>>
>>>             1. if I'm solving 3 linear equations using ksp, is the
>>>             result given in the log summary the total of the 3
>>>             linear eqns' performance? How can I get the performance
>>>             for each individual eqn?
>>>
>>>
>>>         Use logging stages:
>>>         http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogStagePush.html
>>>
>>>             2. If I run my code for 10 time steps, does the log
>>>             summary give the total or avg performance/ratio?
>>>
>>>
>>>         Total.
>>>
>>>             3. Besides PETSc, I'm also using HYPRE's native
>>>             geometric MG (Struct) to solve the Poisson eqn of my
>>>             Cartesian-grid CFD code. Is there any way I can use
>>>             PETSc's log summary to get HYPRE's performance? If I
>>>             use BoomerAMG through PETSc, can I get its performance?
>>>
>>>
>>>         If you mean flops, only if you count them yourself and tell
>>>         PETSc using
>>>         http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogFlops.html
>>>
>>>         This is the disadvantage of using packages that do not
>>>         properly monitor things :)
>>>
>>>             Matt
>>         So you mean that if I use BoomerAMG through PETSc, there is
>>         no proper way of evaluating its performance besides using
>>         PetscLogFlops?
>>>
>>>
>>>             -- 
>>>             Yours sincerely,
>>>
>>>             TAY wee-beng
>>>
>>>
>>>
>>>
>>>         -- 
>>>         What most experimenters take for granted before they begin
>>>         their experiments is infinitely more interesting than any
>>>         results to which their experiments lead.
>>>         -- Norbert Wiener
>>
>
>

-------------- next part --------------
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.3-dev_shared_rel named n12-09 with 48 processors, by wtay Wed Oct  3 14:07:35 2012
Using Petsc Development HG revision: 9883b54053eca13dd473a4711adfd309d1436b6e  HG Date: Sun Sep 30 22:42:36 2012 -0500

                         Max       Max/Min        Avg      Total
Time (sec):           6.005e+03      1.00489   5.990e+03
Objects:              5.680e+02      1.00000   5.680e+02
Flops:                5.608e+11      1.24568   4.595e+11  2.205e+13
Flops/sec:            9.338e+07      1.24320   7.671e+07  3.682e+09
MPI Messages:         1.768e+05      4.44284   1.320e+05  6.337e+06
MPI Message Lengths:  1.524e+10      2.00538   1.128e+05  7.146e+11
MPI Reductions:       1.352e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 6.9025e+02  11.5%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  3.600e+01   0.3%
 1:         poisson: 1.5981e+02   2.7%  5.8320e+11   2.6%  1.749e+05   2.8%  3.211e+03        2.8%  1.435e+03  10.6%
 2:      momentum_x: 3.7257e+00   0.1%  7.1634e+09   0.0%  3.760e+02   0.0%  1.616e+01        0.0%  2.800e+01   0.2%
 3:      momentum_y: 4.5579e+00   0.1%  7.2165e+09   0.0%  3.760e+02   0.0%  1.638e+01        0.0%  2.800e+01   0.2%
 4:      momentum_z: 5.1317e+03  85.7%  2.1457e+13  97.3%  6.161e+06  97.2%  1.095e+05       97.1%  1.199e+04  88.7%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: poisson

MatMult              949 1.0 8.0050e+01 1.4 8.30e+09 1.3 1.1e+05 1.3e+05 0.0e+00  1  1  2  2  0  42 52 65 73  0  3820
MatMultAdd           140 1.0 4.1770e+00 2.0 5.68e+08 1.5 1.3e+04 2.8e+04 0.0e+00  0  0  0  0  0   2  3  7  2  0  4421
MatMultTranspose     140 1.0 9.3398e+00 4.6 5.68e+08 1.5 1.3e+04 2.8e+04 0.0e+00  0  0  0  0  0   3  3  7  2  0  1977
MatSolve              70 0.0 1.3373e-03 0.0 7.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   537
MatLUFactorSym         1 1.0 2.8896e-0419.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         1 1.0 3.9911e-0450.7 1.89e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   473
MatConvert             4 1.0 3.9349e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale              12 1.0 4.3059e-01 1.4 5.29e+07 1.4 5.0e+02 1.2e+05 0.0e+00  0  0  0  0  0   0  0  0  0  0  4320
MatAssemblyBegin      69 1.0 1.4660e+01 3.3 0.00e+00 0.0 1.3e+03 2.4e+04 7.4e+01  0  0  0  0  1   5  0  1  0  5     0
MatAssemblyEnd        69 1.0 3.5805e+00 1.1 0.00e+00 0.0 7.3e+03 1.7e+04 2.0e+02  0  0  0  0  1   2  0  4  1 14     0
MatGetRow        3004250 1.0 1.2984e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatGetRowIJ            1 0.0 3.3140e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 0.0 1.2302e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.2e-02  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen             4 1.0 1.1238e+00 1.1 0.00e+00 0.0 1.5e+04 4.2e+04 3.4e+02  0  0  0  0  2   1  0  8  3 23     0
MatAXPY                4 1.0 1.4109e-01 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult             4 1.0 3.1390e+00 1.0 3.67e+07 1.4 3.3e+03 5.7e+04 9.6e+01  0  0  0  0  1   2  0  2  1  7   424
MatMatMultSym          4 1.0 2.3711e+00 1.1 0.00e+00 0.0 2.8e+03 4.6e+04 8.8e+01  0  0  0  0  1   1  0  2  1  6     0
MatMatMultNum          4 1.0 7.8353e-01 1.0 3.67e+07 1.4 5.0e+02 1.2e+05 8.0e+00  0  0  0  0  0   0  0  0  0  1  1701
MatPtAP                4 1.0 1.0608e+01 1.0 1.89e+09 3.5 6.5e+03 1.2e+05 1.1e+02  0  0  0  0  1   7  5  4  4  8  2808
MatPtAPSymbolic        4 1.0 5.7953e+00 1.0 0.00e+00 0.0 5.9e+03 9.6e+04 1.0e+02  0  0  0  0  1   4  0  3  3  7     0
MatPtAPNumeric         4 1.0 4.8300e+00 1.0 1.89e+09 3.5 6.4e+02 3.1e+05 8.0e+00  0  0  0  0  0   3  5  0  1  1  6167
MatTrnMatMult          4 1.0 2.3590e+01 1.0 1.56e+09 4.5 2.7e+03 8.7e+05 1.2e+02  0  0  0  0  1  15 12  2 11  8  3021
MatGetLocalMat        20 1.0 1.2136e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01  0  0  0  0  0   1  0  0  0  2     0
MatGetBrAoCol         12 1.0 5.6065e-01 2.7 0.00e+00 0.0 3.5e+03 1.8e+05 1.6e+01  0  0  0  0  0   0  0  2  3  1     0
MatGetSymTrans         8 1.0 1.6375e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog        75 1.0 2.5885e+00 2.0 3.30e+08 1.0 0.0e+00 0.0e+00 7.5e+01  0  0  0  0  1   1  3  0  0  5  5937
KSPSetUp              11 1.0 3.6090e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  1     0
KSPSolve               1 1.0 1.5654e+02 1.0 1.46e+10 1.3 1.7e+05 1.2e+05 1.4e+03  3  3  3  3 10  98100100100 99  3726
VecDot                34 1.0 7.1871e+00 6.1 9.03e+07 1.0 0.0e+00 0.0e+00 3.4e+01  0  0  0  0  0   3  1  0  0  2   602
VecDotNorm2           17 1.0 7.2364e+00 5.6 1.81e+08 1.0 0.0e+00 0.0e+00 5.1e+01  0  0  0  0  0   3  1  0  0  4  1196
VecMDot               75 1.0 2.2869e+00 3.8 1.65e+08 1.0 0.0e+00 0.0e+00 7.5e+01  0  0  0  0  1   1  1  0  0  5  3360
VecNorm              132 1.0 5.8965e+0011.2 8.09e+07 1.0 0.0e+00 0.0e+00 1.3e+02  0  0  0  0  1   2  1  0  0  9   649
VecScale             674 1.0 1.7632e+00 3.1 2.27e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  2  0  0  0  5983
VecCopy              181 1.0 6.9148e-01 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               673 1.0 7.6137e-01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1159 1.0 4.6118e+00 3.0 8.44e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  7  0  0  0  8513
VecAYPX             1120 1.0 6.7999e+00 3.0 5.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3  4  0  0  0  3596
VecAXPBYCZ            34 1.0 2.2253e+00 2.7 1.81e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0  3890
VecWAXPY              34 1.0 2.3384e+00 3.0 9.03e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0  1851
VecMAXPY             114 1.0 8.3695e-01 2.4 1.95e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  2  0  0  0 10851
VecAssemblyBegin     110 1.0 4.8131e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+02  0  0  0  0  2   0  0  0  0 23     0
VecAssemblyEnd       110 1.0 3.8481e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult     884 1.0 7.6038e+00 3.3 3.32e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3  3  0  0  0  2030
VecScatterBegin     1345 1.0 8.7802e-01 4.0 0.00e+00 0.0 1.6e+05 1.1e+05 0.0e+00  0  0  2  2  0   0  0 90 83  0     0
VecScatterEnd       1345 1.0 5.5070e+0113.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  16  0  0  0  0     0
VecSetRandom           4 1.0 1.0962e-01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize         114 1.0 4.4160e+0011.7 4.96e+07 1.0 0.0e+00 0.0e+00 1.1e+02  0  0  0  0  1   1  0  0  0  8   522
PCSetUp                2 1.0 5.6255e+01 1.0 3.09e+09 1.2 4.1e+04 1.3e+05 1.2e+03  1  1  1  1  9  35 23 23 27 84  2433
PCSetUpOnBlocks       35 1.0 8.9216e-04 5.9 1.89e+05 0.0 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0   0  0  0  0  0   212
PCApply               35 1.0 8.1400e+01 1.1 1.04e+10 1.3 1.3e+05 1.0e+05 1.2e+02  1  2  2  2  1  49 67 75 66  8  4792
PCGAMGgraph_AGG        4 1.0 6.8337e+00 1.0 3.67e+07 1.4 1.3e+03 7.0e+04 7.6e+01  0  0  0  0  1   4  0  1  0  5   195
PCGAMGcoarse_AGG       4 1.0 2.5930e+01 1.0 1.56e+09 4.5 2.0e+04 1.7e+05 5.3e+02  0  0  0  0  4  16 12 11 17 37  2748
PCGAMGProl_AGG         4 1.0 3.3635e+00 1.0 0.00e+00 0.0 3.4e+03 8.1e+04 1.1e+02  0  0  0  0  1   2  0  2  1  8     0
PCGAMGPOpt_AGG         4 1.0 9.5650e+00 1.0 8.23e+08 1.2 8.3e+03 9.6e+04 2.1e+02  0  0  0  0  2   6  6  5  4 15  3603

--- Event Stage 2: momentum_x

MatMult                2 1.0 1.8518e-01 1.1 3.40e+07 1.1 1.9e+02 4.4e+05 0.0e+00  0  0  0  0  0   5 23 50 80  0  8786
MatSolve               3 1.0 2.5920e-01 1.2 5.03e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   7 34  0  0  0  9297
MatLUFactorNum         1 1.0 2.7748e-01 1.1 2.81e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   7 19  0  0  0  4845
MatILUFactorSym        1 1.0 2.8246e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   7  0  0  0  4     0
MatAssemblyBegin       1 1.0 5.8029e-013185.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   5  0  0  0  7     0
MatAssemblyEnd         1 1.0 4.6551e-01 1.1 0.00e+00 0.0 1.9e+02 1.1e+05 8.0e+00  0  0  0  0  0  12  0 50 20 29     0
MatGetRowIJ            1 1.0 1.3113e-05 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 3.4148e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   1  0  0  0  7     0
KSPSetUp               2 1.0 1.0453e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
KSPSolve               1 1.0 1.2832e+00 1.0 1.50e+08 1.1 1.9e+02 4.4e+05 1.2e+01  0  0  0  0  0  34100 50 80 43  5583
VecDot                 2 1.0 3.1625e-02 2.1 5.31e+06 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   1  4  0  0  7  8051
VecDotNorm2            1 1.0 3.0667e-02 1.9 1.06e+07 1.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   1  7  0  0 11 16605
VecNorm                2 1.0 8.6468e-0212.2 5.31e+06 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   1  4  0  0  7  2945
VecCopy                2 1.0 2.2391e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 7 1.0 4.3424e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPBYCZ             2 1.0 5.0054e-02 1.5 1.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  7  0  0  0 10173
VecWAXPY               2 1.0 3.7987e-02 1.1 5.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  4  0  0  0  6703
VecAssemblyBegin       2 1.0 5.3350e-02206.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   1  0  0  0 21     0
VecAssemblyEnd         2 1.0 2.1935e-05 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        2 1.0 2.8200e-03 2.5 0.00e+00 0.0 1.9e+02 4.4e+05 0.0e+00  0  0  0  0  0   0  0 50 80  0     0
VecScatterEnd          2 1.0 1.5276e-02 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCSetUp                2 1.0 5.8300e-01 1.1 2.81e+07 1.1 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0  15 19  0  0 18  2306
PCSetUpOnBlocks        1 1.0 5.8285e-01 1.1 2.81e+07 1.1 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0  15 19  0  0 11  2307
PCApply                3 1.0 2.7368e-01 1.2 5.03e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   7 34  0  0  0  8805

--- Event Stage 3: momentum_y

MatMult                2 1.0 2.6384e-01 1.6 3.43e+07 1.1 1.9e+02 4.4e+05 0.0e+00  0  0  0  0  0   5 23 50 80  0  6228
MatSolve               3 1.0 4.3936e-01 1.9 5.08e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   7 34  0  0  0  5539
MatLUFactorNum         1 1.0 3.7870e-01 1.3 2.83e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   7 19  0  0  0  3584
MatILUFactorSym        1 1.0 6.7441e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0  10  0  0  0  4     0
MatAssemblyBegin       1 1.0 2.6880e-011700.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   3  0  0  0  7     0
MatAssemblyEnd         1 1.0 4.4579e-01 1.1 0.00e+00 0.0 1.9e+02 1.1e+05 8.0e+00  0  0  0  0  0   9  0 50 20 29     0
MatGetRowIJ            1 1.0 6.9141e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.4430e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   1  0  0  0  7     0
KSPSetUp               2 1.0 1.9449e-01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
KSPSolve               1 1.0 2.1803e+00 1.0 1.51e+08 1.1 1.9e+02 4.4e+05 1.2e+01  0  0  0  0  0  48100 50 80 43  3310
VecDot                 2 1.0 1.6268e-0112.1 5.31e+06 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   2  4  0  0  7  1565
VecDotNorm2            1 1.0 1.5914e-01 4.8 1.06e+07 1.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   3  7  0  0 11  3200
VecNorm                2 1.0 5.7848e-0184.8 5.31e+06 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   8  4  0  0  7   440
VecCopy                2 1.0 4.2936e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 7 1.0 8.5729e-02 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPBYCZ             2 1.0 9.1384e-02 2.7 1.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  7  0  0  0  5572
VecWAXPY               2 1.0 5.4740e-02 1.6 5.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  4  0  0  0  4651
VecAssemblyBegin       2 1.0 5.1133e-02181.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   1  0  0  0 21     0
VecAssemblyEnd         2 1.0 2.2888e-05 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        2 1.0 4.3221e-03 3.7 0.00e+00 0.0 1.9e+02 4.4e+05 0.0e+00  0  0  0  0  0   0  0 50 80  0     0
VecScatterEnd          2 1.0 9.5706e-0248.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
PCSetUp                2 1.0 1.1065e+00 1.7 2.83e+07 1.1 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0  18 19  0  0 18  1227
PCSetUpOnBlocks        1 1.0 1.1063e+00 1.7 2.83e+07 1.1 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0  18 19  0  0 11  1227
PCApply                3 1.0 4.5571e-01 1.8 5.08e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   7 34  0  0  0  5340

--- Event Stage 4: momentum_z

MatMult            41867 1.0 3.5526e+03 1.3 3.68e+11 1.3 5.0e+06 1.3e+05 0.0e+00 51 62 79 93  0  60 64 81 95  0  3841
MatMultAdd          6404 1.0 2.5862e+02 2.7 2.60e+10 1.5 5.9e+05 2.8e+04 0.0e+00  2  4  9  2  0   3  4 10  2  0  3266
MatMultTranspose    6404 1.0 4.1748e+02 5.0 2.60e+10 1.5 5.9e+05 2.8e+04 0.0e+00  4  4  9  2  0   4  4 10  2  0  2023
MatSolve            3637 8.4 6.2673e+01 1.9 7.38e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0  5614
MatLUFactorNum       145 1.0 6.3762e+01 1.6 4.10e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0  3078
MatILUFactorSym        1 1.0 7.4252e-01 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin     145 1.0 4.6714e+01 8.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.9e+02  0  0  0  0  2   0  0  0  0  2     0
MatAssemblyEnd       145 1.0 3.1650e+01 2.4 0.00e+00 0.0 1.9e+02 1.1e+05 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 1.1921e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.4477e-01 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog      1601 1.0 1.4767e-01 2.1 5.36e+05 0.0 0.0e+00 0.0e+00 1.6e+03  0  0  0  0 12   0  0  0  0 13     4
KSPSetUp             290 1.0 3.6730e-01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve             194 1.0 4.8040e+03 1.0 5.46e+11 1.2 6.2e+06 1.1e+05 1.1e+04 80 97 97 97 78  94100100100 88  4466
VecDot              1842 1.0 3.9224e+02 8.1 4.89e+09 1.0 0.0e+00 0.0e+00 1.8e+03  3  1  0  0 14   4  1  0  0 15   598
VecDotNorm2          921 1.0 3.8957e+02 6.6 9.79e+09 1.0 0.0e+00 0.0e+00 2.8e+03  3  2  0  0 20   4  2  0  0 23  1204
VecMDot             1601 1.0 1.2703e-01 2.4 2.67e+05 0.0 0.0e+00 0.0e+00 1.6e+03  0  0  0  0 12   0  0  0  0 13     2
VecNorm             4317 1.0 2.5639e+0218.7 2.96e+09 1.0 0.0e+00 0.0e+00 4.3e+03  2  1  0  0 32   2  1  0  0 36   554
VecScale           28818 1.0 8.1507e+01 3.9 9.62e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0  5489
VecCopy             8393 1.0 3.4342e+01 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             28235 1.0 3.8371e+01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY            52833 1.0 2.2300e+02 3.3 3.85e+10 1.0 0.0e+00 0.0e+00 0.0e+00  2  8  0  0  0   3  8  0  0  0  8025
VecAYPX            51232 1.0 3.0984e+02 3.0 2.40e+10 1.0 0.0e+00 0.0e+00 0.0e+00  3  5  0  0  0   4  5  0  0  0  3610
VecAXPBYCZ          1842 1.0 1.1255e+02 2.3 9.79e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0  4167
VecWAXPY            1842 1.0 1.1471e+02 2.7 4.89e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0  2044
VecMAXPY            3202 1.0 1.0685e-02 5.7 5.38e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    50
VecAssemblyBegin     388 1.0 3.0027e+0160.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03  0  0  0  0  9   0  0  0  0 10     0
VecAssemblyEnd       388 1.0 3.5229e-03 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult   38424 1.0 3.4261e+02 3.4 1.44e+10 1.0 0.0e+00 0.0e+00 0.0e+00  3  3  0  0  0   4  3  0  0  0  1959
VecScatterBegin    54675 1.0 3.8022e+01 4.2 0.00e+00 0.0 6.2e+06 1.1e+05 0.0e+00  0  0 97 97  0   0  0100100  0     0
VecScatterEnd      54675 1.0 2.5222e+0320.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 20  0  0  0  0  23  0  0  0  0     0
VecNormalize        3202 1.0 1.6108e+021783.0 8.07e+05 0.0 0.0e+00 0.0e+00 3.2e+03  1  0  0  0 24   1  0  0  0 27     0
PCSetUp              290 1.0 6.4608e+01 1.6 4.10e+09 1.1 0.0e+00 0.0e+00 5.0e+00  1  1  0  0  0   1  1  0  0  0  3038
PCSetUpOnBlocks     1746 1.0 6.4608e+01 1.6 4.10e+09 1.1 0.0e+00 0.0e+00 3.0e+00  1  1  0  0  0   1  1  0  0  0  3038
PCApply             2036 1.0 3.8864e+03 1.2 4.81e+11 1.3 6.0e+06 1.0e+05 4.8e+03 61 83 94 86 36  71 85 97 89 40  4682
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix    12             56   1787149752     0
       Krylov Solver     4             13        31728     0
              Vector     8             85    405124904     0
      Vector Scatter     0             12        12720     0
           Index Set     0             16     15953504     0
      Preconditioner     0             13        12540     0
              Viewer     1              0            0     0

--- Event Stage 1: poisson

              Matrix   117             76   1166561340     0
      Matrix Coarsen     4              4         2480     0
       Krylov Solver    10              4       120512     0
              Vector   225            175    331789264     0
      Vector Scatter    31             22        23320     0
           Index Set    81             74      1327712     0
      Preconditioner    11              4         3456     0
         PetscRandom     4              4         2464     0

--- Event Stage 2: momentum_x

              Matrix     1              0            0     0
       Krylov Solver     1              0            0     0
              Vector    10              1         1504     0
      Vector Scatter     1              0            0     0
           Index Set     5              2       219352     0
      Preconditioner     2              0            0     0

--- Event Stage 3: momentum_y

              Matrix     1              0            0     0
       Krylov Solver     1              0            0     0
              Vector    10              1         1504     0
      Vector Scatter     1              0            0     0
           Index Set     5              2       222304     0
      Preconditioner     2              0            0     0

--- Event Stage 4: momentum_z

              Matrix     1              0            0     0
       Krylov Solver     1              0            0     0
              Vector    10              1         1504     0
      Vector Scatter     1              0            0     0
           Index Set     5              2       222296     0
      Preconditioner     2              0            0     0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 3.71933e-05
Average time for zero size MPI_Send(): 1.50005e-05
#PETSc Option Table entries:
-log_summary
-poisson_pc_gamg_agg_nsmooths 1
-poisson_pc_type gamg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Mon Oct  1 11:36:09 2012
Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/ --with-blas-lapack-dir=/opt/intelcpro-11.1.059/mkl/lib/em64t/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.3-dev_shared_rel --known-mpi-shared=1 --with-shared-libraries
-----------------------------------------
Libraries compiled on Mon Oct  1 11:36:09 2012 on hpc12
Machine characteristics: Linux-2.6.32-279.1.1.el6.x86_64-x86_64-with-centos-6.3-Final
Using PETSc directory: /home/wtay/Codes/petsc-dev
Using PETSc arch: petsc-3.3-dev_shared_rel
-----------------------------------------

