[petsc-users] A similar question of "Enquiry regarding log summary results"

Jed Brown jedbrown at mcs.anl.gov
Fri Oct 5 12:00:18 CDT 2012


On Fri, Oct 5, 2012 at 11:51 AM, Zhenglun (Alan) Wei <zhenglun.wei at gmail.com
> wrote:

> Dear folks,
>     I hope you're having a nice day.
>     I've been reading the thread of "Enquiry regarding log summary
> results" and tried the similar things with -log_summary with my code.
> However, my output seems totally different from Dr. TAY Wee-beng's. I
> attached it here. My code adopts the /src/ksp/ksp/example/tutorial/**ex45.c
> from Dr. Yang to solve the Poisson equation (twice). The "Event Stage 3"
> and "Event Stage 5" are just for "KSPSolve", for example:
>
>     ierr = PetscLogStagePush(stages[2]);  CHKERRQ(ierr); // Start Pressure
> Poisson Equation Solve
>     ierr = KSPSolve(ksp,PETSC_NULL,PETSC_**NULL);CHKERRQ(ierr);
>     ierr = PetscLogStagePop();  CHKERRQ(ierr); // Finish Pressure Poisson
> Equation  Solve
>
>      Therefore, solving the equation takes most of my times. In the
> procedure of solving the equation, VecMDot, MatMult, MatMultAdd,
> MatMultTranspose takes lots of time.
>
> ------------------------------**------------------------------**
> ------------------------------**------------------------------
> Event                Count      Time (sec) Flops
>   --- Global ---  --- Stage --- Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> ------------------------------**------------------------------**
> ------------------------------**------------------------------
>
> --- Event Stage 3: Pressure Solve
>
> KSPGMRESOrthog      8643 1.0 2.0476e+0313.1 5.84e+09 1.0 0.0e+00 0.0e+00
> 8.6e+03 17 13  0  0 17  39 29  0  0 36   270
> KSPSetUp               5 1.0 1.0729e-05 3.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               5 1.0 4.0481e+03 1.0 2.05e+10 1.0 4.0e+07 1.2e+03
> 2.4e+04 44 46 47 44 46 100100100100100   477
> VecMDot             8643 1.0 2.0424e+0313.5 2.92e+09 1.0 0.0e+00 0.0e+00
> 8.6e+03 17  7  0  0 17  39 14  0  0 36   135
> VecNorm            14503 1.0 8.7184e+02 6.4 8.47e+08 1.0 0.0e+00 0.0e+00
> 1.5e+04  7  2  0  0 28  15  4  0  0 61    92
> VecScale           14503 1.0 3.7836e+00 4.7 4.23e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0   0  2  0  0  0 10598
> VecCopy             5767 1.0 2.0257e+00 3.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet             34670 1.0 5.5035e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY             8741 1.0 2.9888e+00 4.5 5.10e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0   0  3  0  0  0 16172
> VecAYPX             2881 1.0 1.7167e+00 7.2 8.41e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0  4640
> VecWAXPY              93 1.0 5.3297e-02 4.8 2.71e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0  4824
> VecMAXPY           14503 1.0 1.5191e+01 3.1 3.42e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  8  0  0  0   0 17  0  0  0 21342
> VecAssemblyBegin       5 1.0 3.7097e+0012.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.5e+01  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd         5 1.0 2.9087e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecScatterBegin    26022 1.0 1.1726e+01 4.2 0.00e+00 0.0 4.0e+07 1.2e+03
> 0.0e+00  0  0 47 44  0   0  0100100  0     0
> VecScatterEnd      26022 1.0 3.5268e+03 3.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 19  0  0  0  0  43  0  0  0  0     0
> VecNormalize       11524 1.0 8.4214e+02 7.2 1.01e+09 1.0 0.0e+00 0.0e+00
> 1.2e+04  6  2  0  0 22  15  5  0  0 49   114
> MatMult            14498 1.0 3.0545e+0382.7 5.50e+09 1.0 6.4e+06 7.6e+03
> 0.0e+00  9 12  8 44  0  21 27 16 98  0   169
>

This load imbalance is totally unreasonable. Either you have an atrocious
partition or there is contention on the machine (over-subscribed or other
things running). The load balance for MatMult() should be close to 1.0 and
the MFlop/s (last column) should be at least 10000 or so.


> MatMultAdd          2881 1.0 1.8177e+0218.6 1.09e+09 1.1 3.6e+06 1.5e+01
> 0.0e+00  1  2  4  0  0   2  5  9  0  0   560
> MatMultTranspose    2881 1.0 7.1296e+0274.9 1.09e+09 1.1 3.6e+06 1.5e+01
> 0.0e+00  5  2  4  0  0  11  5  9  0  0   143
> MatSolve           14405 1.0 3.9663e+01 3.3 4.56e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 10  0  0  0   1 22  0  0  0 10890
> MatLUFactorNum         5 1.0 3.9389e-02 1.6 3.05e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0  7322
> MatILUFactorSym        5 1.0 4.2221e-02 3.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 5.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRowIJ            5 1.0 9.7752e-06 5.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         5 1.0 2.9373e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
> PCSetUp                5 1.0 8.3110e-02 2.0 3.05e+06 1.0 0.0e+00 0.0e+00
> 1.5e+01  0  0  0  0  0   0  0  0  0  0  3470
> PCSetUpOnBlocks     5762 1.0 9.9790e-02 2.0 3.05e+06 1.0 0.0e+00 0.0e+00
> 1.5e+01  0  0  0  0  0   0  0  0  0  0  2890
> PCApply             2881 1.0 3.3151e+03 1.0 1.37e+10 1.0 3.8e+07 1.0e+03
> 1.7e+04 36 31 46 36 33  82 67 97 80 73   390
>
>        BTW, how to really calculate the time that a function costs? For
> example, it shows that "VecMDot, 2.0424e+0313.5". Would that be
> 2.0424*10^313.5 sec?
>

No, the imbalance is just spilling over. The amount of time is 2042 seconds.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20121005/e53a2389/attachment.html>


More information about the petsc-users mailing list