# Norm computation

Barry Smith bsmith at mcs.anl.gov
Thu Dec 13 15:58:10 CST 2007

```    The time for VecNorm and VecDot reflects two factors:
1) the time to perform the local floating point operations and
2) the time a process waits until all the other processes are ready
to exchange data.

Factor 2) depends on whatever calculations are being done BEFORE the
norm or dot and is largely determined by the load balancing of the work
there. The 4th column of numbers below (the ratio of the maximum to the
minimum time over all processes) is a measure of the load balance up to
that point: for VecDot it is 1.9, which means the process that spent the
least time in the routine was there 1/1.9 times as long as the process
that spent the most (mostly waiting). For VecNorm it is 4.0! This means
some processes wait in VecNorm for a long time before the slowest
process gets to that routine and does its communication.

Barry

On Dec 13, 2007, at 2:42 PM, John R. Wicks wrote:

> I recently solved a linear system of very high dimension distributed
> over 32 Mac XServe's. I was rather surprised by the performance
> statistics it reported, given below. In particular, how can VecNorm be
> so much more expensive than VecDot, since VecNorm should simply involve
> taking a single square root of a dot product?
>
> --- Event Stage 2: LinearSolve
>
> MatMult               19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00  2 17 49 49  0  16 17 50 50  0  2214
> MatMultTranspose      19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00  2 18 49 49  0  11 18 50 50  0  2601
> MatSolve              20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00  1 18  0  0  0  10 18  0  0  0  3200
> MatSolveTranspos      20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00  2 18  0  0  0  11 18  0  0  0  2976
> MatLUFactorNum         1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00  1 14  0  0  0  10 14  0  0  0  1635
> MatILUFactorSym        1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  1  0  0  0  1   4  0  0  0  2     0
> MatGetRowIJ            1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  3   0  0  0  0  3     0
> VecDot                38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01  1  4  0  0 49  10  4  0  0 62   710
> VecNorm               20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01  4  2  0  0 26  25  2  0  0 33   134
> VecCopy                4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet                79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecAXPY               57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   2  6  0  0  0  4118
> VecAYPX               36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00  0  4  0  0  0   1  4  0  0  0  5430
> VecScatterBegin       38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00  0  0 97 98  0   1  0 100 100  0     0
> VecScatterEnd         38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0  12  0  0  0  0     0
> KSPSetup               2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> KSPSolve              20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00  1 18  0  0  0  10 18  0  0  0  3144
> PCSetUp                2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00  2 14  0  0  4  13 14  0  0  5  1265
> PCApply               40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00  5 49  0  0  4  35 49  0  0  5  2400
>

```
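
The waiting effect described in the reply can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how PETSc or MPI measures anything: every process is assumed to do the same small amount of local arithmetic, then block until the last process has arrived, then pay a fixed communication cost. The function names and the arrival times are hypothetical and chosen only for illustration.

```python
def time_in_reduction(arrival_times, local_cost, comm_cost):
    """Return the time each process spends inside a blocking reduction.

    Toy model (an assumption, not PETSc's implementation): the reduction
    completes once the last process has arrived, done its local work,
    and the communication has finished. Processes that arrive early
    spend the difference waiting inside the routine.
    """
    finish = max(arrival_times) + local_cost + comm_cost
    return [finish - t for t in arrival_times]


def max_min_ratio(times):
    """Ratio of the longest to the shortest time, like the log's ratio column."""
    return max(times) / min(times)


# Hypothetical arrival times in seconds; the last process is badly delayed
# by unbalanced work done BEFORE the reduction.
arrivals = [1.0, 1.5, 2.0, 4.0]
spent = time_in_reduction(arrivals, local_cost=0.5, comm_cost=0.5)

print(spent)                 # time inside the routine, per process
print(max_min_ratio(spent))  # large ratio despite identical local work
```

In this model the process that arrives first is charged the most time even though it does no more arithmetic than the others, which is why a cheap operation such as a norm can appear expensive in the timing column when the preceding work is poorly balanced.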