Norm computation
Barry Smith
bsmith at mcs.anl.gov
Thu Dec 13 15:58:10 CST 2007
The time for VecNorm and VecDot reflects two factors:
1) the time to perform the local floating point operations and
2) the time a process waits until all the other processes are ready
to exchange data.
Factor 2) depends on whatever calculations are being done BEFORE the
norm or dot and is largely related to the load balancing of the work
there. The 4th column of numbers below (the ratio of the longest to the
shortest time any process spent in the routine) is a measure of the load
balance up to that point: for VecDot it is 1.9, which means the process
that spent the least time in the routine was there only 1/1.9 times as
long as the process that spent the most time there (mostly waiting). For
VecNorm it is 4! This means some processes are waiting in VecNorm
for a long time before the slowest one gets to that routine and does its
communications.
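
As a rough illustration of how waiting alone can produce such a ratio, here is a minimal sketch of a toy timing model. It is not PETSc code: the function names, the two-process setup, and all the numbers are made up for the example, and it assumes every process does its local work on arrival and then blocks until the last process is ready.

```python
def time_in_reduction(arrival_times, local_work, comm_cost):
    """Toy model of a collective such as a norm or dot product.

    arrival_times[i]: when process i enters the routine (hypothetical values)
    local_work[i]:    seconds of local floating point work on process i
    comm_cost:        seconds for the exchange once everyone is ready
    Returns the time each process spends inside the routine.
    """
    ready = [a + w for a, w in zip(arrival_times, local_work)]
    # Nobody can finish before the last process is ready to communicate.
    finish = max(ready) + comm_cost
    return [finish - a for a in arrival_times]


def load_balance_ratio(times):
    """Longest time in the routine divided by the shortest."""
    return max(times) / min(times)


# Process 0 arrives at t = 0, process 1 is delayed by earlier work until t = 3.
times = time_in_reduction([0.0, 3.0], [0.5, 0.5], 0.5)
print(times)                      # time spent in the routine per process
print(load_balance_ratio(times))  # ratio of longest to shortest
```

In this toy case both processes do identical local work, yet the early arrival spends four times as long in the routine as the late one, purely because it waits.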
Barry
On Dec 13, 2007, at 2:42 PM, John R. Wicks wrote:
> I recently solved a linear system of very high dimension
> distributed over 32 Mac XServes. I was rather surprised by the
> performance statistics it reported, given below. In particular, how
> can VecNorm be so much more expensive than VecDot, since VecNorm
> should simply involve taking a single square root of a dot product?
>
> --- Event Stage 2: LinearSolve
>
> MatMult           19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00   2  17  49  49   0  16  17  50  50   0  2214
> MatMultTranspose  19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00   2  18  49  49   0  11  18  50  50   0  2601
> MatSolve          20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00   1  18   0   0   0  10  18   0   0   0  3200
> MatSolveTranspos  20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00   2  18   0   0   0  11  18   0   0   0  2976
> MatLUFactorNum     1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00   1  14   0   0   0  10  14   0   0   0  1635
> MatILUFactorSym    1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00   1   0   0   0   1   4   0   0   0   2     0
> MatGetRowIJ        1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
> MatGetOrdering     1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00   0   0   0   0   3   0   0   0   0   3     0
> VecDot            38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01   1   4   0   0  49  10   4   0   0  62   710
> VecNorm           20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01   4   2   0   0  26  25   2   0   0  33   134
> VecCopy            4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
> VecSet            79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   1   0   0   0   0     0
> VecAXPY           57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00   0   6   0   0   0   2   6   0   0   0  4118
> VecAYPX           36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00   0   4   0   0   0   1   4   0   0   0  5430
> VecScatterBegin   38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00   0   0  97  98   0   1   0 100 100   0     0
> VecScatterEnd     38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   2   0   0   0   0  12   0   0   0   0     0
> KSPSetup           2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   1   0   0   0   0     0
> KSPSolve          20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00   1  18   0   0   0  10  18   0   0   0  3144
> PCSetUp            2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00   2  14   0   0   4  13  14   0   0   5  1265
> PCApply           40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00   5  49   0   0   4  35  49   0   0   5  2400
>