Norm computation
John R. Wicks
jwicks at cs.brown.edu
Thu Dec 13 14:42:14 CST 2007
I recently solved a linear system of very high dimension
distributed over 32 Mac Xserves. I was rather surprised by the performance
statistics the run reported, given below. In particular, how can VecNorm be
so much more expensive than VecDot, when VecNorm should simply involve
taking a single square root of a dot product?
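
The expectation behind the question can be sketched in a few lines. This is a plain-Python illustration, not PETSc code: the helper names (`split`, `local_dot`, `distributed_dot`, `distributed_norm2`) are hypothetical, and the final sum over partial results merely stands in for the global reduction a parallel library would perform.

```python
import math

def split(vec, nranks):
    """Cut a vector into contiguous chunks, one per (simulated) process."""
    chunk = -(-len(vec) // nranks)  # ceiling division
    return [vec[i:i + chunk] for i in range(0, len(vec), chunk)]

def local_dot(x_part, y_part):
    """Work done independently on each process."""
    return sum(a * b for a, b in zip(x_part, y_part))

def distributed_dot(x_parts, y_parts):
    """Local dot products followed by one global sum (the reduction)."""
    partials = [local_dot(xp, yp) for xp, yp in zip(x_parts, y_parts)]
    return math.fsum(partials)

def distributed_norm2(x_parts):
    """2-norm as the square root of the dot product of x with itself."""
    return math.sqrt(distributed_dot(x_parts, x_parts))

x = [float(i % 7) - 3.0 for i in range(1000)]
parts = split(x, 32)  # 32 simulated processes
norm = distributed_norm2(parts)
print(norm)
```

In this model the norm costs one local pass, one reduction, and one square root, so its arithmetic is no heavier than that of a dot product.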
--- Event Stage 2: LinearSolve
MatMult 19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00 2 17 49 49 0 16 17 50 50 0 2214
MatMultTranspose 19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00 2 18 49 49 0 11 18 50 50 0 2601
MatSolve 20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3200
MatSolveTranspos 20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00 2 18 0 0 0 11 18 0 0 0 2976
MatLUFactorNum 1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 10 14 0 0 0 1635
MatILUFactorSym 1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 4 0 0 0 2 0
MatGetRowIJ 1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 3 0 0 0 0 3 0
VecDot 38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01 1 4 0 0 49 10 4 0 0 62 710
VecNorm 20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01 4 2 0 0 26 25 2 0 0 33 134
VecCopy 4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecAXPY 57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 2 6 0 0 0 4118
VecAYPX 36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 1 4 0 0 0 5430
VecScatterBegin 38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00 0 0 97 98 0 1 0 100 100 0 0
VecScatterEnd 38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0
KSPSetup 2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
KSPSolve 20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3144
PCSetUp 2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00 2 14 0 0 4 13 14 0 0 5 1265
PCApply 40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00 5 49 0 0 4 35 49 0 0 5 2400
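
A rough per-call comparison can be read off the VecDot and VecNorm rows above. The sketch below assumes that the second field of each row is the call count, the fourth is the maximum time in seconds over all processes, and the tenth is the number of global reductions, which is how PETSc's log summary is usually laid out. The dictionary keys are names chosen here for illustration only.

```python
# Values copied from the VecDot and VecNorm rows of the log above.
# Field meanings (call count, max time, reductions) are an assumption
# about the column layout, since the log header is not shown.
events = {
    "VecDot":  {"calls": 38, "max_time_s": 1.2653e+01, "reductions": 3.8e+01},
    "VecNorm": {"calls": 20, "max_time_s": 3.5348e+01, "reductions": 2.0e+01},
}

# Average time and number of reductions per call for each event.
per_call_time = {
    name: row["max_time_s"] / row["calls"] for name, row in events.items()
}
reductions_per_call = {
    name: row["reductions"] / row["calls"] for name, row in events.items()
}

for name in events:
    print(f"{name}: {per_call_time[name]:.4f} s per call, "
          f"{reductions_per_call[name]:.0f} reduction(s) per call")

# How many times slower a VecNorm call is than a VecDot call, on average.
slowdown = per_call_time["VecNorm"] / per_call_time["VecDot"]
print(f"VecNorm / VecDot per-call time: {slowdown:.1f}x")
```

Under those assumptions both events perform one reduction per call, yet a VecNorm call takes several times longer on average than a VecDot call, which is the discrepancy being asked about.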