[petsc-users] Scaling/Preconditioners for Poisson equation
Filippo Leonardi
filippo.leonardi at sam.math.ethz.ch
Wed Oct 1 09:28:00 CDT 2014
On Wednesday 01 October 2014 07:23:03 Jed Brown wrote:
> Filippo Leonardi <filippo.leonardi at sam.math.ethz.ch> writes:
> > I am actually having a hard time figuring out where I am spending my time.
> >
> > Reading the report, I am spending time in KSPSolve and PCApply (e+02).
> > Since the number of those operations is well under control, I guess some
> > communication is the bottleneck.
> >
> > The lines:
> > VecScatterBegin 4097 1.0 2.5168e+01 3.2 0.00e+00 0.0 2.9e+09 3.7e+01
> > 0.0e+00 3 0 87 39 0 10 0100100 0 0
> > VecScatterEnd 4097 1.0 1.7736e+02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00 25 0 0 0 0 88 0 0 0 0 0
> > are probably what is slowing down the solution.
>
> How large is your problem?
512^3 cells with 1 dof each, on 4096 processors.
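(For reference, assuming a roughly uniform decomposition, that is 512^3 / 4096 = 32^3 = 32768 unknowns per rank, i.e. a 32^3 local subdomain, so the ratio of halo exchange and global reductions to local work is already fairly large.)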
>
> > Also, times do not add up properly, especially in KSPSolve.
>
> Timings are inclusive, not exclusive.
>
> I don't know what is going on with having the same event appear many
> times within the stage, but I remember fixing an issue that might have
> caused that years ago, and I'd appreciate it if you would upgrade to the
> current version of PETSc.
I'll see if I can get a newer version on the cluster I'm running on. On my
laptop, using PETSc 3.5, I get something like:
--- Event Stage 5: KspStage
VecMDot 1953 1.0 3.0402e-01 1.3 3.91e+07 1.0 0.0e+00 0.0e+00
2.0e+03 1 1 0 0 1 2 2 0 0 2 514
VecNorm 5445 1.0 3.6542e-01 1.1 4.46e+07 1.0 0.0e+00 0.0e+00
5.4e+03 1 1 0 0 3 3 2 0 0 4 488
VecScale 2465 1.0 1.8466e-02 1.1 1.01e+07 1.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 2187
VecCopy 2979 1.0 3.5844e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 9134 1.0 2.7974e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 2465 1.0 3.2354e-02 1.9 2.02e+07 1.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 1 0 0 0 2497
VecAYPX 2466 1.0 2.7912e-02 1.5 1.01e+07 1.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 1448
VecMAXPY 4418 1.0 4.7874e-02 1.1 9.42e+07 1.0 0.0e+00 0.0e+00
0.0e+00 0 1 0 0 0 0 4 0 0 0 7871
VecScatterBegin 6471 1.0 3.4048e-02 1.2 0.00e+00 0.0 4.8e+04 7.9e+02
0.0e+00 0 0 4 5 0 0 0 20 25 0 0
VecScatterEnd 6471 1.0 1.3429e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecNormalize 2466 1.0 2.5448e-01 1.0 3.03e+07 1.0 0.0e+00 0.0e+00
2.5e+03 1 0 0 0 1 2 1 0 0 2 476
MatMult 4419 1.0 3.9537e-01 1.2 1.63e+08 1.0 3.5e+04 1.0e+03
0.0e+00 1 2 3 5 0 3 7 15 24 0 1648
MatMultTranspose 1026 1.0 4.6899e-02 1.5 1.18e+07 1.0 1.2e+04 1.3e+02
0.0e+00 0 0 1 0 0 0 1 5 1 0 1008
MatLUFactorSym 513 1.0 7.8421e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 3 0 0 0 0 6 0 0 0 0 0
MatLUFactorNum 513 1.0 2.2127e+00 1.0 1.81e+09 1.0 0.0e+00 0.0e+00
0.0e+00 7 23 0 0 0 16 78 0 0 0 3269
MatConvert 513 1.0 4.4040e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 4617 1.0 4.1133e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
4.1e+03 1 0 0 0 2 3 0 0 0 3 0
MatAssemblyEnd 4617 1.0 7.7171e-01 1.0 0.00e+00 0.0 4.9e+04 5.2e+01
1.6e+04 3 0 4 0 9 6 0 21 2 13 0
MatGetRowIJ 513 1.0 8.5605e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetSubMatrice 513 1.0 3.5439e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
2.1e+03 1 0 0 0 1 3 0 0 0 2 0
MatGetOrdering 513 1.0 4.3043e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 1 0 0 0 0 3 0 0 0 0 0
MatPtAP 1026 1.0 3.1987e+00 1.0 1.29e+08 1.0 1.0e+05 4.1e+02
1.7e+04 11 2 8 5 10 24 6 44 28 14 162
MatPtAPSymbolic 1026 1.0 1.9393e+00 1.0 0.00e+00 0.0 6.8e+04 4.6e+02
7.2e+03 7 0 5 4 4 15 0 28 20 6 0
MatPtAPNumeric 1026 1.0 1.2644e+00 1.0 1.29e+08 1.0 3.7e+04 3.1e+02
1.0e+04 4 2 3 1 6 10 6 15 7 8 409
MatGetRedundant 513 1.0 4.0419e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
2.1e+03 1 0 0 0 1 3 0 0 0 2 0
MatMPIConcateSeq 513 1.0 4.4424e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetLocalMat 1026 1.0 1.5555e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatGetBrAoCol 1026 1.0 1.7637e-01 1.3 0.00e+00 0.0 3.1e+04 8.0e+02
0.0e+00 1 0 2 3 0 1 0 13 16 0 0
MatGetSymTrans 2052 1.0 4.6286e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 1953 1.0 3.2750e-01 1.2 7.82e+07 1.0 0.0e+00 0.0e+00
2.0e+03 1 1 0 0 1 2 3 0 0 2 955
KSPSetUp 2565 1.0 8.8727e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
1.1e+04 3 0 0 0 6 7 0 0 0 9 0
Warning -- total time of even greater than time of entire stage -- something
is wrong with the timer
KSPSolve 513 1.0 2.7668e+01 1.0 7.65e+09 1.0 1.3e+06 5.7e+02
1.8e+05 94 98 95 90 99 209328528467143 1106
PCSetUp 513 1.0 1.0804e+01 1.0 1.95e+09 1.0 2.0e+05 5.8e+02
9.0e+04 36 25 15 15 51 81 84 85 76 73 722
Warning -- total time of even greater than time of entire stage -- something
is wrong with the timer
PCApply 2466 1.0 1.4483e+01 1.0 5.32e+09 1.0 1.0e+06 5.5e+02
5.2e+04 49 68 77 71 30 109228428367 43 1469
MGSetup Level 0 513 1.0 4.3083e+00 1.0 1.81e+09 1.0 2.1e+04 1.3e+03
1.0e+04 14 23 2 3 6 32 78 9 18 8 1679
MGSetup Level 1 513 1.0 3.4038e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
3.1e+03 1 0 0 0 2 2 0 0 0 3 0
MGSetup Level 2 513 1.0 5.1557e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
3.1e+03 2 0 0 0 2 4 0 0 0 3 0
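
For context, the stage above ("KspStage") is a user-defined logging stage; a
minimal sketch of how such a stage is typically set up around the solve (the
function name and stage label here are placeholders, and in a real code the
stage would be registered once and reused):

#include <petscksp.h>

/* Minimal sketch: wrap the solve in a user-defined logging stage so that
 * -log_summary reports it separately from the rest of the run. */
PetscErrorCode SolveWithStage(KSP ksp, Vec b, Vec x)
{
  PetscLogStage  stage;
  PetscErrorCode ierr;

  ierr = PetscLogStageRegister("KspStage", &stage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);   /* charged to "KspStage" */
  ierr = PetscLogStagePop();CHKERRQ(ierr);
  return 0;
}
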
>
> How large is your problem size and how many processors are you running on?
>
> > PS: until now I was writing output in VTK. I guess it is better to
> > output in PETSc binary?
>
> Yes. You can use the binary-appended VTK viewer (please upgrade to the
> current version of PETSc) or the PETSc binary viewer.
>
> > Is it better to output from PETSC_COMM_SELF (i.e. each processor
> > individually)?
>
> No, use collective IO.
Thanks.
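
For what it's worth, here is a minimal sketch of the two output paths
mentioned above, assuming x is the solution Vec and the file names are
placeholders; the .vts extension should select the binary-appended
structured-grid VTK format, while the binary viewer writes the native PETSc
format that can be read back with VecLoad(). Both viewers are opened on
PETSC_COMM_WORLD, so the I/O is collective rather than per-process:

#include <petscvec.h>
#include <petscviewer.h>

/* Hypothetical helper: write a solution Vec as binary-appended VTK and as
 * native PETSc binary (file names are placeholders). */
PetscErrorCode WriteSolution(Vec x)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  /* binary-appended VTK; the format is inferred from the .vts extension */
  ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, "solution.vts", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = VecView(x, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  /* native PETSc binary; read back later with VecLoad() */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "solution.dat", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = VecView(x, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  return 0;
}
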