[petsc-users] Sometimes it's better NOT to parallelize ??? (SLEPc question?)
Jose E. Roman
jroman at dsic.upv.es
Fri Jul 29 03:19:31 CDT 2011
As stated in the documentation, the 'lapack' solver is a wrapper to LAPACK functions. LAPACK is a sequential library, so do not expect any speedup from running on more than one process. The LAPACK wrappers are provided for convenience, mainly as a debugging tool for small problems. SLEPc is intended for large-scale sparse eigenproblems. We do not provide parallel eigensolvers for dense matrices.
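
To put the timings quoted below in perspective, here is a minimal sketch (plain Python, not part of PETSc or SLEPc) that computes the speedup and parallel efficiency from the two reported wall times and cross-checks the average message length. The variable names are illustrative only; the numbers are copied from the quoted log summaries.

```python
# Figures copied from the two log summaries quoted in this thread.
t_1proc = 1.312e+03   # wall time in seconds, 1 MPI process
t_2proc = 1.821e+03   # wall time in seconds, 2 MPI processes
nprocs = 2

# Speedup below 1 means the 2-process run is slower than the sequential one.
speedup = t_1proc / t_2proc
efficiency = speedup / nprocs

# Cross-check of the reported average message length (Total length / Total count).
total_msgs = 1.602e+04
total_msg_len = 4.485e+08
avg_msg_len = total_msg_len / total_msgs

print(f"speedup    = {speedup:.2f}")
print(f"efficiency = {efficiency:.2f}")
print(f"average message length = {avg_msg_len:.3e}")
```

A speedup of roughly 0.72 (efficiency about 0.36) is what one would expect when the solver itself is sequential and the second process only adds communication.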
Jose
On 29/07/2011, at 09:56, John Chludzinski wrote:
> These are the resulting stats from running a 4002x4002 generalized eigenvalue problem (dense matrices) on 2 MPI processes. Note the amount of message traffic:
>
>                          Max         Max/Min      Avg        Total
> MPI Messages:         8.011e+03      1.00000   8.011e+03  1.602e+04
> MPI Message Lengths:  2.242e+08      1.00000   2.799e+04  4.485e+08
>
> Total # of messages: 1.602e+04 with an average message length: 2.799e+04.
>
>
> With the number of MPI processes set to 1, you get (not surprisingly):
>
>                          Max         Max/Min      Avg        Total
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
>
> In the end, the time required to solve my 4002x4002 eigenvalue problem was 1.821e+03 sec with 2 MPI processes vs. 1.312e+03 sec with 1 MPI process.
>
> Am I reading this correctly?
>
>
> ---John
>
>
> Complete stats for 2 MPI process run:
>
> Using Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010
>
>                          Max         Max/Min      Avg        Total
> Time (sec):           1.821e+03      1.00064   1.820e+03
> Objects:              2.005e+04      1.00000   2.005e+04
> Flops:                1.282e+11      1.00000   1.282e+11  2.564e+11
> Flops/sec:            7.046e+07      1.00064   7.044e+07  1.409e+08
> Memory:               1.286e+09      1.00000              2.571e+09
> MPI Messages:         8.011e+03      1.00000   8.011e+03  1.602e+04
> MPI Message Lengths:  2.242e+08      1.00000   2.799e+04  4.485e+08
> MPI Reductions:       2.412e+04      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
> e.g., VecAXPY() for real vectors of length N --> 2N flops
> and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 1.8203e+03 100.0%  2.5645e+11 100.0%  1.602e+04 100.0%  2.799e+04      100.0%  2.007e+04  83.2%
>
>
> Complete stats for 1 MPI process run:
>
>                          Max         Max/Min      Avg        Total
> Time (sec):           1.312e+03      1.00000   1.312e+03
> Objects:              2.003e+04      1.00000   2.003e+04
> Flops:                2.564e+11      1.00000   2.564e+11  2.564e+11
> Flops/sec:            1.955e+08      1.00000   1.955e+08  1.955e+08
> Memory:               1.029e+09      1.00000              1.029e+09
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Reductions:       2.404e+04      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
> e.g., VecAXPY() for real vectors of length N --> 2N flops
> and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 1.3119e+03 100.0%  2.5645e+11 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.002e+04  83.3%
>
>