Load Balancing and KSPSolve

Tim Stitt timothy.stitt at ichec.ie
Tue Nov 20 11:45:31 CST 2007

Hi all (again),

I finally got some data back from the KSP PETSc code that I put together 
to solve the sparse inverse matrix problem I was looking into. Ideally 
I am aiming for an O(N) (time complexity) approach to computing the first 
k columns of the inverse of a sparse matrix.

To recap the method: my solver calls KSPSolve in a loop over the first 
k columns of an identity matrix B and, for each column, computes the 
corresponding solution vector x, which is that column of the inverse.
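
For readers who want to see the shape of that loop, here is a minimal 
sketch in plain Python. It is not the PETSc code from this post: it uses 
a small dense LU factorisation as a stand-in for the solver, and the 
function names (lu_factor, lu_solve, first_k_inverse_columns) are 
hypothetical. The point it illustrates is only that the matrix is 
factorised once and then reused for each of the k right-hand sides.

```python
# Sketch only: dense, pure-Python stand-in for the solve loop.
# All function names here are hypothetical, not PETSc API.

def lu_factor(a):
    """LU-factorise a square matrix with partial pivoting (done once)."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for col in range(n):
        # Pick the largest pivot in this column for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(lu[r][col]))
        if lu[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        lu[col], lu[pivot] = lu[pivot], lu[col]
        perm[col], perm[pivot] = perm[pivot], perm[col]
        for row in range(col + 1, n):
            lu[row][col] /= lu[col][col]
            for j in range(col + 1, n):
                lu[row][j] -= lu[row][col] * lu[col][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b from the stored factors (done once per column)."""
    n = len(lu)
    # Forward substitution with the unit lower-triangular factor.
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    # Back substitution with the upper-triangular factor.
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def first_k_inverse_columns(a, k):
    """Return the first k columns of inverse(a) as a list of vectors."""
    n = len(a)
    lu, perm = lu_factor(a)
    columns = []
    for j in range(k):
        # Column j of the identity matrix B is the right-hand side.
        e_j = [1.0 if i == j else 0.0 for i in range(n)]
        columns.append(lu_solve(lu, perm, e_j))
    return columns
```

Calling first_k_inverse_columns(a, k) on a small test matrix and 
multiplying each returned vector by a should reproduce the matching 
column of the identity, which is a cheap sanity check for any version 
of this loop.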

I am curious about some of the timings I am seeing, which I hope 
someone can explain. Here are the timings for a global sparse matrix 
(4704 x 4704), solving for the first 1176 columns of the identity 
using P processes (processors) on our cluster.

(Timings are in seconds, one value per process performing work in the 
loop, and were obtained by wrapping the loop in calls to the cpu_time() 
Fortran intrinsic. The MUMPS package was requested for the 
factorisation and solve, although similar timings were obtained with 
both the native solver and SuperLU.)

P=1  [30.92]
P=2  [15.47, 15.54]
P=4  [4.68, 5.49, 4.67, 5.07]
P=8  [2.36, 4.23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15]
P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 
0.25, 0.43, 1.09, 1.08, 1.1]
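
One way to put numbers on the scalability and the spread is sketched 
below in plain Python. It assumes the slowest process determines the 
loop time for each P, takes the second P=8 entry as 4.23, and defines 
imbalance as the slowest time divided by the mean; these definitions 
and the function name summarise are choices made for this sketch, not 
something the timing code itself reports.

```python
# Per-process loop times in seconds, copied from the list above
# (the second P=8 entry is taken as 4.23).
timings = {
    1: [30.92],
    2: [15.47, 15.54],
    4: [4.68, 5.49, 4.67, 5.07],
    8: [2.36, 4.23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15],
    16: [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06,
         0.29, 0.34, 0.73, 0.25, 0.43, 1.09, 1.08, 1.1],
}


def summarise(timings):
    """Speedup, efficiency and imbalance for each process count."""
    serial = max(timings[1])
    rows = []
    for p in sorted(timings):
        slowest = max(timings[p])  # assumed to set the loop time
        mean = sum(timings[p]) / len(timings[p])
        rows.append({
            "P": p,
            "speedup": serial / slowest,
            "efficiency": serial / slowest / p,
            "imbalance": slowest / mean,  # 1.0 means perfectly even
        })
    return rows


for row in summarise(timings):
    print("P={P:2d} speedup={speedup:5.2f} "
          "efficiency={efficiency:4.2f} "
          "imbalance={imbalance:4.2f}".format(**row))
```

Under these assumptions the speedup comes out at roughly 1.99, 5.63, 
7.31 and 28.11 for P = 2, 4, 8 and 16, and the imbalance at roughly 
1.00, 1.10, 1.53 and 1.45. Note that cpu_time() reports processor time 
rather than elapsed wall-clock time, so ratios built from it should be 
read with some care.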

Firstly, I notice very good scalability up to 16 processes. Is this 
expected by those people who use these solvers regularly?

Also, I notice that the timings per process vary more as we scale up. 
Is this a load-balancing problem caused by some processes holding more 
non-zero values than others? Once again, is this expected?
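
A quick way to test the non-zero hypothesis is to count the stored 
entries each process would own. The plain-Python sketch below assumes 
the matrix is held in CSR form and that rows are split into contiguous 
blocks, with the first (rows mod P) ranks taking one extra row. Both 
the layout and the helper names (block_row_ranges, nonzeros_per_rank) 
are assumptions for illustration, and the row-pointer array is made-up 
data, not taken from the 4704 x 4704 matrix. With a direct solver the 
work per process also depends on how the factors are distributed, so 
this count is at best a first indicator.

```python
def block_row_ranges(n_rows, n_procs):
    """Contiguous row ranges; the first n_rows % n_procs ranks get one extra row."""
    base, extra = divmod(n_rows, n_procs)
    ranges, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def nonzeros_per_rank(row_ptr, n_procs):
    """Count stored entries owned by each rank from a CSR row-pointer array."""
    n_rows = len(row_ptr) - 1
    return [row_ptr[end] - row_ptr[start]
            for start, end in block_row_ranges(n_rows, n_procs)]


# Hypothetical 6-row matrix whose later rows are denser than the first.
row_ptr = [0, 1, 2, 4, 7, 11, 16]
counts = nonzeros_per_rank(row_ptr, 3)
print(counts)
```

If the counts differ widely between ranks, uneven non-zero distribution 
is a plausible contributor to the spread in timings; if they are close, 
the cause probably lies elsewhere.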

Please excuse my ignorance of matters relating to these solvers and 
their operation, as it really isn't my field of expertise.



