[petsc-users] LU factorization and solution of independent matrices does not scale, why?
Thomas Witkowski
thomas.witkowski at tu-dresden.de
Thu Dec 20 14:16:52 CST 2012
In my multilevel FETI-DP code, I have localized coarse matrices, which
are defined on only a subset of all MPI tasks, typically between 4 and
64 tasks. The MatAIJ and the KSP objects are both defined on an MPI
communicator, which is a subset of MPI::COMM_WORLD. The LU factorization
of the matrices is computed with either MUMPS or superlu_dist, but both
show a scaling behavior that puzzles me: When the overall problem
size is increased, the solve with the LU factorization of the local
matrices does not scale! But why not? I just increase the number of
local matrices, but all of them are independent of each other. An
example: I use 64 cores, and each coarse matrix is spanned by 4 cores, so
there are 16 MPI communicators with 16 coarse space matrices. The
coarse space systems have to be solved 192 times, and this
takes 0.09 seconds in total. Now I increase the number of cores to 256,
but let the local coarse space be defined again on only 4 cores. Again,
192 solutions with these coarse spaces are required, but now this takes
0.24 seconds. The same for 1024 cores, and we are at 1.7 seconds for the
local coarse space solver!
For me, this is a total mystery! Any idea how to explain, debug, and
eventually resolve this problem?
Thomas