[petsc-users] LU factorization and solution of independent matrices does not scale, why?
Thomas Witkowski
Thomas.Witkowski at tu-dresden.de
Thu Dec 20 14:19:59 CST 2012
In my multilevel FETI-DP code, I have localized coarse matrices, which
are defined on only a subset of all MPI tasks, typically between 4 and
64 tasks. The MatAIJ and the KSP objects are both defined on an MPI
communicator that is a subset of MPI::COMM_WORLD. The LU
factorization of the matrices is computed with either MUMPS or
superlu_dist, but both show a scaling behavior that puzzles me:
When the overall problem size is increased, the solve with the LU
factorization of the local matrices does not scale! But why not? I
just increase the number of local matrices, but all of them are
independent of each other. An example: I use 64 cores, and each coarse
matrix is spanned by 4 cores, so there are 16 MPI communicators with 16
coarse space matrices. The problem requires 192 solves with the
coarse space systems, which take 0.09 seconds in total. Now I
increase the number of cores to 256, but the local coarse spaces are
again defined on only 4 cores each. Again, 192 solves with these coarse
spaces are required, but now they take 0.24 seconds. The same for
1024 cores, and we are at 1.7 seconds for the local coarse space solver!
For me, this is a total mystery! Any ideas on how to explain, debug,
and eventually resolve this problem?
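
To make the numbers concrete, here is a minimal sketch of the
arithmetic only. It does not call PETSc or MPI. The names group_of and
summarize are hypothetical, and grouping consecutive ranks into blocks
of 4 is an assumption about how the sub-communicators might be formed,
not a description of the actual code. The timings are the ones quoted
above.

```python
# Timings quoted in the message: total cores -> seconds for 192 coarse solves.
timings = {64: 0.09, 256: 0.24, 1024: 1.7}

# Each coarse matrix lives on 4 cores (from the message).
CORES_PER_COARSE_MATRIX = 4


def group_of(rank, group_size=CORES_PER_COARSE_MATRIX):
    """Index of the coarse-matrix group a rank belongs to.

    Assumption: consecutive ranks share a group. A value like this could
    serve as the 'color' when splitting a communicator.
    """
    return rank // group_size


def summarize(timings, group_size=CORES_PER_COARSE_MATRIX):
    """Return (cores, number of groups, slowdown vs. smallest run) rows.

    With independent groups of fixed size, the solve time would ideally
    stay constant, i.e. the slowdown would stay at 1.0.
    """
    base = timings[min(timings)]
    rows = []
    for cores in sorted(timings):
        n_groups = cores // group_size
        slowdown = round(timings[cores] / base, 1)
        rows.append((cores, n_groups, slowdown))
    return rows


if __name__ == "__main__":
    for cores, n_groups, slowdown in summarize(timings):
        print(f"{cores:5d} cores, {n_groups:4d} groups, slowdown {slowdown}x")
```

So going from 16 to 256 independent groups of 4 cores, the same 192
solves become roughly 19 times slower, although each group does the
same amount of work.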
Thomas