[petsc-users] Solving a sequence of linear systems stored on disk with MUMPS
Thibaut Appel
t.appel17 at imperial.ac.uk
Thu Jul 25 11:29:41 CDT 2019
Hi Hong,
That sounds like a more reasonable approach; I had no idea the
PETSc/MUMPS interface could provide such a level of control over the
solve process. Therefore, after assembling the matrix A_0, I should do
something along the lines of:
MatGetFactor(A_0, MATSOLVERMUMPS, MAT_FACTOR_LU, F)
MatLUFactorSymbolic(F, A_0, NULL, NULL, info)
MatLUFactorNumeric(F, A_0, info)
and then call MatSolve? However, I don't quite understand: I thought F
would remain the same during the whole process, but since it is an input
parameter of MatSolve, wouldn't I need one F_m for each A_m? That is not
what you suggested (do only one symbolic factorization).
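To check my understanding, here is a rough sketch of what I think you
mean, written C-style as in the manual pages (in my Fortran code each
call would carry an extra ierr argument). The routine name SolveSequence
and the loop bounds are just placeholders, and I have assumed that all
A_m share the same nonzero structure so that a single factor object F
can be reused:

#include <petscmat.h>

/* Sketch only: assumes the M matrices A[m] are assembled and share an
   identical nonzero structure, and that b[m], x[m] are compatible Vecs
   updated between iterations. */
PetscErrorCode SolveSequence(PetscInt M, PetscInt niter, Mat A[], Vec b[], Vec x[])
{
  Mat            F;
  MatFactorInfo  info;
  PetscInt       it, m;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetFactor(A[0], MATSOLVERMUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A[0], NULL, NULL, &info);CHKERRQ(ierr); /* done once */

  for (it = 0; it < niter; it++) {
    for (m = 0; m < M; m++) {
      ierr = MatLUFactorNumeric(F, A[m], &info);CHKERRQ(ierr); /* reuses the symbolic data */
      ierr = MatSolve(F, b[m], x[m]);CHKERRQ(ierr);            /* solve A_m x_m,i = b_m,i */
    }
    /* ... recompute the b[m] from the x[m] before the next iteration ... */
  }
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Is that the intended pattern, i.e. the same F is handed to MatSolve and
is simply refactorized numerically whenever the values of A_m change?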
On a side note, after preallocating and assembling the first matrix,
should I create/assemble all the others with
MatDuplicate(A_0, MAT_DO_NOT_COPY_VALUES, A_m)
Calls to MatSetValues( ... )
MatAssemblyBegin(A_m, MAT_FINAL_ASSEMBLY)
MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY)
Is that the recommended/most scalable way of duplicating a matrix and
its nonzero structure?
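Spelled out, I mean something like the following rough sketch (where
nrow, rows, ncol, cols and vals stand in for whatever my assembly loop
produces for each locally owned row):

Mat            A_m;
PetscErrorCode ierr;

/* clone the nonzero structure of A_0; the values start out as zero */
ierr = MatDuplicate(A_0, MAT_DO_NOT_COPY_VALUES, &A_m);CHKERRQ(ierr);
/* loop over locally owned rows, inserting the new numerical values */
ierr = MatSetValues(A_m, nrow, rows, ncol, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
ierr = MatAssemblyBegin(A_m, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);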
Thank you for your support and suggestions,
Thibaut
On 23/07/2019 18:38, Zhang, Hong wrote:
> Thibaut:
>
> Thanks for taking the time. I would typically run that on a small
> cluster node of 16 or 32 physical cores with 2 or 4 sockets. I use
> 16 or
> 32 MPI ranks and bind them to cores.
>
> The matrices would ALL have the same size and the same nonzero
> structure
> - it's just a few numerical values that would differ.
>
> You may do one symbolic factorization of A_m and reuse it in the m-i loop:
> - numeric factorization of A_m
> - solve A_m x_m,i = b_m,i
> In MUMPS, numeric factorization and solve are scalable. Repeated
> numeric factorizations of A_m are likely faster than reading data files
> from disk.
> Hong
>
>
> This is a good point you've raised, as I don't think MUMPS is able to
> exploit that - I asked the question on their users list just to be
> sure.
> There are some options in SuperLU_DIST to reuse permutation arrays,
> but there's no I/O for that solver. And the native PETSc LU solver is
> not parallel?
>
> I'm using high-order finite differences, so I'm suffering from a lot
> of fill-in, which is one of the reasons why storing the factorizations
> in RAM is not viable. In comparison, I have almost unlimited disk
> space.
>
> I'm aware my need might seem counter-intuitive, but I'm really willing
> to sacrifice performance in the I/O part. My code is already heavily
> based on PETSc (preallocation, assembly for matrices/vectors) coupled
> with MUMPS, so I'm minimizing the loss of efficiency.
>
> Thibaut
>
> On 23/07/2019 17:13, Smith, Barry F. wrote:
> > On what types of computing systems will you be doing the
> > computations? Roughly how many MPI ranks?
> >
> > Are the matrices all the same size? Do they have the same or
> different nonzero structures? Would it be possible to use the same
> symbolic representation for all of them and just have different
> numerical values?
> >
> > Clusters and large-scale computing centers are notoriously terrible
> > at I/O; often I/O is orders of magnitude slower than compute/memory,
> > making this type of workflow unrealistically slow. From a
> > cost-analysis point of view, often just buying lots of memory might
> > be the most efficacious approach.
> >
> > That said, what you suggest might require only a few lines of code
> > (though determining where to put them is the tricky part), depending
> > on the MUMPS interface for saving a file to disk. What we would do is
> > keep the PETSc wrapper that lives around the MUMPS matrix, Mat_MUMPS,
> > but, using the MUMPS API, save the information in the DMUMPS_STRUC_C
> > id struct and then reload it when needed.
> >
> > The user level API could be something like
> >
> > MatMumpsSaveToDisk(Mat) and MatMumpsLoadFromDisk(Mat); they would
> > just manipulate the DMUMPS_STRUC_C id item.
> >
> >
> > Barry
> >
> >
> >> On Jul 23, 2019, at 9:24 AM, Thibaut Appel via petsc-users
> >> <petsc-users at mcs.anl.gov> wrote:
> >>
> >> Dear PETSc users,
> >>
> >> I need to solve several linear systems successively, with LU
> factorization, as part of an iterative process in my Fortran
> application code.
> >>
> >> The process would solve M systems (A_m)(x_m,i) = (b_m,i) for
> >> m = 1,...,M at each iteration i, while computing the LU factorization
> >> of A_m only once.
> >> The RHSs (b_m,i+1) are computed from all the different (x_m,i) and
> >> all depend upon each other.
> >>
> >> The way I envisage performing that is to use MUMPS to compute,
> >> successively, each of the LU factorizations (m) in parallel and
> >> store the factors on disk, creating/assembling/destroying the
> >> matrices A_m on the fly.
> >> Then, whenever needed, read the factors in parallel to solve the
> >> systems. Since version 5.2, MUMPS has a save/restore feature that
> >> allows that; see http://mumps.enseeiht.fr/doc/userguide_5.2.1.pdf
> >> p. 20, 24 and 58.
> >>
> >> In its current state, the PETSc/MUMPS interface does not
> >> incorporate that feature. I'm an advanced Fortran programmer, but
> >> not in C, so I don't think I would do an amazing job having a go
> >> inside src/mat/impls/aij/mpi/mumps/mumps.c.
> >>
> >> I was picturing something like creating as many KSP objects as
> linear systems to be solved, with some sort of flag to force the
> storage of LU factors on disk after the first call to KSPSolve.
> Then keep calling KSPSolve as many times as needed.
> >>
> >> Would you support such a feature?
> >>
> >> Thanks for your support,
> >>
> >> Thibaut
>