[petsc-users] Solving a sequence of linear systems stored on disk with MUMPS
Thibaut Appel
t.appel17 at imperial.ac.uk
Fri Aug 2 08:17:56 CDT 2019
Hi Hong,
That's exactly what I was looking for - that's perfect.
This will be of significant help.
Have a nice weekend,
Thibaut
On 01/08/2019 19:15, Zhang, Hong wrote:
> Thibaut:
> In the branch hzhang/add-ksp-tutorials-ex6/master, I added another
> example, src/mat/examples/tests/ex28.c,
> which creates A[k], k=0,...,4 with the same data structure.
> Using a single symbolic factor F, it runs a loop that updates the
> numerical values of A[k] and solves.
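>
> A minimal sketch of that pattern (this is not ex28.c itself; error
> checking is omitted and the names A, F, b, x, nmat are illustrative):
>
>   Mat           A, F;
>   Vec           b, x;
>   MatFactorInfo info;
>   IS            rowperm, colperm;
>   PetscInt      k, nmat = 5;
>   /* ... create and assemble A with the common nonzero structure,
>      and create b, x ... */
>   MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);
>   MatGetOrdering(A, MATORDERINGNATURAL, &rowperm, &colperm);
>   MatFactorInfoInitialize(&info);
>   MatLUFactorSymbolic(F, A, rowperm, colperm, &info);  /* once */
>   for (k = 0; k < nmat; k++) {
>     /* overwrite the numerical values of A (same nonzero pattern) */
>     MatLUFactorNumeric(F, A, &info);                    /* per matrix */
>     MatSolve(F, b, x);
>   }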
> Hong
>
> Hi Hong,
>
> Thanks very much for that example, I appreciate it.
>
> I'll test that in a few days and come back with questions if needed,
>
>
> Thibaut
>
> On 25/07/2019 21:25, hong at aspiritech.org wrote:
>> Thibaut:
>> I added an example (in the
>> branch hzhang/add-ksp-tutorials-ex6/master)
>> https://bitbucket.org/petsc/petsc/commits/cf847786fd804b3606d0281d404c4763f36fe475?at=hzhang/add-ksp-tutorials-ex6/master
>>
>> You can run it with
>> mpiexec -n 2 ./ex6 -num_numfac 2 -pc_type lu
>> -pc_factor_mat_solver_type mumps -ksp_monitor -log_view
>> ...
>> MatLUFactorSym 1 1.0 3.5911e-03
>> MatLUFactorNum 2 1.0 6.3920e-03
>>
>> This shows the code does one symbolic factorization and two
>> numeric factorizations.
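>>
>> A minimal sketch of the KSP-level pattern (this is not the attached
>> ex6.c itself; error checking is omitted and A, b, x, num_numfac are
>> illustrative names):
>>
>>   KSP ksp;
>>   PC  pc;
>>   KSPCreate(PETSC_COMM_WORLD, &ksp);
>>   KSPSetOperators(ksp, A, A);
>>   KSPGetPC(ksp, &pc);
>>   PCSetType(pc, PCLU);
>>   PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);
>>   KSPSetFromOptions(ksp); /* also picks up -pc_type lu and
>>                              -pc_factor_mat_solver_type mumps */
>>   for (k = 0; k < num_numfac; k++) {
>>     /* change the numerical values of A, keeping its nonzero pattern */
>>     KSPSetOperators(ksp, A, A); /* same pattern: the symbolic factor is
>>                                    reused and only the numeric
>>                                    factorization is redone, as the
>>                                    -log_view counts above show */
>>     KSPSolve(ksp, b, x);
>>   }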
>> For your convenience, the code ex6.c is attached below.
>> Let me know if you encounter any problems.
>> Hong
>>
>>
>> On Thu, Jul 25, 2019 at 1:30 PM Zhang, Hong via petsc-users
>> <petsc-users at mcs.anl.gov> wrote:
>>
>> Thibaut:
>> I'm writing a simple example using KSP directly -- I will send
>> it to you soon.
>> Hong
>>
>> Hi Hong,
>>
>> That sounds like a more reasonable approach; I had no idea the
>> PETSc/MUMPS interface could provide such a level of control over
>> the solve process. Therefore, after assembling the matrix A_0, I
>> should do something along the lines of:
>>
>> MatGetFactor(A_0, MATSOLVERMUMPS, MAT_FACTOR_LU, F)
>>
>> MatLUFactorSymbolic(F, A_0, NULL, NULL, info)
>>
>> MatLUFactorNumeric(F, A_0, info)
>>
>> and then call MatSolve? However, I don't understand: I thought F
>> would remain the same during the whole process, but since it is an
>> input parameter of MatSolve, wouldn't I need one F_m for each A_m?
>> That is not what you mentioned (doing only one symbolic
>> factorization).
>>
>>
>> On a side note, after preallocating and assembling the
>> first matrix, should I create/assemble all the others with
>>
>> MatDuplicate(A_0, MAT_DO_NOT_COPY_VALUES, A_m)
>>
>> Calls to MatSetValues( ... )
>>
>> MatAssemblyBegin(A_m, MAT_FINAL_ASSEMBLY)
>> MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY)
>>
>> Is that the recommended/most scalable way of duplicating
>> a matrix + its non-zero structure?
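>>
>> For reference, a fuller sketch of that duplicate-and-refill pattern
>> (error checking omitted; cols, vals and the loop bounds are
>> illustrative names):
>>
>>   Mat      A_m;
>>   PetscInt row, rstart, rend, ncols;
>>   /* duplicate the nonzero structure and preallocation of A_0;
>>      the numerical values of A_m start out as zero */
>>   MatDuplicate(A_0, MAT_DO_NOT_COPY_VALUES, &A_m);
>>   MatGetOwnershipRange(A_m, &rstart, &rend);
>>   for (row = rstart; row < rend; row++) {
>>     /* fill cols[] and vals[] with this row's entries of A_m */
>>     MatSetValues(A_m, 1, &row, ncols, cols, vals, INSERT_VALUES);
>>   }
>>   MatAssemblyBegin(A_m, MAT_FINAL_ASSEMBLY);
>>   MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY);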
>>
>>
>> Thank you for your support and suggestions,
>>
>> Thibaut
>>
>>
>> On 23/07/2019 18:38, Zhang, Hong wrote:
>>> Thibaut:
>>>
>>> Thanks for taking the time. I would typically run
>>> that on a small
>>> cluster node of 16 or 32 physical cores with 2 or 4
>>> sockets. I use 16 or
>>> 32 MPI ranks and bind them to cores.
>>>
>>> The matrices would ALL have the same size and the
>>> same nonzero structure
>>> - it's just a few numerical values that would differ.
>>>
>>> You may do one symbolic factorization of A_m and reuse it in the
>>> m-i loop:
>>> - numeric factorization of A_m
>>> - solve A_m x_m,i = b_m,i
>>> In MUMPS, numeric factorization and solve are scalable.
>>> Repeated numeric factorization of A_m is likely faster than
>>> reading data files from disk.
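>>>
>>> A sketch of that loop, assuming all A_m share one nonzero
>>> structure (illustrative names, error checking omitted):
>>>
>>>   MatLUFactorSymbolic(F, A[0], rowperm, colperm, &info); /* once */
>>>   for (i = 0; i < niter; i++) {      /* outer iteration i */
>>>     for (m = 0; m < M; m++) {
>>>       /* F holds only one set of factors at a time, so refactor
>>>          A[m] numerically instead of reading stored factors
>>>          back from disk */
>>>       MatLUFactorNumeric(F, A[m], &info);
>>>       MatSolve(F, b[m], x[m]);  /* b_m,i built from the x_*,i-1 */
>>>     }
>>>   }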
>>> Hong
>>>
>>>
>>> This is a good point you've raised, as I don't think MUMPS is
>>> able to exploit that - I asked the question on their users list
>>> just to be sure.
>>> There are some options in SuperLU_DIST to reuse permutation
>>> arrays, but there's no I/O for that solver. And the native PETSc
>>> LU solver is not parallel?
>>>
>>> I'm using high-order finite differences so I'm
>>> suffering from a lot of
>>> fill-in, one of the reasons why storing
>>> factorizations in RAM is not
>>> viable. In comparison, I have almost unlimited disk
>>> space.
>>>
>>> I'm aware my need might seem counter-intuitive, but I'm really
>>> willing to sacrifice performance in the I/O part. My code is
>>> already heavily based on PETSc (preallocation, assembly of
>>> matrices/vectors) coupled with MUMPS, so I'm minimizing the loss
>>> of efficiency.
>>>
>>> Thibaut
>>>
>>> On 23/07/2019 17:13, Smith, Barry F. wrote:
>>> > What types of computing systems will you be doing the
>>> > computations on? Roughly how many MPI ranks?
>>> >
>>> > Are the matrices all the same size? Do they have the same or
>>> > different nonzero structures? Would it be possible to use the
>>> > same symbolic representation for all of them and just have
>>> > different numerical values?
>>> >
>>> > Clusters and large scale computing centers are notoriously
>>> > terrible at I/O; often I/O is orders of magnitude slower than
>>> > compute/memory, making this type of workflow unrealistically
>>> > slow. From a cost analysis point of view, often just buying
>>> > lots of memory might be the most efficacious approach.
>>> >
>>> > That said, what you suggest might require only a few lines of
>>> > code (though determining where to put them is the tricky part),
>>> > depending on the MUMPS interface for saving a factorization to
>>> > disk. What we would do is keep the PETSc wrapper that lives
>>> > around the MUMPS matrix, Mat_MUMPS, but, using the MUMPS API,
>>> > save the information in the DMUMPS_STRUC_C id; and then reload
>>> > it when needed.
>>> >
>>> > The user level API could be something like
>>> >
>>> > MatMumpsSaveToDisk(Mat) and MatMumpsLoadFromDisk(Mat); they
>>> > would just monkey with the DMUMPS_STRUC_C id; item.
>>> >
>>> >
>>> > Barry
>>> >
>>> >
>>> >> On Jul 23, 2019, at 9:24 AM, Thibaut Appel via petsc-users
>>> >> <petsc-users at mcs.anl.gov> wrote:
>>> >>
>>> >> Dear PETSc users,
>>> >>
>>> >> I need to solve several linear systems successively, with LU
>>> >> factorization, as part of an iterative process in my Fortran
>>> >> application code.
>>> >>
>>> >> The process would solve M systems (A_m)(x_m,i) = (b_m,i) for
>>> >> m=1,M at each iteration i, but computing the LU factorization
>>> >> of A_m only once.
>>> >> The RHSs (b_m,i+1) are computed from all the different (x_m,i)
>>> >> and all depend upon each other.
>>> >>
>>> >> The way I envisage performing that is to use MUMPS to
>>> >> compute, successively, each of the LU factorizations (m) in
>>> >> parallel and store the factors on disk,
>>> >> creating/assembling/destroying the matrices A_m on the go.
>>> >> Then, whenever needed, read the factors in parallel to solve
>>> >> the systems. Since version 5.2, MUMPS has a save/restore
>>> >> feature that allows that, see
>>> >> http://mumps.enseeiht.fr/doc/userguide_5.2.1.pdf
>>> >> p. 20, 24 and 58.
>>> >>
>>> >> In its current state, the PETSc/MUMPS interface does not
>>> >> incorporate that feature. I'm an advanced Fortran programmer
>>> >> but not in C, so I don't think I would do an amazing job
>>> >> having a go inside src/mat/impls/aij/mpi/mumps/mumps.c.
>>> >>
>>> >> I was picturing something like creating as many KSP objects
>>> >> as linear systems to be solved, with some sort of flag to
>>> >> force the storage of LU factors on disk after the first call
>>> >> to KSPSolve. Then keep calling KSPSolve as many times as
>>> >> needed.
>>> >>
>>> >> Would you support such a feature?
>>> >>
>>> >> Thanks for your support,
>>> >>
>>> >> Thibaut
>>>