<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi Hong,</p>
<p>That sounds like a more reasonable approach; I had no idea the
PETSc/MUMPS interface provided that level of control over the
solve process. So, after assembling the matrix A_0, I
should do something along the lines of:</p>
<p><font face="Courier New, Courier, monospace">MatGetFactor(A_0,
MATSOLVERMUMPS, MAT_FACTOR_LU, F)</font></p>
<p><font face="Courier New, Courier, monospace">MatLUFactorSymbolic(F,
A_0, NULL, NULL, info)</font></p>
<p><font face="Courier New, Courier, monospace">MatLUFactorNumeric(F,
A_0, info)</font></p>
<p>and then call MatSolve? One thing confuses me, though: I thought F
would remain the same during the whole process, but it is an input
argument of MatSolve, so wouldn't I need one F_m for each A_m? That is
not what you suggested (a single symbolic factorization only).</p>
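<p>In other words, is the following what you have in mind? A C-style
sketch, assuming a single factor object F can indeed be reused across
all the A_m because they share the same nonzero structure (loop bounds
and the array names A, b, x are placeholders):</p>
<p><font face="Courier New, Courier, monospace">Mat F;<br>
MatFactorInfo info;<br>
MatFactorInfoInitialize(&amp;info);<br>
MatGetFactor(A[0], MATSOLVERMUMPS, MAT_FACTOR_LU, &amp;F);<br>
/* symbolic factorization once: all A_m share the structure */<br>
MatLUFactorSymbolic(F, A[0], NULL, NULL, &amp;info);<br>
for (i = 0; i &lt; niter; i++) {<br>
&nbsp;&nbsp;for (m = 0; m &lt; M; m++) {<br>
&nbsp;&nbsp;&nbsp;&nbsp;/* numeric factorization redone for each A_m */<br>
&nbsp;&nbsp;&nbsp;&nbsp;MatLUFactorNumeric(F, A[m], &amp;info);<br>
&nbsp;&nbsp;&nbsp;&nbsp;MatSolve(F, b[m], x[m]); /* x_m,i from b_m,i */<br>
&nbsp;&nbsp;}<br>
&nbsp;&nbsp;/* compute the next b[m] from all the x[m] here */<br>
}<br>
MatDestroy(&amp;F);</font></p>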
<p><br>
</p>
<p>On a side note, after preallocating and assembling the first
matrix, should I create and assemble all the others with:</p>
<p><font face="Courier New, Courier, monospace">MatDuplicate(A_0,
MAT_DO_NOT_COPY_VALUES, A_m)</font></p>
<p><font face="Courier New, Courier, monospace">Calls to
MatSetValues( ... )<br>
</font></p>
<font face="Courier New, Courier, monospace">MatAssemblyBegin(A_m,
MAT_FINAL_ASSEMBLY)<br>
MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY)</font><br>
<br>
<p>Is that the recommended, most scalable way of duplicating a matrix
together with its nonzero structure?</p>
<p><br>
</p>
<p>Thank you for your support and suggestions,</p>
<p>Thibaut</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 23/07/2019 18:38, Zhang, Hong wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAGCphBtG=JGCZg8=g36Di8c5JuZVwzw3M_-EE-ook0+8Q6ZCfg@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">Thibaut:</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
Thanks for taking the time. I would typically run that on a
small <br>
cluster node of 16 or 32 physical cores with 2 or 4 sockets.
I use 16 or <br>
32 MPI ranks and bind them to cores.<br>
<br>
The matrices would ALL have the same size and the same
nonzero structure <br>
- it's just a few numerical values that would differ.<br>
</blockquote>
<div>You may do a single symbolic factorization of A_m and reuse it in
the m-i loop:</div>
<div>- numeric factorization of A_m</div>
<div>- solve A_m x_m,i = b_m,i</div>
<div>In MUMPS, the numeric factorization and the solve are scalable.
Repeated numeric factorizations of A_m are likely faster than
reading data files from disk.</div>
<div>Hong</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<br>
This is a good point you've raised, as I don't think MUMPS is
able to <br>
exploit that - I asked the question on their users list just
to be sure. <br>
There are some options in SuperLU_DIST to reuse permutation
arrays, but <br>
there's no I/O for that solver. And the native PETSc LU
solver is not <br>
parallel?<br>
<br>
I'm using high-order finite differences so I'm suffering
from a lot of <br>
fill-in, one of the reasons why storing factorizations in
RAM is not <br>
viable. In comparison, I have almost unlimited disk space.<br>
<br>
I'm aware my need might seem counter-intuitive, but I'm
really willing <br>
to sacrifice performance in the I/O part. My code is already
heavily <br>
based on PETSc (preallocation, assembly for
matrices/vectors) and coupled <br>
with MUMPS, so I'm minimizing the loss of efficiency.<br>
<br>
Thibaut<br>
<br>
On 23/07/2019 17:13, Smith, Barry F. wrote:<br>
> What types of computing systems will you be doing
the computations? Roughly how many MPI_ranks?<br>
><br>
> Are the matrices all the same size? Do they have the
same or different nonzero structures? Would it be possible
to use the same symbolic representation for all of them and
just have different numerical values?<br>
><br>
> Clusters and large scale computing centers are
notoriously terrible at IO; often IO is orders of magnitude
slower than compute/memory, making this type of workflow
unrealistically slow. From a cost-analysis point of view,
often just buying lots of memory might be the most
efficacious approach.<br>
><br>
> That said, what you suggest might require only a few
lines of code (though determining where to put them is the
tricky part), depending on the MUMPS interface for saving a
factorization to disk. What we would do is keep the PETSc
wrapper that lives around the MUMPS matrix Mat_MUMPS, but
use the MUMPS API to save the information in the
DMUMPS_STRUC_C id; and then reload it when needed.<br>
><br>
> The user level API could be something like<br>
><br>
> MatMumpsSaveToDisk(Mat) and
MatMumpsLoadFromDisk(Mat); they would just monkey with the
DMUMPS_STRUC_C id; item.<br>
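><br>
> A hypothetical sketch of what the save half could do internally,
assuming the save/restore driver described in the MUMPS 5.2
userguide (JOB=7 saves, JOB=8 restores, destination set through
SAVE_DIR and SAVE_PREFIX); dir and prefix here are made-up
arguments:<br>
><br>
> <font face="Courier New, Courier, monospace">strncpy(id.save_dir, dir, sizeof(id.save_dir));<br>
> strncpy(id.save_prefix, prefix, sizeof(id.save_prefix));<br>
> id.job = 7; /* JOB=7: save the instance to disk */<br>
> dmumps_c(&amp;id);<br>
> /* restore would be: id.job = 8; dmumps_c(&amp;id); */</font><br>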
><br>
><br>
> Barry<br>
><br>
><br>
>> On Jul 23, 2019, at 9:24 AM, Thibaut Appel via
petsc-users <<a href="mailto:petsc-users@mcs.anl.gov"
target="_blank" moz-do-not-send="true">petsc-users@mcs.anl.gov</a>>
wrote:<br>
>><br>
>> Dear PETSc users,<br>
>><br>
>> I need to solve several linear systems
successively, with LU factorization, as part of an iterative
process in my Fortran application code.<br>
>><br>
>> The process would solve M systems (A_m)(x_m,i) =
(b_m,i) for m=1,...,M at each iteration i, while computing the LU
factorization of each A_m only once.<br>
>> The RHSs (b_m,i+1) are computed from all the
different (x_m,i) and all depend upon each other.<br>
>><br>
>> The way I envisage doing that is to use MUMPS
to compute, successively, each of the LU factorizations (m)
in parallel and store the factors on disk,
creating/assembling/destroying the matrices A_m on the go.<br>
>> Then whenever needed, read the factors in parallel
to solve the systems. Since version 5.2, MUMPS has a
save/restore feature that allows that, see
<a href="http://mumps.enseeiht.fr/doc/userguide_5.2.1.pdf"
rel="noreferrer" target="_blank" moz-do-not-send="true">
http://mumps.enseeiht.fr/doc/userguide_5.2.1.pdf</a> p.20,
24 and 58.<br>
>><br>
>> In its current state, the PETSc/MUMPS interface
does not incorporate that feature. I'm an advanced Fortran
programmer but not in C so I don't think I would do an
amazing job having a go inside
src/mat/impls/aij/mpi/mumps/mumps.c.<br>
>><br>
>> I was picturing something like creating as many KSP
objects as linear systems to be solved, with some sort of
flag to force the storage of LU factors on disk after the
first call to KSPSolve. Then keep calling KSPSolve as many
times as needed.<br>
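>><br>
>> Schematically, per system m - everything below already exists in
PETSc, and the on-disk part is the missing piece (ksp, pc and the
names A_m, b_m, x_m are placeholders):<br>
>><br>
>> <font face="Courier New, Courier, monospace">KSPCreate(PETSC_COMM_WORLD, &amp;ksp[m]);<br>
>> KSPSetOperators(ksp[m], A_m, A_m);<br>
>> KSPGetPC(ksp[m], &amp;pc);<br>
>> PCSetType(pc, PCLU);<br>
>> PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);<br>
>> KSPSolve(ksp[m], b_m, x_m); /* factors on the first call; later
calls would reuse the factors, ideally reloaded from disk */</font><br>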
>><br>
>> Would you support such a feature?<br>
>><br>
>> Thanks for your support,<br>
>><br>
>> Thibaut<br>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>