<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    I am getting the opposite result, i.e., MUMPS becomes slower when

    using ParMETIS for parallel ordering. What did I mess up? Is the

    problem too small?<br>

    <br>

    <br>

    Case 1 took 24.731s<br>

    <br>

    $ rm -f *vtk; time mpiexec -n 16 ./defmod -f point.inp -pc_type lu

    -pc_factor_mat_solver_package mumps -mat_mumps_icntl_4 1

    -log_summary > 1.txt<br>

    <br>

    <br>

    Case 2 with "-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2" took

    34.720s<br>

    <br>

    $ rm -f *vtk; time mpiexec -n 16 ./defmod -f point.inp -pc_type lu

    -pc_factor_mat_solver_package mumps -mat_mumps_icntl_4 1

    -log_summary -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 > 2.txt<br>

    <br>

    <br>

    Both 1.txt and 2.txt are attached.<br>
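    <br>
    One way to see which phase accounts for the extra time is to compare
    the factorization events in the two logs. Below is a minimal sketch,
    not a PETSc tool: it assumes each -log_summary event row starts with
    the event name, followed by count (max, ratio) and time (max, ratio),
    and that the events are named MatLUFactorSym, MatLUFactorNum and
    MatSolve. The function names are made up for illustration.<br>
    <pre>
```python
import sys

# Events of interest; names are assumed to match the -log_summary event table.
EVENTS = ("MatLUFactorSym", "MatLUFactorNum", "MatSolve")


def phase_times(log_text, events=EVENTS):
    """Return {event: max time in seconds} for the events found in log_text."""
    times = {}
    for line in log_text.splitlines():
        fields = line.split()
        # Assumed row layout: name, count max, count ratio, time max, time ratio, ...
        if len(fields) >= 4 and fields[0] in events:
            try:
                times[fields[0]] = float(fields[3])
            except ValueError:
                continue  # not a timing row after all
    return times


def compare(text_a, text_b):
    """Return (event, time_a, time_b, time_b - time_a) for events present in both logs."""
    a, b = phase_times(text_a), phase_times(text_b)
    rows = []
    for name in EVENTS:
        if name in a and name in b:
            rows.append((name, a[name], b[name], b[name] - a[name]))
    return rows


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        for name, t1, t2, diff in compare(f1.read(), f2.read()):
            print("%-16s %10.3f %10.3f %+10.3f" % (name, t1, t2, diff))
```
    </pre>
    Saved under a hypothetical name such as compare_logs.py, it would be
    run as "python compare_logs.py 1.txt 2.txt". If the difference shows
    up in MatLUFactorSym, the analysis/ordering step is the slower one; if
    it shows up in MatLUFactorNum, the ordering itself is producing a more
    expensive numeric factorization.<br>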

    <br>

    Regards,<br>

    <br>

    Tabrez<br>

    <br>

    On 01/29/2014 09:18 AM, Hong Zhang wrote:

    <blockquote

cite="mid:CAGCphBv5gxC+grq_TtQYcztKfgD6-PPFNCGmH1bYrOvFQpKg2w@mail.gmail.com"

      type="cite">

      <div dir="ltr">MUMPS now supports parallel symbolic factorization.

        With the petsc-3.4 interface, you can use the runtime options

        <div><br>

          <div>

            <div>  -mat_mumps_icntl_28 <1>: ICNTL(28): use 1 for

              sequential analysis and ICNTL(7) ordering, or 2 for

              parallel analysis and ICNTL(29) ordering </div>

            <div>  -mat_mumps_icntl_29 <0>: ICNTL(29): parallel

              ordering 1 = ptscotch 2 = parmetis </div>

          </div>

        </div>

        <div><br>

        </div>

        <div>e.g., '-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2'

          activates parallel symbolic factorization with ParMETIS for

          matrix ordering. </div>

        <div>Give it a try and let us know what you get.</div>
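        <div>To try the variants side by side, the option sets can be
          generated rather than typed by hand. This is only an
          illustrative sketch: the option names and ICNTL values are the
          ones quoted above, while the helper name and the base solver
          options are assumptions. Whether PT-SCOTCH or ParMETIS is
          actually usable depends on how MUMPS was built.</div>
        <pre>
```python
# Base solver selection; assumed to match a petsc-3.4 style command line.
BASE = ["-pc_type", "lu", "-pc_factor_mat_solver_package", "mumps"]

# ICNTL(29) values as listed in the help text quoted above.
PARALLEL_ORDERINGS = {"ptscotch": 1, "parmetis": 2}


def mumps_analysis_options(ordering=None):
    """Return PETSc options for sequential (ordering=None) or parallel analysis."""
    if ordering is None:
        # ICNTL(28) = 1: sequential analysis, ordering chosen via ICNTL(7).
        return BASE + ["-mat_mumps_icntl_28", "1"]
    if ordering not in PARALLEL_ORDERINGS:
        raise ValueError("unknown parallel ordering: %r" % ordering)
    # ICNTL(28) = 2: parallel analysis, ordering chosen via ICNTL(29).
    return BASE + ["-mat_mumps_icntl_28", "2",
                   "-mat_mumps_icntl_29", str(PARALLEL_ORDERINGS[ordering])]


# Print one option string per variant so each can be pasted after the executable.
for label in (None, "ptscotch", "parmetis"):
    print(label or "sequential", " ".join(mumps_analysis_options(label)))
```
        </pre>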

        <div><br>

        </div>

        <div>Hong</div>

      </div>

      <div class="gmail_extra"><br>

        <br>

        <div class="gmail_quote">On Tue, Jan 28, 2014 at 5:48 PM, Smith,

          Barry F. <span dir="ltr"><<a moz-do-not-send="true"

              href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div class="im"><br>

              On Jan 28, 2014, at 5:39 PM, Matthew Knepley <<a

                moz-do-not-send="true" href="mailto:knepley@gmail.com">knepley@gmail.com</a>>

              wrote:<br>

              <br>

              > On Tue, Jan 28, 2014 at 5:25 PM, Tabrez Ali <<a

                moz-do-not-send="true"

                href="mailto:stali@geology.wisc.edu">stali@geology.wisc.edu</a>>

              wrote:<br>

              > Hello<br>

              ><br>

              > This is my observation as well (with MUMPS). The

              first solve (after assembly which is super fast) takes a

              few mins (for ~1 million unknowns on 12/24 cores) but from

              then on only a few seconds for each subsequent solve for

              each time step.<br>

              ><br>

              > Perhaps symbolic factorization in MUMPS is all

              serial?<br>

              ><br>

              > Yes, it is.<br>

              <br>

            </div>

               I missed this. I was just assuming a PETSc LU. Yes, I

            have no idea of the relative times of the symbolic and numeric

            factorizations for those other packages.<br>

            <span class="HOEnZb"><font color="#888888"><br>

                  Barry<br>

              </font></span>

            <div class="HOEnZb">

              <div class="h5">><br>

                >   Matt<br>

                ><br>

                > Like the OP I often do multiple runs on the same

                problem but I don't know if MUMPS or any other direct

                solver can save the symbolic factorization info to a

                file that perhaps can be utilized in subsequent reruns

                to avoid the costly "first solves".<br>

                ><br>

                > Tabrez<br>

                ><br>

                ><br>

                > On 01/28/2014 04:04 PM, Barry Smith wrote:<br>

                > On Jan 28, 2014, at 1:36 PM, David Liu<<a

                  moz-do-not-send="true" href="mailto:daveliu@mit.edu">daveliu@mit.edu</a>>

                 wrote:<br>

                ><br>

                > Hi, I'm writing an application that solves a sparse

                matrix many times using Pastix. I notice that the first

                solve takes a very long time,<br>

                >    Is it the first “solve” or the first time you

                put values into that matrix that “takes a long time”? If

                you are not properly preallocating the matrix then the

                initial setting of values will be slow and waste memory.

                 See <a moz-do-not-send="true"

href="http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html"

                  target="_blank">http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html</a><br>

                ><br>

                >    The symbolic factorization is usually much

                faster than a numeric factorization so that is not the

                cause of the slow “first solve”.<br>

                ><br>

                >     Barry<br>

                ><br>

                ><br>

                ><br>

                > while the subsequent solves are very fast. I don't

                fully understand what's going on behind the curtains,

                but I'm guessing it's because the very first solve has

                to read in the non-zero structure for the LU

                factorization, while the subsequent solves are faster

                because the nonzero structure doesn't change.<br>

                ><br>

                > My question is, is there any way to save the

                information obtained from the very first solve, so that

                the next time I run the application, the very first

                solve can be fast too (provided that I still have the

                same nonzero structure)?<br>

                ><br>

                ><br>

                > --<br>

                > No one trusts a model except the one who wrote it;

                Everyone trusts an observation except the one who made

                it- Harlow Shapley<br>

                ><br>

                ><br>

                ><br>

                ><br>

                > --<br>

                > What most experimenters take for granted before

                they begin their experiments is infinitely more

                interesting than any results to which their experiments

                lead.<br>

                > -- Norbert Wiener<br>

                <br>

              </div>

            </div>

          </blockquote>

        </div>

        <br>

      </div>

    </blockquote>

    <br>

    <br>

    <pre class="moz-signature" cols="72">-- 

No one trusts a model except the one who wrote it; Everyone trusts an observation except the one who made it- Harlow Shapley</pre>

  </body>

</html>