<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
I am getting the opposite result, i.e., MUMPS becomes slower when
using ParMETIS for parallel ordering. What did I mess up? Is the
problem too small?<br>
<br>
<br>
Case 1 took 24.731s<br>
<br>
$ rm -f *vtk; time mpiexec -n 16 ./defmod -f point.inp -pc_type lu
-pc_factor_mat_solver_package mumps -mat_mumps_icntl_4 1
-log_summary > 1.txt<br>
<br>
<br>
Case 2 with "-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2" took
34.720s<br>
<br>
$ rm -f *vtk; time mpiexec -n 16 ./defmod -f point.inp -pc_type lu
-pc_factor_mat_solver_package mumps -mat_mumps_icntl_4 1
-log_summary -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 > 2.txt<br>
<br>
<br>
Both 1.txt and 2.txt are attached.<br>
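<br>
To see which phase accounts for the difference, the symbolic and numeric factorization times can be pulled out of the two logs. The following is only a minimal sketch: the helper name event_time is hypothetical, and it assumes the usual -log_summary event-line layout (event name, count, count ratio, max time in seconds, ...), with MatLUFactorSym and MatLUFactorNum as the events of interest.<br>
<pre>
```python
def event_time(log_text, event):
    """Return the max time in seconds logged for a PETSc event, or None if absent.

    Assumes each event line looks like:
        name  count  count-ratio  max-time  time-ratio  ...
    """
    for line in log_text.splitlines():
        fields = line.split()
        # Only consider lines whose first token is exactly the event name.
        if fields[:1] == [event]:
            try:
                # Fourth column is assumed to be the max time across ranks.
                return float(fields[3])
            except (IndexError, ValueError):
                # Not an event line in the assumed layout; keep scanning.
                continue
    return None
```
</pre>
For example, event_time(open("1.txt").read(), "MatLUFactorSym") compared against the same call on 2.txt should show whether the extra time is spent in the analysis phase.<br>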
<br>
Regards,<br>
<br>
Tabrez<br>
<br>
On 01/29/2014 09:18 AM, Hong Zhang wrote:
<blockquote
cite="mid:CAGCphBv5gxC+grq_TtQYcztKfgD6-PPFNCGmH1bYrOvFQpKg2w@mail.gmail.com"
type="cite">
<div dir="ltr">MUMPS now supports parallel symbolic factorization.
With petsc-3.4 interface, you can use runtime option
<div><br>
<div>
<div> -mat_mumps_icntl_28 &lt;1&gt;: ICNTL(28): use 1 for
sequential analysis and icntl(7) ordering, or 2 for
parallel analysis and icntl(29) ordering </div>
<div> -mat_mumps_icntl_29 &lt;0&gt;: ICNTL(29): parallel
ordering 1 = ptscotch, 2 = parmetis </div>
</div>
</div>
<div><br>
</div>
<div>e.g., '-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2'
activates parallel symbolic factorization with ParMETIS for
matrix ordering. </div>
<div>Give it a try and let us know what you get.</div>
<div><br>
</div>
<div>Hong</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Tue, Jan 28, 2014 at 5:48 PM, Smith,
Barry F. <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
On Jan 28, 2014, at 5:39 PM, Matthew Knepley <<a
moz-do-not-send="true" href="mailto:knepley@gmail.com">knepley@gmail.com</a>>
wrote:<br>
<br>
> On Tue, Jan 28, 2014 at 5:25 PM, Tabrez Ali <<a
moz-do-not-send="true"
href="mailto:stali@geology.wisc.edu">stali@geology.wisc.edu</a>>
wrote:<br>
> Hello<br>
><br>
> This is my observation as well (with MUMPS). The
first solve (after assembly which is super fast) takes a
few mins (for ~1 million unknowns on 12/24 cores) but from
then on only a few seconds for each subsequent solve for
each time step.<br>
><br>
> Perhaps symbolic factorization in MUMPS is all
serial?<br>
><br>
> Yes, it is.<br>
<br>
</div>
I missed this. I was just assuming a PETSc LU. Yes, I
have no idea of the relative times of symbolic and numeric
factorization for those other packages.<br>
<span class="HOEnZb"><font color="#888888"><br>
Barry<br>
</font></span>
<div class="HOEnZb">
<div class="h5">><br>
> Matt<br>
><br>
> Like the OP, I often do multiple runs on the same
problem, but I don't know whether MUMPS or any other direct
solver can save the symbolic factorization info to a
file that could be reused in subsequent reruns
to avoid the costly "first solves".<br>
><br>
> Tabrez<br>
><br>
><br>
> On 01/28/2014 04:04 PM, Barry Smith wrote:<br>
> On Jan 28, 2014, at 1:36 PM, David Liu<<a
moz-do-not-send="true" href="mailto:daveliu@mit.edu">daveliu@mit.edu</a>>
wrote:<br>
><br>
> Hi, I'm writing an application that solves a sparse
matrix many times using Pastix. I notice that the first
solve takes a very long time,<br>
> Is it the first “solve” or the first time you
put values into that matrix that “takes a long time”? If
you are not properly preallocating the matrix then the
initial setting of values will be slow and waste memory.
See <a moz-do-not-send="true"
href="http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html"
target="_blank">http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html</a><br>
><br>
> The symbolic factorization is usually much
faster than a numeric factorization so that is not the
cause of the slow “first solve”.<br>
><br>
> Barry<br>
><br>
><br>
><br>
> while the subsequent solves are very fast. I don't
fully understand what's going on behind the curtains,
but I'm guessing it's because the very first solve has
to read in the non-zero structure for the LU
factorization, while the subsequent solves are faster
because the nonzero structure doesn't change.<br>
><br>
> My question is, is there any way to save the
information obtained from the very first solve, so that
the next time I run the application, the very first
solve can be fast too (provided that I still have the
same nonzero structure)?<br>
><br>
><br>
> --<br>
> No one trusts a model except the one who wrote it;
Everyone trusts an observation except the one who made
it- Harlow Shapley<br>
><br>
><br>
><br>
><br>
> --<br>
> What most experimenters take for granted before
they begin their experiments is infinitely more
interesting than any results to which their experiments
lead.<br>
> -- Norbert Wiener<br>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
No one trusts a model except the one who wrote it; Everyone trusts an observation except the one who made it- Harlow Shapley</pre>
</body>
</html>