I see the same thing with MUMPS. Assembly is very fast, but the first 
solve takes a few minutes (for ~1 million unknowns on 12/24 cores). 
After that, each subsequent solve (one per time step) takes only a few 
seconds.

Perhaps symbolic factorization in MUMPS is all serial?

Like the OP, I often do multiple runs on the same problem, but I don't 
know whether MUMPS or any other direct solver can save the symbolic 
factorization info to a file that could be reused in subsequent reruns 
to avoid the costly "first solves".
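
To make the question concrete, here is a toy sketch in plain Python of 
what "saving the symbolic factorization" could mean: compute the fill 
pattern once, write it to a file, and reload it on a later run if the 
nonzero structure is unchanged. This is not PETSc, MUMPS or PaStiX code. 
The function names and the JSON layout are made up for illustration, and 
it assumes a structurally symmetric pattern, natural ordering and no 
pivoting. Whether the real solvers expose anything like this is exactly 
what I don't know.

```python
import json
import os


def symbolic_fill(n, pattern):
    """Return the nonzero pattern of L+U as a set of (i, j) pairs.

    Toy assumptions: structurally symmetric pattern, natural ordering,
    no pivoting.
    """
    # Adjacency lists of the (symmetrized) elimination graph.
    adj = [set() for _ in range(n)]
    for i, j in pattern:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)

    filled = {(i, i) for i in range(n)}
    for k in range(n):
        higher = sorted(v for v in adj[k] if v > k)
        for v in higher:
            filled.add((k, v))
            filled.add((v, k))
        # Eliminating k connects its not-yet-eliminated neighbours.
        for a in higher:
            for b in higher:
                if a != b:
                    adj[a].add(b)
    return filled


def cached_symbolic_fill(n, pattern, path):
    """Load the fill pattern from `path` if it matches, else compute it.

    Returns (filled, from_cache).
    """
    key = [list(p) for p in sorted(pattern)]
    if os.path.exists(path):
        with open(path) as f:
            saved = json.load(f)
        # Only reuse the file if the nonzero structure is identical.
        if saved["n"] == n and saved["pattern"] == key:
            return {tuple(p) for p in saved["filled"]}, True

    filled = symbolic_fill(n, pattern)
    with open(path, "w") as f:
        json.dump({"n": n, "pattern": key, "filled": sorted(filled)}, f)
    return filled, False
```

The cache is keyed on the full nonzero pattern, so a rerun with a 
different structure falls back to recomputing instead of reusing stale 
data.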


On 01/28/2014 04:04 PM, Barry Smith wrote:
> On Jan 28, 2014, at 1:36 PM, David Liu <daveliu at mit.edu> wrote:
>> Hi, I'm writing an application that solves a sparse matrix many times using Pastix. I notice that the first solve takes a very long time,
>    Is it the first “solve” or the first time you put values into that matrix that “takes a long time”? If you are not properly preallocating the matrix then the initial setting of values will be slow and waste memory.  See http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html
>    The symbolic factorization is usually much faster than a numeric factorization so that is not the cause of the slow “first solve”.
>     Barry
>> while the subsequent solves are very fast. I don't fully understand what's going on behind the curtains, but I'm guessing it's because the very first solve has to read in the non-zero structure for the LU factorization, while the subsequent solves are faster because the nonzero structure doesn't change.
>> My question is, is there any way to save the information obtained from the very first solve, so that the next time I run the application, the very first solve can be fast too (provided that I still have the same nonzero structure)?
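
On the preallocation point Barry raises above: preallocating means 
telling the matrix how many nonzeros each row will hold before any 
values are set. A minimal sketch in plain Python of how those per-row 
counts could be gathered from a list of (row, col) entries is below. The 
function name is hypothetical, and it assumes each process owns a 
contiguous row range and the matching column range; the resulting 
arrays are of the kind a routine such as MatXAIJSetPreallocation takes 
as its dnnz and onnz arguments (the PETSc call itself is not shown).

```python
def count_nnz(entries, row_start, row_end):
    """Count nonzeros per locally owned row in [row_start, row_end).

    Returns (dnnz, onnz): for each local row, the number of distinct
    columns inside and outside the local diagonal block. Assumes the
    owned column range equals the owned row range.
    """
    n_local = row_end - row_start
    dnnz = [0] * n_local
    onnz = [0] * n_local
    seen = set()
    for i, j in entries:
        # Skip rows owned elsewhere and repeated (i, j) contributions.
        if not (row_start <= i < row_end) or (i, j) in seen:
            continue
        seen.add((i, j))
        if row_start <= j < row_end:
            dnnz[i - row_start] += 1
        else:
            onnz[i - row_start] += 1
    return dnnz, onnz
```

Repeated (row, col) pairs are counted once, since assembling several 
contributions into the same entry does not need extra storage.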

