[petsc-users] Error using MUMPS to solve large linear system
Barry Smith
bsmith at mcs.anl.gov
Tue Feb 25 09:57:19 CST 2014
On Feb 25, 2014, at 8:23 AM, Samar Khatiwala <spk at ldeo.columbia.edu> wrote:
> Hi Sherry,
>
> Thanks for the offer to help!
>
> I tried superlu_dist again and it crashes even more quickly than MUMPS with just the following error:
>
> ERROR: 0031-250 task 128: Killed
This is usually a symptom of running out of memory.
>
> Absolutely nothing else is written out to either stderr or stdout. This is with -mat_superlu_dist_statprint.
> The program works fine on a smaller matrix.
>
> This is the sequence of calls:
>
> KSPSetType(ksp,KSPPREONLY);
> PCSetType(pc,PCLU);
> PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);
> KSPSetFromOptions(ksp);
> PCSetFromOptions(pc);
> KSPSolve(ksp,b,x);
>
> All of these successfully return *except* the very last one to KSPSolve.
>
> Any help would be appreciated. Thanks!
>
> Samar
>
> On Feb 24, 2014, at 3:58 PM, Xiaoye S. Li <xsli at lbl.gov> wrote:
>
>> Samar:
>> If you include the error message from the superlu_dist crash, I can probably tell you the reason. (Better yet, include the printout before the crash.)
>>
>> Sherry
>>
>>
>> On Mon, Feb 24, 2014 at 9:56 AM, Hong Zhang <hzhang at mcs.anl.gov> wrote:
>> Samar:
>> There are limitations for direct solvers.
>> Do not expect that any solver can be used on arbitrarily large problems.
>> Since superlu_dist also crashes, direct solvers may not be able to work on your application.
>> This is why I suggest increasing the size incrementally.
>> You may have to experiment with other types of solvers.
>>
>> Hong
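
One way to act on the "increase size incrementally" advice is to fix a ladder of problem sizes up front and run each rung, watching memory, until one fails. The sketch below is only an illustration: the function name size_ladder and the growth factor are assumptions of this example, not anything from PETSc or MUMPS.

```c
#include <stddef.h>

/* Hypothetical helper: fill sizes[] with a geometric ladder of problem
 * sizes from n_start up to n_target (inclusive), multiplying by `factor`
 * at each step. Returns the number of entries written (at most max_steps);
 * the last entry written is always n_target. */
size_t size_ladder(long n_start, long n_target, double factor,
                   long *sizes, size_t max_steps)
{
    size_t count = 0;
    long n = n_start;

    if (n_start <= 0 || n_target < n_start || factor <= 1.0) return 0;

    /* Keep one slot free so the target size itself always fits. */
    while (n < n_target && count + 1 < max_steps) {
        long next = (long)(n * factor);
        sizes[count++] = n;
        if (next <= n) next = n + 1;   /* guard against a stalled ladder */
        n = next;
    }
    if (count < max_steps) sizes[count++] = n_target;
    return count;
}
```

For example, starting at 1000 unknowns with a target of 8000 and a factor of 2 gives the rungs 1000, 2000, 4000, 8000; the first rung that fails brackets the size at which the factorization stops fitting.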
>>
>> Hi Hong and Jed,
>>
>> Many thanks for replying. It would indeed be nice if the error messages from MUMPS were less cryptic!
>>
>> 1) I have tried smaller matrices although given how my problem is set up a jump is difficult to avoid. But a good idea
>> that I will try.
>>
>> 2) I did try various orderings but not the one you suggested.
>>
>> 3) Tracing the error through the MUMPS code suggests a rather abrupt termination of the program (there should be more
>> error messages if, for example, memory was a problem). I therefore thought it might be an interface problem rather than
>> one with mumps and turned to the petsc-users group first.
>>
>> 4) I've tried superlu_dist but it also crashes (also unclear as to why) at which point I decided to try mumps. The fact that both
>> crash would again indicate a common (memory?) problem.
>>
>> I'll try a few more things before asking the MUMPS developers.
>>
>> Thanks again for your help!
>>
>> Samar
>>
>> On Feb 24, 2014, at 11:47 AM, Hong Zhang <hzhang at mcs.anl.gov> wrote:
>>
>>> Samar:
>>> The crash occurs in
>>> ...
>>> [161]PETSC ERROR: Error in external library!
>>> [161]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-1, INFO(2)=48
>>>
>>> For a very large matrix, this is likely a memory problem, as you suspected.
>>> I would suggest
>>> 1. run problems with increased sizes (not jump from a small one to a very large one) and observe memory usage using
>>> '-ksp_view'.
>>> I see you use '-mat_mumps_icntl_14 1000', i.e., percentage of estimated workspace increase. Is it too large?
>>> Anyway, this input should not cause the crash, I guess.
>>> 2. experiment with different matrix orderings -mat_mumps_icntl_7 <> (I usually use sequential ordering 2)
>>> I see you use parallel ordering -mat_mumps_icntl_29 2.
>>> 3. send a bug report to the MUMPS developers for their suggestions.
>>>
>>> 4. try other direct solvers, e.g., superlu_dist.
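
To make the INFO(1)=-1, INFO(2)=48 pair and the -mat_mumps_icntl_14 value above a little less cryptic, here is a minimal sketch in plain C. The function names are hypothetical, only a few codes are covered, and the meanings are my reading of the MUMPS user guide, so check them against the guide for the installed version.

```c
#include <stdio.h>

/* Hypothetical helper: write a short description of a MUMPS
 * (INFO(1), INFO(2)) pair into buf. Returns snprintf's result. */
int describe_mumps_info(int info1, int info2, char *buf, size_t len)
{
    switch (info1) {
    case 0:
        return snprintf(buf, len, "no error");
    case -1:
        /* The real cause is reported on the process named in INFO(2). */
        return snprintf(buf, len,
                        "error occurred on process %d; inspect that "
                        "process's own INFO(1)", info2);
    case -9:
        return snprintf(buf, len,
                        "main internal real/complex workarray too small; "
                        "consider increasing ICNTL(14)");
    case -13:
        return snprintf(buf, len,
                        "workspace allocation failed (INFO(2)=%d)", info2);
    default:
        return snprintf(buf, len,
                        "INFO(1)=%d, INFO(2)=%d: see the MUMPS user guide",
                        info1, info2);
    }
}

/* ICNTL(14) is a percentage increase applied to the estimated workspace,
 * so the relaxed size is estimate * (1 + ICNTL(14)/100). */
double relaxed_workspace(double estimate, int icntl14)
{
    return estimate * (1.0 + icntl14 / 100.0);
}
```

Read this way, the message printed by process 161 only points at process 48, and -mat_mumps_icntl_14 1000 asks for roughly 11 times the estimated workspace, which may be worth lowering if memory is the suspect.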
>>>
>>> …
>>>
>>> etc etc. The above error I can tell has something to do with processor 48 (INFO(2)) and so forth but not the previous one.
>>>
>>> The full output enabled with -mat_mumps_icntl_4 3 looks as in the attached file. Any hints as to what could be giving this
>>> error would be very much appreciated.
>>>
>>> I do not know how to interpret this output file. The MUMPS developers could give you better suggestions on it.
>>> I would appreciate learning as well :-)
>>>
>>> Hong
>>
>>
>>
>