# [petsc-users] Fwd: How PETSc solves Ax=b in parallel

paul zhang paulhuaizhang at gmail.com
Tue Oct 22 15:27:06 CDT 2013

```
Appreciate it.
Paul

On Tue, Oct 22, 2013 at 4:25 PM, Matthew Knepley <knepley at gmail.com> wrote:

>
> On Tue, Oct 22, 2013 at 3:16 PM, paul zhang <paulhuaizhang at gmail.com> wrote:
>
>> That is a good one. I mean, for my case, which method can I try?
>>
>
> Both those direct methods are parallel (SuperLU_dist and MUMPS).
>
>   Matt
>
>
>> On Tue, Oct 22, 2013 at 4:14 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Tue, Oct 22, 2013 at 3:12 PM, paul zhang <paulhuaizhang at gmail.com> wrote:
>>>
>>>> One more question, can I solve the system in parallel?
>>>>
>>>
>>> Yes, or you would be using ETSC :)
>>>
>>>    Matt
>>>
>>>
>>>>
>>>> On Tue, Oct 22, 2013 at 4:08 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>
>>>>> On Tue, Oct 22, 2013 at 3:04 PM, huaibao zhang <paulhuaizhang at gmail.com> wrote:
>>>>>
>>>>>> Thanks for the answer. It makes sense.
>>>>>>
>>>>>> However, in my case, matrix A is huge and rather sparse. It also has a
>>>>>> pretty good diagonal structure, although some other elements are
>>>>>> nonzero. I have to look for a way to solve the system more efficiently.
>>>>>> If it can be done in parallel, even better.
>>>>>>
>>>>>> Attached is an example of A's structure. The pink block is a 10x10
>>>>>> matrix. The number of rows or columns in my case can be in the millions.
>>>>>>
>>>>>
>>>>> The analytic character of the operator is usually more important than
>>>>> the sparsity structure for scalable solvers.
>>>>> The pattern matters a lot for direct solvers, and you should
>>>>> definitely try them (SuperLU_dist or MUMPS in PETSc).
>>>>> If they use too much memory or are too slow, then you need to
>>>>> investigate good preconditioners for iterative methods.
>>>>>
>>>>>    Matt
>>>>>
>>>>>
>>>>>> Thanks again.
>>>>>> Paul
>>>>>>
>>>>>>
>>>>>>
>>>>>>  --
>>>>>> Huaibao (Paul) Zhang
>>>>>> *Gas Surface Interactions Lab*
>>>>>> Department of Mechanical Engineering
>>>>>> University of Kentucky,
>>>>>> Lexington, KY, 40506-0503
>>>>>> *Office*: 216 Ralph G. Anderson Building
>>>>>> *Web*:gsil.engineering.uky.edu
>>>>>>
>>>>>> On Oct 21, 2013, at 12:53 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>>>
>>>>>> On Mon, Oct 21, 2013 at 11:23 AM, paul zhang <paulhuaizhang at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Jed,
>>>>>>>
>>>>>>> Thanks a lot for your answer. It really helps. I built parts of the
>>>>>>> matrix on each processor, then collected them into a global one according
>>>>>>> to their global positions. Actually, I used two MPI-specific functions,
>>>>>>> VecCreateMPI and MatCreateMPIAIJ, instead of the ones in the example; with
>>>>>>> them, both the local size and the global size are given. It does not
>>>>>>> really matter, right?
>>>>>>>
>>>>>>> My follow-up question: since the iteration for the system is global,
>>>>>>> would it be more efficient to solve locally instead, i.e. solve parts on
>>>>>>> each processor instead of solving globally?
>>>>>>>
>>>>>>
>>>>>> No, because this ignores the coupling between domains.
>>>>>>
>>>>>>   Matt
>>>>>>
>>>>>>
>>>>>>> Thanks again,
>>>>>>>
>>>>>>> Paul
>>>>>>>
>>>>>>> On Mon, Oct 21, 2013 at 11:42 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>> paul zhang <paulhuaizhang at gmail.com> writes:
>>>>>>>>
>>>>>>>> > I am using KSP, more specifically the FGMRES method, with MPI to
>>>>>>>> > solve an Ax=b system. Here is what I am doing. I cut my computational
>>>>>>>> > domain into many pieces; in each of them I compute independently by
>>>>>>>> > solving the fluid equations. This has nothing to do with PETSc.
>>>>>>>> > Finally, I collect all of the information and load it into a whole A
>>>>>>>> > matrix.
>>>>>>>>
>>>>>>>> I hope you build parts of this matrix on each processor, as is done in
>>>>>>>> the examples.  Note the range Istart to Iend here:
>>>>>>>>
>>>>>>>>
>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html
>>>>>>>>
>>>>>>>> > My question is how PETSc functions work in parallel in my case. I
>>>>>>>> > have two guesses. First, PETSc solves its own matrix for each domain
>>>>>>>> > using the local processor, although A is global. For values like the
>>>>>>>> > number of iterations and the solution vector, there should then be as
>>>>>>>> > many values as the number of processors I used, but I get only one
>>>>>>>> > value for each of them. The reason is that the processors must talk
>>>>>>>> > with each other once all of their work is done, which is why I
>>>>>>>> > received the "all reduced" value. This is more logical than my second
>>>>>>>> > guess.
>>>>>>>>
>>>>>>>> It does not work because the solution operators are global, so to solve
>>>>>>>> the problem, the iteration must be global.
>>>>>>>>
>>>>>>>> > In the second one, the system is solved in parallel too, but the
>>>>>>>> > PETSc function redistributes the global sparse matrix A to each of the
>>>>>>>> > processors after its load is complete. That is to say, each processor
>>>>>>>> > may then not solve its own partition of the matrix.
>>>>>>>>
>>>>>>>> Hopefully you build the matrix already-distributed.  The default
>>>>>>>> _preconditioner_ is local, but the iteration is global.  PETSc does not
>>>>>>>> "redistribute" the matrix automatically, though if you call
>>>>>>>> MatSetSizes() and pass PETSC_DECIDE for the local sizes, PETSc will
>>>>>>>> choose them.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> What most experimenters take for granted before they begin their
>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>> experiments lead.
>>>>>> -- Norbert Wiener
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>

--
Huaibao (Paul) Zhang
*Gas Surface Interactions Lab*
Department of Mechanical Engineering
University of Kentucky,
Lexington, KY, 40506-0503
*Office*: 216 Ralph G. Anderson Building
*Web*:gsil.engineering.uky.edu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 12602 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20131022/b681df09/attachment-0001.png>
```