<div dir="ltr">On Tue, Oct 22, 2013 at 3:04 PM, huaibao zhang <span dir="ltr"><<a href="mailto:paulhuaizhang@gmail.com" target="_blank">paulhuaizhang@gmail.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Thanks for the answer. It makes sense. <div><br></div><div>However, in my case, matrix A is huge and rather sparse, which also owns a pretty good diagonal structure although there are some other elements are nonzero. I have to look for a better way to solve the system more efficiently. If in parallel, it is even better. </div>
<div><br></div><div>Attached is an example for A's structure. The pink block is a matrix with 10x10 elements. The row or column in my case can be in million size. </div></div></blockquote><div><br></div><div>The analytic character of the operator is usually more important than the sparsity structure for scalable solvers.</div>
The pattern matters a lot for direct solvers, and you should definitely try
them (SuperLU_dist or MUMPS in PETSc). If they use too much memory or are too
slow, then you need to investigate good preconditioners for iterative methods.
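Selecting such a parallel direct solver could look roughly like the sketch
below (this assumes a KSP object named ksp whose operators are already set;
the comment shows the equivalent runtime options):

    /* Equivalent options: -ksp_type preonly -pc_type lu
     *                     -pc_factor_mat_solver_package superlu_dist   (or mumps) */
    PetscErrorCode ierr;
    PC             pc;
    ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);   /* just apply the factorization */
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
    ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
    /* or: PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);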
   Matt

> Thanks again.
> Paul
>
> [Attached image: example of A's sparsity structure]
<div style="text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:-webkit-auto;font-style:normal;font-weight:normal;line-height:normal;text-transform:none;font-size:medium;white-space:normal;font-family:Helvetica;word-wrap:break-word;word-spacing:0px">
<div><font size="1">--</font></div><div><font size="1">Huaibao (Paul) Zhang<br><b><i>Gas Surface Interactions Lab</i></b><br></font></div><div><font size="1">Department of Mechanical Engineering</font></div><div><font size="1">University of Kentucky,</font></div>
<div><font size="1">Lexington, </font><span style="font-size:x-small">KY, 40506-0503</span></div><div style="margin:0px"><font size="1"><b>Office</b>: 216 Ralph G. Anderson Building<br><b>Web</b>:<a href="http://gsil.engineering.uky.edu/" target="_blank"><span style="color:rgb(0,0,153)">gsil.engineering.uky.edu</span></a></font></div>
</div>
</div>

On Oct 21, 2013, at 12:53 PM, Matthew Knepley <knepley@gmail.com> wrote:

On Mon, Oct 21, 2013 at 11:23 AM, paul zhang <paulhuaizhang@gmail.com> wrote:
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div style="font-size:small">Hi Jed,<br><br>Thanks a lot for your answer. It really helps. I built parts of the matrix on each processor, then collected them into a global one according to their global position. Actually I used two MPI function instead of the one in the example, where the local size, as well as the global size is given.<br>
VecCreateMPI and MatCreateMPIAIJ. It does not really matter right? <br><br></div><div style="font-size:small">My continuing question is since the iteration for the system is global. Is it more efficient if I solve locally instead. ie. solve parts on each of the processor instead of doing globally. <br>
No, because this ignores the coupling between domains.

   Matt
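For reference, a minimal sketch of the two creation calls mentioned in the
question above, with explicit local and global sizes (nlocal, N, and the
preallocation counts are placeholders; MatCreateMPIAIJ is named MatCreateAIJ
in recent PETSc releases):

    PetscErrorCode ierr;
    Vec            x;
    Mat            A;
    ierr = VecCreateMPI(PETSC_COMM_WORLD, nlocal, N, &x);CHKERRQ(ierr);
    ierr = MatCreateAIJ(PETSC_COMM_WORLD, nlocal, nlocal, N, N,
                        10, NULL,   /* rough diagonal-block preallocation */
                        5,  NULL,   /* rough off-diagonal preallocation */
                        &A);CHKERRQ(ierr);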
<div dir="ltr"><div style="font-size:small">Thanks again,<br><br>Paul<br></div><div style="font-size:small"><br><br><br><br><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Oct 21, 2013 at 11:42 AM, Jed Brown <span dir="ltr"><<a href="mailto:jedbrown@mcs.anl.gov" target="_blank">jedbrown@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>paul zhang <<a href="mailto:paulhuaizhang@gmail.com" target="_blank">paulhuaizhang@gmail.com</a>> writes:<br>
<br>
> I am using KSP, more specifically FGMRES method, with MPI to solve Ax=b<br>
> system. Here is what I am doing. I cut my computation domain into many<br>
> pieces, in each of them I compute independently by solving fluid equations.<br>
> This has nothing to do with PETSc. Finally, I collect all of the<br>
> information and load it to a whole A matrix.<br>
<br>
</div>I hope you build parts of this matrix on each processor, as is done in<br>
the examples. Note the range Istart to Iend here:<br>
<br>
<a href="http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html</a><br>

> My question is how the PETSc functions work in parallel in my case. I have
> two guesses. First, PETSc solves its own matrix for each domain on the
> local processor, although A is global. For quantities like the number of
> iterations and the solution vector, there should then be as many of them as
> processors I used, but I get only one of each. The reason would be that the
> processors must talk to each other once all of their work is done, which is
> why I receive the "all reduced" value. This is more logical than my second
> guess.

It does not work because the solution operators are global, so to solve
the problem, the iteration must be global.

> In the second guess, the system is solved in parallel too, but the PETSc
> functions redistribute the global sparse matrix A to each of the processors
> once it has been fully loaded. That is to say, each processor may not solve
> its own partition of the matrix.

Hopefully you build the matrix already-distributed. The default
_preconditioner_ is local, but the iteration is global. PETSc does not
"redistribute" the matrix automatically, though if you call
MatSetSizes() and pass PETSC_DECIDE for the local sizes, PETSc will
choose them.
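A minimal sketch of that MatSetSizes() route, letting PETSc pick the local
sizes (the global size N is a placeholder, and preallocation is omitted for
brevity):

    PetscErrorCode ierr;
    Mat            A;
    ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
    ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
    ierr = MatSetFromOptions(A);CHKERRQ(ierr);
    ierr = MatSetUp(A);CHKERRQ(ierr);
    /* the rows owned by each process can then be queried with MatGetOwnershipRange() */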

--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
   -- Norbert Wiener