<div class="gmail_quote">On Thu, Sep 1, 2011 at 12:45, Likun Tan <span dir="ltr"><<a href="mailto:likunt@andrew.cmu.edu">likunt@andrew.cmu.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":12l">I still have some confusions. When computing B before MPI_Scan(), could i<br>
compute the values in parallel? After using MPI_Scan(), does that mean the<br>
columns of B will be gathered in one processor?<br></div></blockquote><div><br></div><div>No, the scan is just computing the start column for a given process.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":12l">
<br>
Here are main steps i took based on your suggestion,<br>
<div class="im"><br>
PetscSplitOwnership(PETSC_COMM_WORLD, &n, &N);<br></div></div></blockquote><div><br></div><div>Add these two lines here:</div><div><br></div><div><div><div><div>MPI_Scan(&n, &cstart, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);</div>
<div>cstart -= n;</div></div></div></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div id=":12l"><div class="im">
MatCreateSeqDense(PETSC_COMM_SELF, M, n, PETSC_NULL, &B);<br>
</div>MatCreateSeqDense(PETSC_COMM_SELF, M, M, PETSC_NULL, &A);<br></div></blockquote><div><br></div><div>Good</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":12l">
MatCreateSeqDense(PETSC_COMM_SELF, M, N, PETSC_NULL, &x);<br></div></blockquote><div><br></div><div>Replace with</div><div><br></div><div>MatDuplicate(B,MAT_DO_NOT_COPY_VALUES,&x);</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":12l">
<div class="im">for(j=0; j<N; j++)<br></div></div></blockquote><div><br></div><div>change the loop bounds here:</div><div><br></div><div>for(j=0; j<n; j++)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":12l"><div class="im">
{<br>
for(i=0; i<M; i++)<br>
{<br></div></div></blockquote><div><br></div><div>Good, now compute value as the value that goes in (i,cstart+j).</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div id=":12l"><div class="im">
</div> MatSetValues(B, 1, &i, 1, &j, &value, INSERT_VALUES);<br>
}<br>
}<br>
MatAssemblyBegin(...);<br>
MatAssemblyEnd(...)<br></div></blockquote><div><br></div><div>This part is correct.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div id=":12l">
<div class="im">MPI_Scan(&n, &cstart, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);<br>
cstart -= n;<br></div></div></blockquote><div><br></div><div>We did this already.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div id=":12l"><div class="im">
<br>
</div>MatConvert(...);<br>
MatCholeskyFactor(...);<br>
MatMatSolve(...);</div></blockquote></div><br><div>Yes.</div><div><br></div><div><br></div><div>You can gather the matrix x onto all processes if you need the whole result everywhere, but for performance reasons if you scale further, you should avoid it if possible.</div>