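<p>Remember that the code has a different B on every process: each B is a sequential matrix holding only that process's n local columns, which is why MatView shows columns 0 to n-1. You can use a viewer on PETSC_COMM_SELF, but each process should write to a different file so the output isn't all mixed together.</p>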
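<p>As a minimal sketch of the "different files" idea, the helper below builds one output name per rank. The function name rank_filename and the "B_rank%d.m" pattern are hypothetical choices for illustration, not anything PETSc prescribes.</p>

```c
#include <assert.h>
#include <stdio.h>

/* Build a per-rank output name such as "B_rank3.m".
   The "B_rank%d.m" pattern is a hypothetical naming choice.
   Returns 0 on success, -1 if buf is too small to hold the name. */
int rank_filename(char *buf, size_t len, int rank)
{
    assert(buf != NULL && rank >= 0);
    int written = snprintf(buf, len, "B_rank%d.m", rank);
    return (written >= 0 && (size_t)written < len) ? 0 : -1;
}
```

<p>In the PETSc program, the rank would presumably come from MPI_Comm_rank(PETSC_COMM_WORLD, &rank), and the resulting name would be passed to PetscViewerASCIIOpen(PETSC_COMM_SELF, name, &viewer) before calling PetscViewerSetFormat() and MatView(B, viewer).</p>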
<div class="gmail_quote">On Sep 1, 2011 1:33 PM, "Likun Tan" <<a href="mailto:likunt@andrew.cmu.edu">likunt@andrew.cmu.edu</a>> wrote:<br type="attribution">> Thank you very much.<br>> <br>> With the command:<br>
> for(j=0; j<n; j++)<br>> {<br>> for(i=0; i<M; i++)<br>> {<br>> MatSetValues(B, 1, &i, 1, &j, &Value, INSERT_VALUES);<br>> }<br>> }<br>> This only defines the columns from 0 to n-1? How exactly is MPI_Scan()<br>
> being used here?<br>> <br>> And when I try to print out the result with<br>> <br>> PetscViewer viewer;<br>> PetscViewerSetFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);<br>> MatView(B, viewer);<br>> <br>
> I can only see the columns from 0 to n-1. Maybe there is still a problem in<br>> computing B.<br>> <br>> Many thanks,<br>> Likun<br>> <br>> On Thu, September 1, 2011 1:54 pm, Jed Brown wrote:<br>>> On Thu, Sep 1, 2011 at 12:45, Likun Tan <<a href="mailto:likunt@andrew.cmu.edu">likunt@andrew.cmu.edu</a>> wrote:<br>
>><br>>><br>>>> I am still somewhat confused. When computing B before MPI_Scan(), could<br>>>> I compute the values in parallel? After using MPI_Scan(), does that mean<br>>>> the columns of B will be gathered on one processor?<br>
>>><br>>><br>>> No, the scan is just computing the start column for a given process.<br>>><br>>><br>>><br>>>><br>>>> Here are the main steps I took based on your suggestion:<br>
>>><br>>>><br>>>> PetscSplitOwnership(PETSC_COMM_WORLD, &n, &N);<br>>>><br>>>><br>>><br>>> Add these two lines here:<br>>><br>>><br>>> MPI_Scan(&n, &cstart, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);<br>
>> cstart -= n;<br>>><br>>><br>>>> MatCreateSeqDense(PETSC_COMM_SELF, M, n, PETSC_NULL, &B);<br>>>> MatCreateSeqDense(PETSC_COMM_SELF, M, M, PETSC_NULL, &A);<br>>>><br>
>>><br>>><br>>> Good<br>>><br>>><br>>><br>>>> MatCreateSeqDense(PETSC_COMM_SELF, M, N, PETSC_NULL, &x);<br>>>><br>>>><br>>><br>>> Replace with<br>
>><br>>><br>>> MatDuplicate(B,MAT_DO_NOT_COPY_VALUES,&x);<br>>><br>>><br>>><br>>>> for(j=0; j<N; j++)<br>>>><br>>><br>>> change the loop bounds here:<br>
>><br>>> for(j=0; j<n; j++)<br>>><br>>><br>>>> {<br>>>> for(i=0; i<M; i++) {<br>>>><br>>>><br>>><br>>> Good, now compute value as the value that goes in (i,cstart+j).<br>
>><br>>><br>>> MatSetValues(B, 1, &i, 1, &j, &value, INSERT_VALUES);<br>>><br>>>> }<br>>>> }<br>>>> MatAssemblyBegin(...);<br>>>> MatAssemblyEnd(...)<br>
>>><br>>>><br>>><br>>> This part is correct.<br>>><br>>><br>>><br>>>> MPI_Scan(&n, &cstart, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);<br>>>> cstart -= n;<br>
>>><br>>><br>>> We did this already.<br>>><br>>><br>>><br>>>><br>>>> MatConvert(...);<br>>>> MatCholeskyFactor(...);<br>>>> MatMatSolve(...);<br>>>><br>
>>><br>>><br>>> Yes.<br>>><br>>><br>>><br>>> You can gather the matrix x onto all processes if you need the whole<br>>> result everywhere, but for performance reasons if you scale further, you<br>
>> should avoid it if possible.<br></div>
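<p>To make the role of the scan concrete, here is a small serial sketch that emulates, for all ranks at once, what MPI_Scan() followed by cstart -= n yields on each process. The function names are hypothetical, and the even-split formula in local_columns is an assumption about how PetscSplitOwnership() divides N when n is left as PETSC_DECIDE.</p>

```c
#include <assert.h>

/* Local column count for one rank. Assumption: this mimics the even
   split PetscSplitOwnership() produces when n is PETSC_DECIDE, with
   the first (N % size) ranks getting one extra column. */
int local_columns(int N, int size, int rank)
{
    assert(size > 0 && rank >= 0 && rank < size);
    return N / size + ((N % size) > rank ? 1 : 0);
}

/* Serial emulation of MPI_Scan(&n, &cstart, 1, ..., MPI_SUM, ...)
   followed by cstart -= n, computed for every rank. */
void start_columns(const int *n, int *cstart, int size)
{
    int running = 0;
    for (int r = 0; r < size; r++) {
        running += n[r];            /* inclusive prefix sum: MPI_Scan result on rank r */
        cstart[r] = running - n[r]; /* subtract own count: first global column of rank r */
    }
}
```

<p>With N = 10 columns on 3 processes, this gives local counts 4, 3, 3 and start columns 0, 4, 7. Local column j of B on a given process then corresponds to global column cstart + j, and no columns are moved between processes.</p>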