On Tue, Nov 10, 2009 at 4:51 AM, Jed Brown <span dir="ltr"><<a href="mailto:jed@59a2.org">jed@59a2.org</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="im"><a href="mailto:jarunan@ascomp.ch">jarunan@ascomp.ch</a> wrote:<br>
> The total number of cells is 744872, divided into 40 blocks. On one<br>
> processor, MatCreateMPIAIJWithArrays() takes 0.097 sec, but it takes 280 sec on<br>
> 4 processors. Usually, this routine has no problem with small test cases;<br>
> it behaves the same on one processor or several.<br>
<br>
</div>This sounds like incorrect preallocation. Is your PETSc built with<br>
debugging? Debug does some extra integrity checks that don't add<br>
significantly to the time (although other Debug checks do), but it would<br>
be useful to know that they pass. In particular, it checks that your<br>
rows are sorted. If they are not sorted then PETSc's preallocation<br>
would be wrong. (I actually don't think this requirement enables<br>
a significantly faster implementation, so I'm tempted to change it to work<br>
correctly with unsorted rows.)<br></blockquote><div><br>I do not think it's preallocation per se, since 1 proc is fast. I think that your<br>partition of rows fed to the MatCreate() call does not match what you provide<br>
to MatSetValues() and thus you do a lot of communication in MatAssemblyEnd().<br>There are 2 ways to debug this:<br><br> 1) Run with -log_summary to see where the time is spent<br><br> 2) Call MatSetOption(A, <b>MAT_NEW_NONZERO_LOCATION_ERR</b>, PETSC_TRUE), which raises an error when an insertion falls outside the preallocated nonzero pattern<br>
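To make the partition mismatch concrete, here is a minimal sketch in plain Python (not PETSc code). It assumes the usual layout in which each rank owns one contiguous block of global rows, in rank order, with sizes given by the per-rank local row counts. The helper names `ownership_ranges` and `count_off_process` and the sample numbers are hypothetical, chosen only for illustration.

```python
from itertools import accumulate


def ownership_ranges(local_rows):
    """Return the [start, end) global row range owned by each rank.

    local_rows[i] is the number of rows rank i declared as local.
    """
    ends = list(accumulate(local_rows))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


def count_off_process(rows_set, rank, ranges):
    """Count the global row indices in rows_set that `rank` does not own."""
    start, end = ranges[rank]
    owned = range(start, end)
    return sum(1 for row in rows_set if row not in owned)


if __name__ == "__main__":
    # Hypothetical example: 3 ranks declaring 3, 3 and 2 local rows.
    ranges = ownership_ranges([3, 3, 2])
    # Hypothetical rows that rank 1 inserts values into.
    rows_from_rank_1 = [3, 4, 5, 6, 0]
    print(ranges)
    print(count_off_process(rows_from_rank_1, 1, ranges))
```

Every row counted by `count_off_process` stands for values that would have to be sent to another rank during assembly, so a large count on any rank is consistent with the time being spent in communication.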
<br> Matt<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
You can also run with -info | grep malloc; there should be no mallocs in<br>
MatSetValues().<br>
<div class="im"><br>
> in the first iteration.<br>
> Mat Ap<br>
><br>
> call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, istorf_no_ovcell, &<br>
> istorf_no_ovcell, PETSC_DETERMINE, PETSC_DETERMINE, rowind, columnind, &<br>
> A, Ap, ierr)<br>
><br>
> call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr)<br>
> call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr)<br>
<br>
</div>This assembly is superfluous (but harmless).<br>
<div class="im"><br>
> Does the communication of MatCreateMPIAIJWithArrays() in parallel<br>
> computation cost a lot? What could cause<br>
> MatCreateMPIAIJWithArrays() to be so slow in the first iteration?<br>
<br>
</div>There is no significant communication; it has to be preallocation.<br>
<font color="#888888"><br>
Jed<br>
<br>
</font></blockquote></div><br><br clear="all"><br>
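Jed's point about sorted rows can be checked on the input arrays before they are handed to MatCreateMPIAIJWithArrays(). The sketch below is plain Python, not PETSc's own debug check. It assumes 0-based CSR arrays and reads "sorted" as column indices strictly increasing within each row, which is an interpretation rather than something the thread states. The function name `check_csr` is hypothetical.

```python
def check_csr(row_ptr, col_idx, n_cols):
    """Return a list of problems found in 0-based CSR arrays (empty if none)."""
    problems = []
    if not row_ptr or row_ptr[0] != 0:
        return ["row_ptr must start at 0"]
    if row_ptr[-1] != len(col_idx):
        problems.append("row_ptr[-1] must equal len(col_idx)")
    for row in range(len(row_ptr) - 1):
        lo, hi = row_ptr[row], row_ptr[row + 1]
        if lo > hi:
            problems.append(f"row {row}: negative length")
            continue
        cols = col_idx[lo:hi]
        # Every column index must lie in [0, n_cols).
        if any(c not in range(n_cols) for c in cols):
            problems.append(f"row {row}: column out of range")
        # Assumption: "sorted" means strictly increasing, so duplicates fail too.
        if any(a >= b for a, b in zip(cols, cols[1:])):
            problems.append(f"row {row}: columns not strictly increasing")
    return problems


if __name__ == "__main__":
    # Two rows, four columns; the second call swaps the first row's columns.
    print(check_csr([0, 2, 4], [0, 2, 1, 3], 4))
    print(check_csr([0, 2, 4], [2, 0, 1, 3], 4))
```

Running a check like this on each rank's local arrays is cheap compared with a 280 sec creation call, and it separates "the input is malformed" from "the partition is wrong".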