[petsc-users] Very poor speed up performance

Yongjun Chen yjxd.chen at gmail.com
Thu Dec 23 02:28:48 CST 2010


Matt, Jed, thanks a lot for the discussion. Since a good ordering can
minimize the bandwidth, I think it is really worth trying the matrix
partitioning / ordering. If it gives a factor-of-two increase in the
flop rate, that's quite promising!
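
To make the bandwidth argument concrete, here is a minimal sketch in plain
Python (not PETSc code, and not the solver setup discussed in this thread).
It assumes a toy 1D chain of 16 unknowns, numbered "evens first, then odds"
as a stand-in for a naive by-type ordering, and applies a simplified reverse
Cuthill-McKee pass. All names (bandwidth, reverse_cuthill_mckee,
off_block_entries) are made up for the illustration. In PETSc itself, an
ordering of this kind should be obtainable through MatGetOrdering() with
MATORDERINGRCM.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest |i - j| over nonzeros when vertex order[k] gets index k."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def reverse_cuthill_mckee(adj):
    """Simplified RCM: breadth-first numbering that visits low-degree
    vertices first, followed by a reversal of the whole numbering."""
    visited, order = set(), []
    for start in sorted(adj, key=lambda v: len(adj[v])):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in sorted(adj[u], key=lambda v: len(adj[v])):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    return order[::-1]


def off_block_entries(adj, order, nparts):
    """Nonzeros whose column belongs to a different contiguous row block,
    a rough proxy for the communication needed in a parallel MatMult."""
    n = len(order)
    pos = {v: k for k, v in enumerate(order)}
    return sum(
        pos[u] * nparts // n != pos[v] * nparts // n
        for u in adj
        for v in adj[u]
    )


# Toy problem (an assumption for illustration): a chain 0-1-2-...-15.
n = 16
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

# Naive ordering: all even vertices first, then all odd vertices.
naive = list(range(0, n, 2)) + list(range(1, n, 2))
rcm = reverse_cuthill_mckee(adj)

print("bandwidth  naive:", bandwidth(adj, naive), " rcm:", bandwidth(adj, rcm))
print("off-block  naive:", off_block_entries(adj, naive, 4),
      " rcm:", off_block_entries(adj, rcm, 4))
```

On this chain the bandwidth drops from 8 to 1, and with 4 contiguous row
blocks the number of off-block entries drops from 30 to 6. A real finite
element matrix will not behave this cleanly, so this only shows the
mechanism, not the speedup to expect.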


On Thu, Dec 23, 2010 at 3:32 AM, Jed Brown <jed at 59a2.org> wrote:

> I disagree, there is easily a factor of two in flop/s between a naive
> ordering (e.g. hierarchical by node type in a finite element method) and a
> good low-bandwidth ordering.
>
> This is in the FUN3D papers and still true today, in my experience.
>
> Incomplete factorization is also very order dependent, as you note.
>
> Jed
>
> On Dec 22, 2010 5:03 PM, "Matthew Knepley" <knepley at gmail.com> wrote:
>
> On Wed, Dec 22, 2010 at 10:11 AM, Yongjun Chen <yjxd.chen at gmail.com>
> wrote:
>
> >
> > On Wed, Dec 22, 2010 at 6:53 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> >>
> >> On Wed, 22 De...
> 1) To see a large gain, the ordering you start with would have to be very
> bad. Maybe it is. These orderings try to minimize bandwidth, which means
> minimizing communication in the MatMult.
>
> 2) If you use incomplete factorization, the ordering can have a large
> effect on conditioning, and so on the number of iterations, but that does
> not improve scalability. It would impact scalability if you use a parallel
> IC; however, all those packages reorder your matrix already.
>
> In short, I suspect this will not help a lot, except maybe with
> conditioning, which is what I was referring to in the quote.
>
>     Matt
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more...
>
>