[petsc-dev] [Fortran] MatLoad() error: MatLoad is not supported for type: mpiaij!

Jed Brown jed at 59A2.org
Fri Sep 3 11:55:50 CDT 2010


On Fri, 03 Sep 2010 17:28:55 +0200, Jed Brown <jed at 59A2.org> wrote:
>   FAST -O1: 2.0 seconds
>   FAST -O0: 5 seconds
>   SLOW -O1: 2.3 seconds
>   SLOW -O0: more than an hour

The issue here is that my -O0 cases above were also built with
PETSC_USE_DEBUG, which sorts a problem-sized (2M-element) array as a
consistency check in AOCreateBasic.  In the 102^3 case, this didn't
trigger any bad behavior from quicksort, but in the 202x102x102 case,
the sort degraded to quadratic time (and eventually overflowed the
stack, despite using a median-of-3 pivot).
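
For illustration only, here is a minimal sketch of one standard way to keep
quicksort from overflowing the stack: recurse into the smaller partition and
loop over the larger one, so the recursion depth stays O(log n) even when the
partitions are badly unbalanced.  This is not the PETSc code, and the names
(sketch_sort_int, sketch_quicksort, median3, swap_int) are hypothetical.
Note that it bounds only the stack depth; the running time can still be
quadratic on adversarial input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PETSc function): swap two ints. */
static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Order v[lo], v[mid], v[hi] in place and return the median as the pivot. */
static int median3(int *v, ptrdiff_t lo, ptrdiff_t hi)
{
  ptrdiff_t mid = lo + (hi - lo) / 2;
  if (v[mid] < v[lo])  swap_int(&v[mid], &v[lo]);
  if (v[hi]  < v[lo])  swap_int(&v[hi],  &v[lo]);
  if (v[hi]  < v[mid]) swap_int(&v[hi],  &v[mid]);
  return v[mid];
}

/* Sort v[lo..hi] (inclusive bounds). */
static void sketch_quicksort(int *v, ptrdiff_t lo, ptrdiff_t hi)
{
  while (lo < hi) {
    int       pivot = median3(v, lo, hi);
    ptrdiff_t i = lo, j = hi;
    /* Hoare-style partition around the pivot value. */
    while (i <= j) {
      while (v[i] < pivot) i++;
      while (v[j] > pivot) j--;
      if (i <= j) { swap_int(&v[i], &v[j]); i++; j--; }
    }
    /* Now v[lo..j] <= pivot <= v[i..hi].  Recurse into the smaller side
       and iterate on the larger, so the depth is at most about log2(n). */
    if (j - lo < hi - i) {
      sketch_quicksort(v, lo, j);
      lo = i;
    } else {
      sketch_quicksort(v, i, hi);
      hi = j;
    }
  }
}

/* Entry point: sort n ints in ascending order. */
void sketch_sort_int(int *v, size_t n)
{
  assert(v != NULL || n == 0);
  if (n > 1) sketch_quicksort(v, 0, (ptrdiff_t)n - 1);
}
```

Calling sketch_sort_int(a, n) sorts a[0..n-1] in place.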

I have pushed a new quicksort that avoids this bad behavior for all the
inputs I tried (and is also faster than before on all of them).  Here
are the timings for the 202x102x102 test case with the new
implementation:

Debug:

MatAssemblyBegin       8 1.0 5.3647e-01 18.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  2  0  0  0  6   2  0  0  0  8     0
MatAssemblyEnd         8 1.0 1.3259e+00 1.0 0.00e+00 0.0 2.0e+01 2.9e+04 6.3e+01  8  0 38  0 31   8  0 38  0 42     0
MatGetSubMatrice       2 1.0 2.0135e+00 1.0 0.00e+00 0.0 2.0e+01 9.6e+06 1.0e+01 12  0 38 51  5  12  0 38 51  7     0
MatLoad                1 1.0 5.0351e+00 1.0 0.00e+00 0.0 2.1e+01 9.0e+06 4.7e+01 30  0 40 50 23  30  0 40 50 31     0
MatView                1 1.0 5.0372e+00 1.1 0.00e+00 0.0 1.9e+01 9.9e+06 2.9e+01 28  0 37 50 14  28  0 37 50 19     0

Optimized:

MatAssemblyBegin       8 1.0 3.2006e-01 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  2  0  0  0  9   2  0  0  0 15     0
MatAssemblyEnd         8 1.0 8.3971e-01 1.0 0.00e+00 0.0 2.0e+01 2.9e+04 3.6e+01  9  0 38  0 26   9  0 38  0 44     0
MatGetSubMatrice       2 1.0 9.9702e-01 1.0 0.00e+00 0.0 2.0e+01 9.6e+06 1.0e+01 10  0 38 51  7  10  0 38 51 12     0
MatLoad                1 1.0 2.0991e+00 1.0 0.00e+00 0.0 2.1e+01 9.0e+06 2.6e+01 21  0 40 50 19  21  0 40 50 32     0
MatView                1 1.0 1.7195e+00 1.1 0.00e+00 0.0 1.9e+01 9.9e+06 1.9e+01 17  0 37 50 14  17  0 37 50 23     0


As a longer-term matter, it's still possible to construct an input that
degrades to O(n^2).  Does PETSc really want to stick with quicksort, or
would it be worth switching to e.g. heapsort to avoid this?  Sorting is
usually not a performance-sensitive kernel for our use cases, so I'm not
wild about going to a more sophisticated sort to eke out a few more
percent.
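
To make the heapsort alternative concrete, here is a minimal sketch,
assuming a plain int array; the names (sketch_heapsort_int, sift_down) are
hypothetical and this is not a proposed patch.  It runs in O(n log n) in the
worst case, sorts in place, and uses no recursion, so there is no input
that can blow the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the max-heap property for the subtree rooted at `root`,
   considering only the first n entries of v. */
static void sift_down(int *v, size_t root, size_t n)
{
  for (;;) {
    size_t child = 2 * root + 1;               /* left child */
    if (child >= n) return;                    /* root is a leaf */
    if (child + 1 < n && v[child] < v[child + 1]) child++; /* larger child */
    if (v[root] >= v[child]) return;           /* heap property holds */
    int t = v[root]; v[root] = v[child]; v[child] = t;
    root = child;
  }
}

/* Sort n ints in ascending order. */
void sketch_heapsort_int(int *v, size_t n)
{
  assert(v != NULL || n == 0);
  if (n < 2) return;
  /* Build a max-heap bottom-up, starting from the last internal node. */
  for (size_t i = n / 2; i-- > 0;) sift_down(v, i, n);
  /* Repeatedly move the current maximum to the end and shrink the heap. */
  for (size_t end = n - 1; end > 0; end--) {
    int t = v[0]; v[0] = v[end]; v[end] = t;
    sift_down(v, 0, end);
  }
}
```

The usual trade-off is that heapsort tends to be somewhat slower than a
well-behaved quicksort on typical inputs because of its less cache-friendly
access pattern, in exchange for the worst-case guarantee.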

I have not changed any of the sort-with versions.

Jed


