[petsc-dev] sor smoothers
Mark F. Adams
mfadams at lbl.gov
Tue Aug 13 13:17:29 CDT 2013
>
>>
>> MatMult 2 1.0 1.1801e-02 1.0 1.16e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 981
>> MatSOR 3 1.0 4.6818e-02 1.0 1.78e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 16 0 0 0 0 16 0 0 0 380
>>
>> Thus we see that we save all of the MatMult time, which is 2 of the 5 units that SOR needs. In terms of flops computed that is 40% of the work, but only 20% of the time.
>>
>> On the post-smooth of the multigrid cycle there is a nonzero initial guess, so Eisenstat does
>>
>> if (nonzero) {
>> ierr = VecCopy(x,eis->b[pc->presolvedone-1]);CHKERRQ(ierr);
>> ierr = MatSOR(eis->A,eis->b[pc->presolvedone-1],eis->omega,SOR_APPLY_UPPER,0.0,1,1,x);CHKERRQ(ierr);
>>
>> so an extra .5 work unit
>>
>> while Chebychev does the matrix vector product to get the initial residual so
>>
>> Eisenstat is 3 units + .5 unit + 1 unit = 4.5 units
>> SOR 5 units + 1 unit = 6 units
>>
>> so for combined pre and post smooth Eisenstat/SOR = 7.5/11 work units
>
> I think that is right, and indeed, that looks like enough benefit to
> justify converting the matrix format.
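The work-unit tally quoted above can be checked with a few lines of C. This is only a sketch of the arithmetic in the quoted message, assuming one unit is roughly one pass over the matrix; the type and function names (SmootherCost, eisenstat_cost, sor_cost, total_cost, cost_ratio) are made up for illustration and are not PETSc API.

```c
#include <assert.h>

/* Work units for one smoother application, as counted in the quoted
   message. One unit is taken to be roughly one pass over the matrix. */
typedef struct {
  double pre;  /* pre-smooth, zero initial guess     */
  double post; /* post-smooth, nonzero initial guess */
} SmootherCost;

/* Eisenstat: 3 units for the pre-smooth (the 5 SOR units minus the
   2 MatMult units saved). The post-smooth adds 0.5 unit for the
   SOR_APPLY_UPPER call and 1 unit for the initial residual. */
SmootherCost eisenstat_cost(void)
{
  SmootherCost c = {3.0, 3.0 + 0.5 + 1.0};
  return c;
}

/* Plain SOR: 5 units for the pre-smooth, plus 1 unit on the
   post-smooth for the initial residual. */
SmootherCost sor_cost(void)
{
  SmootherCost c = {5.0, 5.0 + 1.0};
  return c;
}

/* Combined pre- and post-smooth cost. */
double total_cost(SmootherCost c)
{
  return c.pre + c.post;
}

/* Ratio of two combined costs, e.g. Eisenstat relative to SOR. */
double cost_ratio(SmootherCost a, SmootherCost b)
{
  assert(total_cost(b) > 0.0);
  return total_cost(a) / total_cost(b);
}
```

Calling cost_ratio(eisenstat_cost(), sor_cost()) gives 7.5/11, so in this accounting Eisenstat does a bit over two thirds of the work of SOR per V-cycle smoothing pair.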
Just to be clear: the current Eisenstat code (MatSOR) uses a standard AIJ matrix (obviously) but applies SOR with only the U or L terms, so it has some logic to skip entries (e.g., skipping L+D when processing U). If we had native U and L matrices, then we should be able to recover most of the ~2x performance penalty that Barry is showing.
If I'm on the right page, then we would probably want this new matrix to have a MatMult that applies U, D, and L in one shot. (It might be good to fold these together if performance is limited by cache misses on the source vector.)
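To make the "one shot" idea concrete, here is a minimal sketch in plain C, not PETSc code. It assumes a hypothetical split storage in which the strictly lower part L and the strictly upper part U are each kept in CSR form and the diagonal D is a separate array; the names TriCSR, SplitMat, and split_matmult are invented for illustration. Each row of L, D, and U is applied in the same loop iteration, so the source vector x is swept once rather than three times.

```c
#include <assert.h>

/* Hypothetical CSR storage for one strictly triangular part. */
typedef struct {
  int           n;      /* number of rows                         */
  const int    *rowptr; /* row i occupies [rowptr[i], rowptr[i+1]) */
  const int    *col;    /* column indices                         */
  const double *val;    /* nonzero values                         */
} TriCSR;

/* Hypothetical split matrix A = L + D + U. */
typedef struct {
  int           n;
  TriCSR        L;    /* strictly lower triangle */
  const double *diag; /* diagonal entries        */
  TriCSR        U;    /* strictly upper triangle */
} SplitMat;

/* y = (L + D + U) x, processing all three parts of row i together. */
void split_matmult(const SplitMat *A, const double *x, double *y)
{
  assert(A->L.n == A->n && A->U.n == A->n);
  for (int i = 0; i < A->n; i++) {
    double sum = A->diag[i] * x[i];
    /* strictly lower entries of row i */
    for (int k = A->L.rowptr[i]; k < A->L.rowptr[i + 1]; k++)
      sum += A->L.val[k] * x[A->L.col[k]];
    /* strictly upper entries of row i */
    for (int k = A->U.rowptr[i]; k < A->U.rowptr[i + 1]; k++)
      sum += A->U.val[k] * x[A->U.col[k]];
    y[i] = sum;
  }
}
```

With this layout, an SOR sweep that needs only L or only U can walk that part directly without the skip logic, while the fused loop above keeps MatMult at roughly the cost of an ordinary AIJ product. Whether it actually recovers the penalty would have to be measured.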