[petsc-dev] Bug in MatMatMultTranspose for sequential AIJ matrices from petsc-dev?
Hong Zhang
hzhang at mcs.anl.gov
Wed Nov 9 11:52:07 CST 2011
Mark,
Contrary to what we thought, R*A*Rt turns out to be more difficult than PtAP
because the sparse dot product
is inefficient. So far, Barry's cool idea only works well on A*Rt for your
ex56.c.
We are still trying to understand the results we get and are exploring ....
It is not ready for you to dive into R*A*Rt yet :-(
I'll keep you informed about our progress.
Hong
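
[Editor's sketch] To make the sparse dot product concern concrete: when A and R are both stored by rows, entry (i, j) of A*Rt is the dot product of row i of A with row j of R, and each such product has to merge two sorted index lists even when the rows share no columns. The Python below is a minimal illustration under that assumption. It is not the PETSc implementation; the function names and the small 4x4 / 2x4 matrices are hypothetical.

```python
def sparse_dot(cols_a, vals_a, cols_b, vals_b):
    """Dot product of two sparse rows given as sorted column indices plus values.

    Returns (value, matched), where matched says whether the rows share a column.
    """
    i = j = 0
    total = 0.0
    matched = False
    while i < len(cols_a) and j < len(cols_b):
        if cols_a[i] == cols_b[j]:
            total += vals_a[i] * vals_b[j]
            matched = True
            i += 1
            j += 1
        elif cols_a[i] < cols_b[j]:
            i += 1
        else:
            j += 1
    return total, matched


def mat_mult_transpose(A, R):
    """C = A * R^T, with A and R stored as lists of (cols, vals) rows."""
    C = []
    attempts = 0  # number of row-pair dot products tried
    for cols_a, vals_a in A:
        row = {}
        for j, (cols_r, vals_r) in enumerate(R):
            attempts += 1
            value, matched = sparse_dot(cols_a, vals_a, cols_r, vals_r)
            if matched:  # keep only structurally nonzero entries
                row[j] = value
        C.append(row)
    return C, attempts


# Hypothetical data: a 4x4 tridiagonal A and a 2x4 aggregation-style R.
A = [
    ([0, 1], [2.0, -1.0]),
    ([0, 1, 2], [-1.0, 2.0, -1.0]),
    ([1, 2, 3], [-1.0, 2.0, -1.0]),
    ([2, 3], [-1.0, 2.0]),
]
R = [
    ([0, 1], [1.0, 1.0]),
    ([2, 3], [1.0, 1.0]),
]
C, attempts = mat_mult_transpose(A, R)
```

In this naive form every (row of A, row of R) pair is visited, so the number of dot products grows with rows(A) * rows(R) regardless of how few entries the result has. That is one way to see why the dot-product formulation is harder to make fast than a row-wise product such as the one used for A*P.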
On Wed, Nov 9, 2011 at 10:59 AM, Mark F. Adams <mark.adams at columbia.edu> wrote:
>
> On Nov 9, 2011, at 11:02 AM, Barry Smith wrote:
>
>>
>> On Nov 9, 2011, at 7:58 AM, Mark F. Adams wrote:
>>
>>> FYI: I am not getting great flop rates out of these methods on my Mac:
>>
>> Known issue. Do you have alternative algorithms that would crank it up? This is something we are actively working on.
>>
>
> I don't know what you're doing now. Hong mentioned that you had some good ideas for optimizing the code, based on what we looked at together when I was at Argonne.
>
> One generic idea for RAP that has come to mind since we last talked is folding the two parts, 1) T = A*P and 2) RAP = R*T, together. I have not looked at this in detail, but perhaps instead of computing the whole "T", compute it in parts (a row, an element ...), call each one T_i, use each part right away (RAP += R*T_i) in one big loop, and then throw it away. This might improve cache performance because each T_i will still be high in the cache when it is used.
>
> It sounds like the serial code is stable now. I will dive into it this week, finally figure out what you are doing exactly, and see if I can come up with any ideas. This is a hard problem and I'm not even sure what fast is here.
>
> Mark
>
>> Barry
>>
>>>
>>> MatMatMult 2 1.0 1.1382e+00 1.0 8.48e+07 1.0 0.0e+00 0.0e+00 4.0e+00 4 1 0 0 2 12 8 0 0 2 75
>>> MatPtAPNumeric 2 1.0 4.3557e+00 1.0 7.82e+08 1.0 0.0e+00 0.0e+00 0.0e+00 14 11 0 0 0 45 74 0 0 0 180
>>> MatTrnMatMult 2 1.0 6.0777e-01 1.0 3.31e+07 1.0 0.0e+00 0.0e+00 8.0e+00 2 0 0 0 4 6 3 0 0 4 55
>>>
>>> KSPSolve 1 1.0 2.9164e+00 1.0 1.57e+09 1.0 0.0e+00 0.0e+00 0.0e+00 10 21 0 0 0 100100 0 0 0 538
>>>
>>> Mark
>>>
>>> On Nov 9, 2011, at 10:47 AM, Hong Zhang wrote:
>>>
>>>>> Hong, please update src/docs/website/documentation/changes/dev.html when you
>>>>> make API changes.
>>>> Done and pushed to petsc-dev.
>>>>
>>>> Hong
>>>>
>>>
>>
>>
>
>
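
[Editor's sketch] Mark's folding idea quoted above (form T = A*P one piece at a time and consume each piece immediately) can be sketched as follows. This is only an illustration under stated assumptions: R = Pt, so column i of R is row i of P; T_i is taken to be one row of T; matrices are lists of {column: value} dicts. The function name and the test matrices are hypothetical, and nothing here reflects how PETSc actually implements MatPtAP.

```python
def fused_ptap(A, P, n_coarse):
    """Compute P^T * A * P while holding only one row of T = A*P at a time.

    A and P are lists of sparse rows, each row a dict {column: value}.
    """
    RAP = [dict() for _ in range(n_coarse)]
    for i, a_row in enumerate(A):
        # Step 1: T_i = (row i of A) * P, kept only for this iteration.
        t_i = {}
        for j, a_ij in a_row.items():
            for c, p_jc in P[j].items():
                t_i[c] = t_i.get(c, 0.0) + a_ij * p_jc
        # Step 2: RAP += R[:, i] * T_i. With R = P^T, column i of R is row i of P.
        for k, r_ki in P[i].items():
            out = RAP[k]
            for c, t_ic in t_i.items():
                out[c] = out.get(c, 0.0) + r_ki * t_ic
        # t_i is discarded here; the full T is never stored.
    return RAP


# Hypothetical data: 4x4 tridiagonal A, 4x2 aggregation-style P.
A = [
    {0: 2.0, 1: -1.0},
    {0: -1.0, 1: 2.0, 2: -1.0},
    {1: -1.0, 2: 2.0, 3: -1.0},
    {2: -1.0, 3: 2.0},
]
P = [{0: 1.0}, {0: 1.0}, {1: 1.0}, {1: 1.0}]
RAP = fused_ptap(A, P, 2)
```

Whether this ordering actually helps cache behavior depends on the sparsity pattern and on how the accumulation into RAP is stored; the sketch shows only the loop structure, not a performance claim.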