[petsc-users] zero pattern of result of matmatmult

Wed Sep 18 07:06:40 CDT 2013

On Sep 18, 2013, at 3:00 AM, Frederik Treue <frtr at fysik.dtu.dk> wrote:

> On Tue, 2013-09-17 at 07:24 -0700, Jed Brown wrote:
>> Frederik Treue <frtr at fysik.dtu.dk> writes:
>> 
>>> On Tue, 2013-09-17 at 08:15 -0500, Barry Smith wrote:
>>>>    Are you calling MatSetValuesStencil() for each single entry or once per row or block row? Row or block row will be faster than for each entry.
>>> once per entry - this I should definitely improve. What's the logic
>>> behind row or block rows being faster? The fewer calls to
>>> MatSetValuesStencil, the better?
>> 
>> It shares index translation and the call stack.
> 
> Just to be sure: if I could build the matrix in one go (per MPI
> process), i.e. give a list of stencils from all the points handled by
> the process and then make a single MatSetValuesStencil call, would
> this always be faster? Assuming there are no memory concerns.

   No, no, no, do not do that!  Just insert, in a single call, the values that are naturally calculated together. Do not try to gather a big bunch of values and put them in together. For finite differences it is normally natural to have code that computes all the values in a row together, so put them in the matrix with one call to MatSetValuesStencil() per row.

   Barry
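
   A minimal sketch of the "one call per row" pattern for a 2D 9-point box stencil follows. It is deliberately PETSc-free so that it compiles on its own: GridIndex is a hypothetical stand-in for MatStencil, coefficient() is a placeholder for whatever the real operator computes from geometry and fields, and build_row()/assemble() are illustrative names, not PETSc routines. The comment in assemble() marks where the single MatSetValuesStencil() call for the row would go.

```c
#include <assert.h>

/* Hypothetical stand-in for PETSc's MatStencil (grid indices only). */
typedef struct { int i, j; } GridIndex;

/* Placeholder coefficient: a real operator would compute this from the
   geometry and the current fields at grid point (i, j). */
double coefficient(int i, int j, int di, int dj)
{
    (void)i; (void)j;
    return (di == 0 && dj == 0) ? 8.0 : -1.0;
}

/* Gather every entry of the matrix row belonging to grid point (i, j) on
   an mx-by-my grid. Neighbours outside the grid are skipped. Returns the
   number of entries written to cols[] and vals[] (at most 9). */
int build_row(int i, int j, int mx, int my, GridIndex cols[9], double vals[9])
{
    int n = 0;
    assert(i >= 0 && i < mx && j >= 0 && j < my);
    for (int dj = -1; dj <= 1; dj++) {
        for (int di = -1; di <= 1; di++) {
            int ci = i + di, cj = j + dj;
            if (ci < 0 || ci >= mx || cj < 0 || cj >= my) continue;
            cols[n].i = ci;
            cols[n].j = cj;
            vals[n] = coefficient(i, j, di, dj);
            n++;
        }
    }
    return n;
}

/* Visit every grid point and insert its row in one go. Returns the number
   of insertion calls, which is one per row rather than one per entry. */
int assemble(int mx, int my)
{
    int calls = 0;
    for (int j = 0; j < my; j++) {
        for (int i = 0; i < mx; i++) {
            GridIndex cols[9];
            double vals[9];
            int n = build_row(i, j, mx, my, cols, vals);
            /* With PETSc, the single call for this row would go here, e.g.
               MatSetValuesStencil(A, 1, &row, n, cols, vals, INSERT_VALUES)
               with row and cols stored as MatStencil instead of GridIndex. */
            (void)n;
            calls++;
        }
    }
    return calls;
}
```

   On a 4x3 grid this makes 12 insertion calls, where entry-by-entry insertion would make one call for every nonzero. In a real DMDA code the loops would run over the locally owned range only.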

> 
>>> PS. FYI: My stencil is essentially a standard 2D, 9-point box stencil,
>>> but the values in all rows are different (due, among other things, to
>>> the geometric tensor).
>> 
>> That's normal.  Is the geometry changing?
> 
> Both that, and my operator depends explicitly on the fields, which
> obviously do change.
>> 
>>>      ##########################################################
>>>      #                                                        #
>>>      #                          WARNING!!!                    #
>>>      #                                                        #
>>>      #   This code was compiled with a debugging option,      #
>>>      #   To get timing results run ./configure                #
>>>      #   using --with-debugging=no, the performance will      #
>>>      #   be generally two or three times faster.              #
>>>      #                                                        #
>>>      ##########################################################
>> 
>> We weren't joking.
>> 
>> You should configure an optimized build using a different value of
>> PETSC_ARCH so that you can easily switch back and forth between the
>> debug and optimized versions.  Everything will get faster, but
>> especially insertion.
> 
> Ah, so I can't assume that because version 1 of my code is X times
> faster than version 2 when measured with a debug build of PETSc, the
> same will hold for the production build of PETSc... Thanks for the
> heads up :)
> 
> /Frederik
> 
> PS. Barry, the constant parts of my operator require no calculation at
> all, but thanks for the info anyway.
>