MatMult_Scatter woes

Barry Smith bsmith at mcs.anl.gov
Tue Apr 7 21:02:09 CDT 2009


    Jed,

     Please remind me of what the simplest fix is? Can we just zero  
the vector before the scatter for the MatMult and MatMultTranspose?
What about for MatMultAdd and MatMultTransposeAdd, does anything need  
to be done?

    Barry

Let's no screw around with any thing complicated (and hence likely to  
be buggy).

On Mar 31, 2009, at 8:31 AM, Jed Brown wrote:

> On Mon 2009-03-30 21:28, Barry Smith wrote:
>>
>> On Mar 30, 2009, at 7:48 PM, Matthew Knepley wrote:
>>
>>> On Mon, Mar 30, 2009 at 6:13 PM, Barry Smith <bsmith at mcs.anl.gov>
>>> wrote:
>>>
>>>  I don't think square has anything to do with it. It depends on
>>> whether all the entries in y[i] are hit or if some are skipped, this
>>> depends exactly on the scatter.
>>>
>>>  The first time the mult is called, duplicate x and y temporarility,
>>> fill a xdup with all 1s. Do the scatter, locate all entries of ydup
>>> that are zero, keep a list of these.
>>> destroy xdup, ydup.
>>>
>>>  Now do the MatMult() by zeroing all the entires in the list and  
>>> then
>>> do the scatter, now you have an efficient and correct MatMult. The
>>> drawback of course is the setup time, but in fact that cost is  
>>> only one
>>> scatter (MatMult()) so is likely worth it.
>>>
>>> One could do a similar setup for the transpose?
>>>
>>> I must misunderstand you. The problem above is that INSERT_VALUES is
>>> used, but when two input locations
>>> map to the same output location, they should be added.
>>
>> No, when two map to the same output the scatter is not well defined  
>> and
>> hence the multiply is not defined.
>> When building the scatter we could check for multiple inputs going to
>> the same output but we don't, it will end up
>> just putting one value in and then the other and depending on the  
>> order
>> of the message passing it could be different.
>
> There are two issues, the output index set may be not be injective
> (multiple inputs map to the same output) or the output index set may  
> not
> be surjective (not every entry of the output vector is hit).
>
> When the output index set is not injective, the scatter is ill-defined
> with INSERT_VALUES, but is well-defined with ADD_VALUES.  It is not  
> okay
> to disallow such scatters, the transpose of the global-to-local  
> scatter
> (_p_DA::gtol) is necessarily this way.
>
> The (forward) scatter is well-defined with INSERT_VALUES if (and only
> if) ymax <= 1 after
>
>  VecSet(x,1);
>  VecSet(y,0);
>  VecScatterBegin(scat,x,y,ADD_VALUES,SCATTER_FORWARD);
>  VecScatterEnd(scat,x,y,ADD_VALUES,SCATTER_FORWARD);
>  VecMin(y,&ymin);
>  VecMax(y,&ymax);
>
> If this is not satisfied, then there is no choice but to use the
> implementation
>
>  VecSet(y,0);
>  VecScatterBegin(scat,x,y,ADD_VALUES,SCATTER_FORWARD);
>  VecScatterEnd(scat,x,y,ADD_VALUES,SCATTER_FORWARD);
>
> If ymin == 0, the index set is not surjective so the scatter is not a
> "matrix" unless these entries (the left null space) are set to 0.  In
> this case, we could store the indices of the zero entries in y and do
> the scatter as
>
>  VecScatterBegin(scat,x,y,INSERT_VALUES,SCATTER_FORWARD);
>  VecGetArray(y,&yy);
>  for (i=0; i<scat->to_nnullidx; i++) yy[scat->to_nullidx[i]] = 0;
>  VecRestoreArray(y,&yy);
>  VecScatterEnd(scat,x,y,INSERT_VALUES,SCATTER_FORWARD);
>
> but this will only win if the number of such entries is small compared
> to the size of y.  If the number of zero entries is large, then I  
> don't
> think we can beat
>
>  VecSet(y,0);
>  VecScatterBegin(scat,x,y,INSERT_VALUES,SCATTER_FORWARD);
>  VecScatterEnd(scat,x,y,INSERT_VALUES,SCATTER_FORWARD);
>
>
> We can do the same thing for the transpose.
>
> So the question is, how much should we optimize this operation?  Is it
> necessary to provide something like
>
>  typedef enum {VECSCATTER_IS_INJECTIVE,VECSCATTER_IS_SURJECTIVE}  
> VecScatterOption;
>
>  PetscErrorCode  
> VecScatterSetOption 
> (VecScatter,Vec,Vec,ScatterMode,VecScatterOption,PetscTruth);
>
> which would (analogous to ISSetPermutation()) test whether you are
> telling the truth when in debug mode and believe you in optimized  
> mode?
> This seems excessive since it's fairly unlikely that the cost of the
> VecSet(y,0), ADD_VALUES will be significant to applications.  Who is
> using MATSCATTER anyway (the type name wasn't set prior to my commit
> 6d3d60abcf7d so it hasn't worked in a while)?
>
> Jed




More information about the petsc-dev mailing list