[petsc-dev] GAMG and custom MatMults in smoothers

Mon Jul 2 10:19:13 CDT 2018

> On 2 Jul 2018, at 16:38, Matthew Knepley <knepley at gmail.com> wrote:
> 
> On Mon, Jul 2, 2018 at 9:33 AM Vaclav Hapla <vaclav.hapla at erdw.ethz.ch> wrote:
> 
> 
>> On 2 Jul 2018, at 15:50, Matthew Knepley <knepley at gmail.com> wrote:
>> 
>> On Mon, Jul 2, 2018 at 8:28 AM Vaclav Hapla <vaclav.hapla at erdw.ethz.ch> wrote:
>>> On 2 Jul 2018, at 15:05, Matthew Knepley <knepley at gmail.com> wrote:
>>> 
>>> On Mon, Jul 2, 2018 at 7:54 AM Vaclav Hapla <vaclav.hapla at erdw.ethz.ch> wrote:
>>> 
>>> 
>>>> On 2 Jul 2018, at 14:48, Matthew Knepley <knepley at gmail.com> wrote:
>>>> 
>>>> On Mon, Jul 2, 2018 at 3:48 AM Vaclav Hapla <vaclav.hapla at erdw.ethz.ch> wrote:
>>>> Barry wrote:
>>>>>   This could get ugly real fast, for example, for vector operations, there may be dozens of named vectors and each one gets its own logging? You'd have to make sure that only the objects you care about get named, is that possible?
>>>>> 
>>>>>    I don't know if there is a good solution within the PETSc logging infrastructure to get what you want but maybe what you propose is the best possible.
>>>> 
>>>> As I suggest, this behavior would be only triggered by a specific option.
>>>> 
>>>> I think there are actually 4 strings which could be used as an event name suffix in log view:
>>>> 1) name
>>>> 2) prefix
>>>> 3) type
>>>> 4) custom string (set by something like PetscObjectSetLogViewSuffix)
>>>> I think the best would be to let the user choose by offering -log_view_by_{name,prefix,type,suffix}.
>>>> 
>>>> For example, with -log_view_by_prefix, you could readily distinguish PCTelescope outer and inner apply, because you would see a separate "PCApply (telescope_)" event.
>>>> With -log_view_by_type, you would see PCApply (telescope).
>>>> 
>>>> I think this would be useful because the current class-wide events like MatMult or PCApply aggregate very different operations, some of which are essentially free while others are hotspots.
>>>> 
>>>> 
>>>> Stefano wrote:
>>>>> The issue with this sort of “dynamic” logging is that now PETSc requires PetscLogEvent created during the registration of the class, so that all the ranks in PETSC_COMM_WORLD have the same events registered.
>>>>> What you propose is not generally supported for this specific reason.
>>>>> 
>>>>> Your “log_name” may work if users register their own classes (with their own LogEvents created properly), and currently we don’t have support (maybe I’m wrong) to add an “InitializePackage” method for the users’ registered classes.
>>>> 
>>>> 
>>>> I don't agree. What I suggest is basically the ability to create object-wise events automatically, so it _can't_ be managed during class registration. In the presence of the respective option, the event would be created during PetscLogEventBegin by taking the class-wide event's name, appending the suffix, and registering a new event. The event id would be stored in the PetscObject structure.
>>>> 
>>>> 
>>>> Matt wrote:
>>>>> As people have pointed out, this would not work well for Events. However, this is exactly what stages are for.
>>>>> Use separate stages for the different types of MatMult. I did this, for example, when looking at performance
>>>>> on different MG levels.
>>>> 
>>>> Yes, performance on different MG levels is a nice use case. But I don't understand how you inject stages into MatMults. To me it's exactly the same problem as with events: you have to define MatMult_custom, where you take the original mult and wrap it in PetscStageLogPush/Pop, and then use MatSetOperation to redefine MatMult. Or do you mean something more elegant?
>>>> 
>>>> You could do that, but usually I think of stages as being structural. I think for your example I would push/pop the stage
>>>> inside your Mat operation wrapper (I don't see why you need another one), and this behavior could be controlled with
>>>> another option so you could turn it off.
>>> 
>>> I meant hierarchies, typically of Mats or PCs, where you don't define any custom operations but compose existing types (which should be encouraged, I believe). So no "my" wrapper. As I wrote below:
>>> 
>>>>>>  Think e.g. of having additive MATCOMPOSITE wrapping multiplicative MATCOMPOSITE wrapping MATTRANSPOSE wrapping MATAIJ. You want to measure this MATAIJ instance's MatMult separately but you surely don't want to rewrite implementation of MatMult_Transpose or force yourself to use MATSHELL just to hang the events on MatMult*.
>>> 
>>> 
>>> It's not enough to make separate stages for additive MC, multiplicative MC, and MT? If you want stages for every single
>>> combination created dynamically, you can push another stage when each of these combinations is created using GetTag()
>>> or something like that. You could switch between these behaviors with an option.
>> 
>> I'm not sure I understand. Do you mean registering and pushing/popping these stages in the user's code? You can surely call PetscStageLogRegister somewhere after PetscInitialize, but where do you place your PetscStageLogPush/Pop calls?
>> 
>> No. You would create a stage when the MATCOMPOSITE is created (or once when any MATCOMPOSITE is created), and push/pop
>> on application.
> 
> There's surely no problem with creating that stage. But I still don't see how you can push/pop around a specific MatMult if it's aggregated with other MatMults in a higher-level MatMult, without redefining the latter. And there can be an arbitrary number of such levels. If you push/pop your stage in the code calling the top-level MatMult, you can't distinguish the different MatMults occurring inside.
> 
> I do not understand what you mean. Each operation would happen in a separate stage. It's time for a specific, simple example
> where you think this would break. This could be coded in 10 minutes.

OK, now I got you. But I didn't mean creating a bunch of tests for any possible hierarchy and running them just to get the separated time. I hoped that any hierarchy specifiable just from options could be flexibly reflected in log view without any coding, additionally with realistic times and counts.

I see a lot of possible benefits. Within PETSc, it's not that obvious with MatMult, but think of PCSetUp, PCApply, KSPSetUp, KSPSolve. Think of your talk's slide-wide example of composed KSPs and PCs specified just from options. Don't you think it would be nice to be able to flexibly get separate counts, times, flops, MPI messages and so on per user-specified stage?

In my opinion, the flexibility of the log view lags behind the flexibility of the always-advertised composable solvers. With my proposal, you would get it just by specifying -log_view_by_prefix.
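
To make the idea concrete, here is a toy, PETSc-free C model of the mechanism described earlier in this thread: when the option is on, the first "begin" on an object builds a name from the class-wide event plus the object's prefix, registers it lazily, and caches the id in the object. All names here (toy_event_register, toy_event_begin, ToyObject, toy_log_by_prefix) are hypothetical and not PETSc API. The sketch also ignores the concern raised above that events should be registered consistently on all ranks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_EVENTS 32
#define TOY_NAME_LEN   64

/* Toy event table; a stand-in for real logging tables (not PETSc API). */
typedef struct {
  char name[TOY_NAME_LEN];
  int  count; /* number of toy_event_begin() calls charged to this event */
} ToyEvent;

static ToyEvent toy_events[TOY_MAX_EVENTS];
static int      toy_num_events = 0;

/* Models the proposed -log_view_by_prefix option: 0 = off, 1 = on. */
int toy_log_by_prefix = 0;

/* Toy object: an options prefix plus one cached derived-event id
   per class-wide event. */
typedef struct {
  const char *prefix;                  /* e.g. "telescope_", or NULL */
  int         derived[TOY_MAX_EVENTS]; /* -1 until the derived event exists */
} ToyObject;

void toy_object_init(ToyObject *obj, const char *prefix)
{
  obj->prefix = prefix;
  for (int i = 0; i < TOY_MAX_EVENTS; i++) obj->derived[i] = -1;
}

/* Return the id of the event with this name, registering it first
   if it does not exist yet. */
int toy_event_register(const char *name)
{
  for (int i = 0; i < toy_num_events; i++)
    if (strcmp(toy_events[i].name, name) == 0) return i;
  assert(toy_num_events < TOY_MAX_EVENTS);
  snprintf(toy_events[toy_num_events].name, TOY_NAME_LEN, "%s", name);
  toy_events[toy_num_events].count = 0;
  return toy_num_events++;
}

const char *toy_event_name(int event)  { return toy_events[event].name; }
int         toy_event_count(int event) { return toy_events[event].count; }

/* Begin an event on obj; returns the id of the event actually charged. */
int toy_event_begin(int class_event, ToyObject *obj)
{
  int event = class_event;
  assert(class_event >= 0 && class_event < toy_num_events);
  if (toy_log_by_prefix && obj && obj->prefix) {
    if (obj->derived[class_event] < 0) {
      /* First use: build "<class event> (<prefix>)" and register lazily. */
      char name[TOY_NAME_LEN];
      snprintf(name, sizeof(name), "%s (%s)",
               toy_events[class_event].name, obj->prefix);
      obj->derived[class_event] = toy_event_register(name);
    }
    event = obj->derived[class_event];
  }
  toy_events[event].count++;
  return event;
}
```

With the flag on, a begin of "MatMult" on an object with prefix "telescope_" is charged to a separate "MatMult (telescope_)" entry, while objects without a prefix keep using the class-wide event.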

>  
> And I don't understand what the advantage is, in this context, of using stages (which are typically registered by the user and, from his perspective, generally span multiple different operations/function calls) over events, which are meant precisely for single operations/function calls (at least I hope so, based on thousands of use cases in PETSc itself).
> 
> Events are semantically associated with given operations. Stages are semantically associated with the context of an operation.
> So defining a bunch of events for MatMult does not make sense to me, whereas defining a bunch of stages for different MatMults
> definitely does.
>  
> You want me to create a stage which is shorter than the wrapping event, which I think is much uglier than what I propose :-)
> 
> I don't. I think it follows exactly the toplevel design.

OK, I really don't like it, but I don't want to argue about aesthetics. A hard argument, though, is that only one stage can be active at a time (hence the Push/Pop name), so what I wanted is simply not doable your way. For this reason there's probably no hard-wired stage anywhere in PETSc.
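
A toy model of that constraint, with no PETSc involved: stages form a stack, and work is charged only to the stage on top. All names (toy_stage_push, toy_stage_pop, toy_work) are hypothetical, and whether real PETSc stage accounting is exclusive in exactly this way is an assumption of the sketch.

```c
#include <assert.h>

#define TOY_MAX_STAGES 16
#define TOY_MAX_DEPTH  16

/* Time charged to each stage; stage 0 plays the role of the main stage. */
static double toy_stage_seconds[TOY_MAX_STAGES];
/* Stack of active stage ids; zero-initialized, so stage 0 sits at the bottom. */
static int    toy_stage_stack[TOY_MAX_DEPTH];
static int    toy_stage_depth = 1;

/* Make `stage` the single active stage until the matching pop. */
void toy_stage_push(int stage)
{
  assert(stage >= 0 && stage < TOY_MAX_STAGES);
  assert(toy_stage_depth < TOY_MAX_DEPTH);
  toy_stage_stack[toy_stage_depth++] = stage;
}

/* Return to the previously active stage; the main stage is never popped. */
void toy_stage_pop(void)
{
  assert(toy_stage_depth > 1);
  toy_stage_depth--;
}

/* Only the top of the stack is active. */
int toy_stage_active(void) { return toy_stage_stack[toy_stage_depth - 1]; }

/* Charge some work to the active stage only. */
void toy_work(double seconds) { toy_stage_seconds[toy_stage_active()] += seconds; }

double toy_stage_time(int stage) { return toy_stage_seconds[stage]; }
```

In this model, pushing an inner stage while an outer one is active moves the inner work out of the outer stage's total. Nested events behave differently: as noted further down in the thread, the outer MatMult line still contains the time of the inner one.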

Vaclav

> 
>   Thanks,
> 
>      Matt
>  
> Vaclav
> 
>> 
>>   Matt
>>  
>> Thanks
>> 
>> Vaclav
>> 
>>> 
>>> The reason I think this is preferable is that we do not mess with any logging infrastructure, we just use stages inside of other objects.
>>> 
>>>   Thanks,
>>> 
>>>      Matt
>>>  
>>> Thanks
>>> 
>>> Vaclav
>>> 
>>> 
>>>> 
>>>>   Matt
>>>>  
>>>> Thanks
>>>> 
>>>> Vaclav
>>>> 
>>>> 
>>>> 
>>>>> On 29 Jun 2018, at 22:42, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jun 29, 2018, at 9:33 AM, Vaclav Hapla <vaclav.hapla at erdw.ethz.ch> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 22 Jun 2018, at 17:47, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 22, 2018, at 5:43 AM, Pierre Jolivet <pierre.jolivet at enseeiht.fr> wrote:
>>>>>>>> 
>>>>>>>> Hello,
>>>>>>>> I’m solving a system using a MATSHELL and PCGAMG.
>>>>>>>> The MPIAIJ Mat I’m giving to GAMG has a specific structure (inherited from the MATSHELL) I’d like to exploit during the solution phase when the smoother on the finest level is doing MatMults.
>>>>>>>> 
>>>>>>>> Is there some way to:
>>>>>>>> 1) decouple in -log_view the time spent in the MATSHELL MatMult and in the smoothers MatMult
>>>>>>> 
>>>>>>> You can register a new event and then inside your MATSHELL MatMult() call PetscLogEventBegin/End on your new event.
>>>>>>> 
>>>>>>>  Note that the MatMult() line will still contain the time for your MatShell mult, so you will need to subtract it off to get the time for your non-shell MatMults.
>>>>>> 
>>>>>> In PERMON, we sometimes have quite complicated hierarchy of wrapped matrices and want to measure MatMult{,Transpose,Add,TransposeAdd} separately for particular ones. Think e.g. of having additive MATCOMPOSITE wrapping multiplicative MATCOMPOSITE wrapping MATTRANSPOSE wrapping MATAIJ. You want to measure this MATAIJ instance's MatMult separately but you surely don't want to rewrite implementation of MatMult_Transpose or force yourself to use MATSHELL just to hang the events on MatMult*.
>>>>>> 
>>>>>> We had a special wrapper type that just added a prefix to the events for the given object, but this is not nice. What about adding functionality to PetscLogEventBegin/End that would distinguish events based on the first PetscObject's name or option prefix? Optionally, of course, so as not to break anyone relying on the current behavior, e.g. under something like -log_view_by_name. To me it's quite an elegant solution that works for any PetscObject and any event.
>>>>> 
>>>>>   This could get ugly real fast, for example, for vector operations, there may be dozens of named vectors and each one gets its own logging? You'd have to make sure that only the objects you care about get named, is that possible?
>>>>> 
>>>>>    I don't know if there is a good solution within the PETSc logging infrastructure to get what you want but maybe what you propose is the best possible.
>>>>> 
>>>>>   Barry
>>>>> 
>>>>>> 
>>>>>> I can do that if I get some upvotes.
>>>>>> 
>>>>>> Vaclav
>>>>>> 
>>>>>>> 
>>>>>>>> 2) hardwire a specific MatMult implementation for the smoother on the finest level
>>>>>>> 
>>>>>>> In the latest release you do MatSetOperation() to override the normal matrix vector product with anything else you want. 
>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks in advance,
>>>>>>>> Pierre
>>>>>>>> 
>>>>>>>> PS : here is what I have right now,
>>>>>>>> MatMult              118 1.0 1.0740e+02 1.6 1.04e+13 1.6 1.7e+06 6.1e+05 0.0e+00 47100 90 98  0 47100 90 98  0 81953703
>>>>>>>> […]
>>>>>>>> PCSetUp                2 1.0 8.6513e+00 1.0 1.01e+09 1.7 2.6e+05 4.0e+05 1.8e+02  5  0 14 10 66   5  0 14 10 68 94598
>>>>>>>> PCApply               14 1.0 8.0373e+01 1.1 9.06e+12 1.6 1.3e+06 6.0e+05 2.1e+01 45 87 72 78  8 45 87 72 78  8 95365211 // I’m guessing a lot of time here is being wasted in doing inefficient MatMults on the finest level but this is only speculation
>>>>>>>> 
>>>>>>>> Same code with -pc_type none -ksp_max_it 13,
>>>>>>>> MatMult               14 1.0 1.2936e+01 1.7 1.35e+12 1.6 2.0e+05 6.1e+05 0.0e+00 15100 78 93  0 15100 78 93  0 88202079
>>>>>>>> 
>>>>>>>> The grid itself is rather simple (two levels, extremely aggressive coarsening),
>>>>>>>> type is MULTIPLICATIVE, levels=2 cycles=v
>>>>>>>> KSP Object: (mg_coarse_) 1024 MPI processes
>>>>>>>> linear system matrix = precond matrix:
>>>>>>>>   Mat Object: 1024 MPI processes
>>>>>>>>     type: mpiaij
>>>>>>>>     rows=775, cols=775
>>>>>>>>     total: nonzeros=1793, allocated nonzeros=1793
>>>>>>>> 
>>>>>>>> linear system matrix followed by preconditioner matrix:
>>>>>>>> Mat Object: 1024 MPI processes
>>>>>>>> type: shell
>>>>>>>> rows=1369307136, cols=1369307136
>>>>>>>> Mat Object: 1024 MPI processes
>>>>>>>> type: mpiaij
>>>>>>>> rows=1369307136, cols=1369307136
>>>>>>>> total: nonzeros=19896719360, allocated nonzeros=19896719360
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>>> 
>>>> https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>