[codes-ross-users] CODES LP Design Question

Jenkins, Jonathan P. jenkins at mcs.anl.gov
Thu May 14 14:43:11 CDT 2015


The callback functionality you describe has been used in a few places.
Notably, the "resource" LP in codes-base implements such a method (see
codes/resource-lp.h, resource_lp_* functions). It's not the prettiest, but
it gets the job done. Generalizing the concept and making it a bit easier
to work with in codes has been in the back of my mind for a while, but I
haven't done anything with it.

John

On 5/14/15, 1:51 PM, "Phil Carns" <carns at mcs.anl.gov> wrote:

>What I described in this thread turned out to be a bad idea.
>
>The problem with handing off a function pointer is that the target LP
>(thing2 in the example below) might not be on the same process as the
>caller (thing1), so in the general case it isn't safe to let it hang
>onto a function pointer to issue the completion event.
>
>The lsm/local-storage-model API has this same issue (a higher-level
>model wants to hand control off to a disk model and then just
>proceed with its own work once the disk access is complete).  LSM
>solves this safely from a technical point of view by providing functions
>for the caller to allocate an appropriate event, and letting the caller
>specify its own event struct that will be handed back on completion.
>This is safe regardless of how your LPs are organized, but it still bugs
>me a little because a) it looks burly in the code and b) the
>target LP doesn't have a way to pass results back in the completion event.
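
The return-event approach described above can be sketched roughly as follows. This is a toy illustration only, not the actual lsm API: every name here (disk_request, delivery, make_disk_request, disk_complete, client_event) is hypothetical. The idea is that the caller embeds a copy of its own completion event in the request as opaque bytes, so nothing process-specific such as a function pointer crosses between LPs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RETURN_BYTES 64

/* Request sent to a hypothetical disk LP. It embeds a copy of the
 * caller's completion event as opaque bytes, so no pointers cross LPs. */
struct disk_request {
    unsigned long caller_gid;
    size_t return_size;
    unsigned char return_event[MAX_RETURN_BYTES];
};

/* What the disk LP emits once the simulated access completes. */
struct delivery {
    unsigned long dest_gid;
    size_t size;
    unsigned char payload[MAX_RETURN_BYTES];
};

/* Caller side: package the caller's own event struct into the request. */
struct disk_request make_disk_request(unsigned long caller_gid,
                                      const void *return_event, size_t size)
{
    struct disk_request req = { .caller_gid = caller_gid, .return_size = size };
    assert(size <= MAX_RETURN_BYTES);
    memcpy(req.return_event, return_event, size);
    return req;
}

/* Disk side: on completion, hand the caller's bytes back untouched. */
struct delivery disk_complete(const struct disk_request *req)
{
    struct delivery d = { .dest_gid = req->caller_gid, .size = req->return_size };
    memcpy(d.payload, req->return_event, req->return_size);
    return d;
}

/* Caller's private event type; the disk model never sees this definition. */
struct client_event { int type; int request_id; };

/* Round trip: build a request, "complete" it, and decode the returned event. */
int roundtrip_request_id(int request_id)
{
    struct client_event done = { .type = 7, .request_id = request_id };
    struct disk_request req = make_disk_request(3, &done, sizeof done);
    struct delivery d = disk_complete(&req);
    struct client_event got;
    assert(d.dest_gid == 3 && d.size == sizeof got);
    memcpy(&got, d.payload, sizeof got);
    return got.type == 7 ? got.request_id : -1;
}
```

Because disk_complete() copies the bytes back without interpreting them, the sketch also shows limitation (b): the target has no field of its own in which to report a result.
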
>
>Anyway, I'll putter around with this and share if I come up with
>something cleaner, but I wanted to at least respond to this thread as a
>warning in the meantime :)
>
>thanks,
>-Phil
>
>On 04/08/2015 10:44 AM, Carns, Philip H. wrote:
>> I've never tried to modularize steps like this in a single LP, but it is
>> a little similar to some cross-LP scenarios I've run into before.  Here
>> comes a long-winded example :)
>>
>> Imagine you have two LPs (that run on the same simulated node), called
>> thing1 and thing2.  You want the thing1 procedure to invoke thing2, and
>> then for thing1 to continue once thing2 is done.
>>
>> If I were to do this now I think I would set it up like this. Assume
>> that thing1 and thing2 each have their own .c and .h files.
>>
>> thing2.h has this prototype:
>>
>>     void start_thing2(lp, gid, completion_fn_ptr);
>>
>> This function will construct and send an event to the thing2 LP
>> (identified by the gid argument) to kick off whatever it is going to
>> do.  This function will be executed within a thing1 LP event handler,
>> but the definition of the function is in thing2.c.  The purpose of this
>> arrangement is to let thing1 send an event to thing2 without having
>> access to thing2's event structure definition.  As long as you keep the
>> API stable then thing2.c can do whatever it needs to do in terms of
>> refactoring its event structure without perturbing thing1.
>>
>> The lp argument is the lp pointer for the *calling* LP, and is probably
>> needed to allocate a new event.
>>
>> The completion_fn_ptr is a function pointer to a function defined by the
>> caller (thing1 in this case) that sends an event back to the original
>> LP.  This is the mechanism for thing2 to tell thing1 it is done so that
>> thing1 can continue operation.  Since it is an opaque function pointer,
>> thing2 remains oblivious to the event structure definition of thing1 (or
>> even what kind of LP thing1 is).  You can make the completion_fn_ptr
>> signature anything that you like, so if thing2 has some result to report
>> back, then it can be passed as an argument to that function and stuffed
>> into the event that heads back to thing1.
>>
>> Of course you can also add whatever arbitrary arguments you want to
>> start_thing2() to pass input parameters it needs to start whatever
>> thing2 is doing.  Those would get stuffed into the event destined for
>> thing2.
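
The start_thing2() arrangement described above might look roughly like this. It is a single-process toy, not CODES or ROSS code: the lp struct, lp_id, the one-slot "queue", thing2_handle(), thing1_done() and run_example() are all assumptions made for illustration. As the follow-up at the top of this thread warns, a function pointer stored in an event is only valid when both LPs live in the same process.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for simulator types (hypothetical, not the ROSS definitions). */
typedef unsigned long lp_id;
struct lp { lp_id gid; int done; int result; };

/* ---- thing2.h: the only part thing1 needs to see ---- */
typedef void (*completion_fn)(struct lp *sender, lp_id caller_gid, int result);
void start_thing2(struct lp *caller, lp_id thing2_gid, int input,
                  completion_fn done);

/* ---- thing2.c: event layout stays private to thing2 ---- */
struct thing2_event { lp_id caller_gid; int input; completion_fn done; };
static struct thing2_event pending;  /* one-slot stand-in for the event queue */
static int has_pending;

void start_thing2(struct lp *caller, lp_id thing2_gid, int input,
                  completion_fn done)
{
    (void)thing2_gid;  /* a real model would address the event to this LP */
    assert(caller != NULL && done != NULL);
    pending = (struct thing2_event){ caller->gid, input, done };
    has_pending = 1;
}

/* thing2's event handler: do the work, then invoke the opaque callback. */
void thing2_handle(struct lp *self)
{
    assert(has_pending);
    has_pending = 0;
    int result = pending.input * 2;  /* placeholder for thing2's real work */
    pending.done(self, pending.caller_gid, result);
}

/* ---- thing1.c ---- */
static struct lp lps[2];  /* gid 0 is thing1, gid 1 is thing2 */

/* Defined by the caller; a real model would build and send a thing1 event here. */
void thing1_done(struct lp *sender, lp_id caller_gid, int result)
{
    (void)sender;
    lps[caller_gid].result = result;
    lps[caller_gid].done = 1;
}

int run_example(int input)
{
    lps[0].gid = 0;
    lps[1].gid = 1;
    lps[0].done = 0;
    start_thing2(&lps[0], 1, input, thing1_done);  /* from a thing1 handler */
    thing2_handle(&lps[1]);                        /* simulator delivers it */
    assert(lps[0].done);
    return lps[0].result;
}
```

Note that thing1 never touches struct thing2_event and thing2 never sees a thing1 event type; the callback signature is the whole contract between them.
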
>>
>> Unfortunately I haven't actually done this so I don't have an example.
>> My thinking on how to organize this has evolved over time and I haven't
>> gotten around to trying this particular permutation :)  I've done
>> something very similar except for the function pointer; I have a few
>> examples that have an explicitly defined function that jumps back from thing2
>> to thing1.  That works fine, but it is bad news if you want more than
>> one type of LP to be able to call thing2.
>>
>> A related theme here is that I'm a fan (possibly in the minority on
>> this) of splitting up different complex procedures that happen on the
>> same simulated node into separate LPs.  As long as you have them share
>> node resources (NIC, storage, etc.) then the net result is the same in
>> terms of the simulation outcome.  The positive is that you don't have
>> event handler explosion (big switch statements to handle different
>> cases).  The negative is that you have to jump between these LPs by
>> sending events with small timestamps.  Once you start doing that, then
>> parallel conservative execution is practically out of the question; you
>> have to do serial execution or parallel optimistic execution to handle
>> those extra small events in a reasonable way.
>>
>> This might be overkill for your scenario, but I thought I would throw it
>> out there.  I wanted to collect my thoughts on this anyway.  In general
>> I don't think there is any particular way around this without juggling
>> callbacks and such; really the best thing you can hope for right now is
>> to minimize entanglement between different LP types in your code.
>>
>> -Phil
>>
>> On 04/08/2015 06:47 AM, Ross, Robert B. wrote:
>>> I think the way to think about this is that the event payload carries
>>> the state associated with the operation and gets passed around between
>>> LPs as needed to simulate the steps. If you do this right, there
>>> shouldn't be local LP state associated with the operation -- everything
>>> that is needed is in the event payload.
>>>
>>> There might be state at LPs associated with the LP itself, such as
>>> what objects exist or what names are in the namespace. Or not, if you
>>> are just assuming things exist, for instance.
>>>
>>> John, Phil, and/or Misbah may need to correct me...of course.
>>>
>>> -- Rob
>>>
>>>> On Apr 7, 2015, at 4:09 PM, Jenkins, Jonathan P. <jenkins at mcs.anl.gov> wrote:
>>>>
>>>> Hi Joe,
>>>>
>>>> So if I understand right, the context is you want to simulate some
>>>> POSIX operations of interest, such as mkdir, and that the client is
>>>> capable of performing the stat (the lookup) using purely local
>>>> information (hence the self-event)?
>>>>
>>>> In general, it's useful to think of events in these systems as RPC
>>>> calls, where the event structure you are setting up contains the
>>>> parameters, and there are corresponding return parameters that the
>>>> calling LP expects to receive. In this frame of thought, each RPC call
>>>> and each return is issued as a discrete event. In your example, you
>>>> could have your "mkdir" RPC event
>>>> handler perform an additional "lookup/stat" RPC to check the path's
>>>> existence, and upon receiving the "return" event, either make the
>>>> directory or fail.
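
The mkdir/lookup flow above could be sketched like this. It is a self-contained toy with a FIFO standing in for the simulator's event queue; the event types, struct sim, send_event(), handle(), run() and simulate_mkdir() are all hypothetical and none of it is CODES or ROSS API. Each "call" and each "return" is a separate event, and the payload carries everything the operation needs between steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event types for a toy client LP. */
enum ev_type { EV_MKDIR, EV_LOOKUP, EV_LOOKUP_RET };

/* The event carries the operation's parameters and return values. */
struct event {
    enum ev_type type;
    char path[64];
    bool found;  /* "return value" of the lookup RPC */
};

/* Minimal FIFO standing in for the simulator's event queue. */
#define QCAP 16
struct sim {
    struct event q[QCAP];
    size_t head, tail;
    const char *existing;  /* the single path that already exists */
    int mkdir_ok, mkdir_failed;
};

void send_event(struct sim *s, const struct event *e)
{
    assert(s->tail < QCAP);
    s->q[s->tail++] = *e;
}

/* Single event handler for the client LP. */
void handle(struct sim *s, const struct event *e)
{
    struct event out = *e;  /* carry the payload forward */
    switch (e->type) {
    case EV_MKDIR:       /* "call" the lookup RPC, then return to the scheduler */
        out.type = EV_LOOKUP;
        send_event(s, &out);
        break;
    case EV_LOOKUP:      /* perform the lookup and issue the "return" event */
        out.type = EV_LOOKUP_RET;
        out.found = (strcmp(e->path, s->existing) == 0);
        send_event(s, &out);
        break;
    case EV_LOOKUP_RET:  /* continuation of mkdir: make the directory or fail */
        if (e->found)
            s->mkdir_failed++;
        else
            s->mkdir_ok++;
        break;
    }
}

/* Drain the queue, handing each event to the handler in order. */
void run(struct sim *s)
{
    while (s->head < s->tail) {
        struct event e = s->q[s->head++];
        handle(s, &e);
    }
}

/* Simulate one mkdir of `path` when only `existing` is already present. */
bool simulate_mkdir(const char *existing, const char *path)
{
    struct sim s = { .existing = existing };
    struct event e = { .type = EV_MKDIR };
    strncpy(e.path, path, sizeof(e.path) - 1);
    send_event(&s, &e);
    run(&s);
    return s.mkdir_ok == 1;
}
```

For example, simulate_mkdir("/a", "/b") succeeds because the lookup finds nothing, while simulate_mkdir("/a", "/a") fails because the path already exists.
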
>>>>
>>>> Unfortunately, event-driven programming more-or-less necessitates
>>>> callback-heavy code, which can be quite awkward in some contexts.
>>>> We've talked in the past about ways to at least standardize the way we
>>>> do this in the context of CODES, but nothing has materialized as of yet.
>>>>
>>>> Hope that helps. Something tells me that the assumption about clients
>>>> being capable of performing metadata operations without outside
>>>> interaction with e.g. a storage server is not quite right, though I
>>>> could be misunderstanding.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>>> On 4/7/15, 3:43 PM, "Joe Scott" <tscott2 at g.clemson.edu> wrote:
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I am having a hard time wrapping my head around the programming
>>>>> paradigm here, and I wonder if you might offer some guidance on a
>>>>> better way to use CODES.
>>>>>
>>>>> So I am trying to process these higher level tasks (POSIX tasks like
>>>>> mkdir) by launching the subevents as separate processes.
>>>>>
>>>>> The specific case that is tying me in knots is a user issuing a
>>>>> mkdir. It launches the mkdir event handler, which needs to perform a
>>>>> lookup on the path of the mkdir.
>>>>>
>>>>> So I need to send an event from this client LP to itself to perform
>>>>> the lookup.  But I also need the lookup, upon completion, to relaunch
>>>>> the mkdir task.
>>>>>
>>>>> Speaking it over with some of my lab mates, they seem to think I am
>>>>> either overthinking it or trying to use the wrong tool for the job.
>>>>>
>>>>> Is this a use case you guys are familiar with?  Can you shed some
>>>>> light on this situation?
>>>>>
>>>>> I feel like there should be a way to do this without getting into
>>>>> callback/completion function hell.
>>>>>
>>>>> Thanks,
>>>>> Joe Scott
>>>>> Clemson University
>>>>> _______________________________________________
>>>>> codes-ross-users mailing list
>>>>> codes-ross-users at lists.mcs.anl.gov
>>>>> https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users


