[codes-ross-users] CODES LP Design Question

Thu May 14 15:11:38 CDT 2015

I will take a look at the resource-lp stuff and see if I can make sense of it!  Thanks for the pointer.

Joe

> On May 14, 2015, at 3:43 PM, Jenkins, Jonathan P. <jenkins at mcs.anl.gov> wrote:
> 
> The callback functionality you describe has been used in a few places.
> Notably, the "resource" LP in codes-base implements such a method (see
> codes/resource-lp.h, resource_lp_* functions). It's not the prettiest, but
> it gets the job done. Generalizing the concept and making it a bit easier
> to work with in codes has been in the back of my mind for a while, but I
> haven't done anything with it.
> 
> John
> 
> On 5/14/15, 1:51 PM, "Phil Carns" <carns at mcs.anl.gov> wrote:
> 
>> What I described in this thread turned out to be a bad idea.
>> 
>> The problem with handing off a function pointer is that the target LP
>> (thing2 in the example below) might not be on the same process as the
>> caller (thing1), so in the general case it isn't safe to let it hang
>> onto a function pointer to issue the completion event.
>> 
>> The lsm/local-storage-model API has this same issue (higher-level
>> models want to hand control off to a disk model and then just
>> proceed with their own work once the disk access is complete).  LSM
>> solves this safely from a technical point of view by providing functions
>> for the caller to allocate an appropriate event, and letting the caller
>> specify its own event struct that will be handed back on completion.
>> This is safe regardless of how your LPs are organized, but it still bugs
>> me a little because a) it looks burly in the code and b) the
>> target LP doesn't have a way to pass results back in the completion event.
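
A minimal sketch of the caller-supplied return event idea described above, assuming a toy setup with no simulator underneath. This is not the actual LSM API: disk_request, disk_request_init, disk_complete, and app_event are hypothetical names chosen for illustration. The point is that the request carries only plain data (a destination id and an opaque copy of the caller's event), so nothing in it depends on the caller's address space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RETURN_MSG_MAX 64

/* Hypothetical private request event of a disk model. It stores the
 * caller's completion event as opaque bytes, never as a pointer. */
struct disk_request {
    unsigned long caller_gid;                  /* LP to notify on completion */
    size_t return_size;                        /* bytes used in return_msg */
    unsigned char return_msg[RETURN_MSG_MAX];  /* caller-defined event, copied */
};

/* Hypothetical caller-side event; the disk model never sees this layout. */
struct app_event {
    int type;
    int request_tag;
};
enum { APP_DISK_DONE = 7 };

/* Build a request that embeds a copy of the caller's completion event. */
static struct disk_request disk_request_init(unsigned long caller_gid,
                                             const void *return_msg,
                                             size_t return_size)
{
    struct disk_request req = { .caller_gid = caller_gid,
                                .return_size = return_size };
    assert(return_size <= RETURN_MSG_MAX);
    memcpy(req.return_msg, return_msg, return_size);
    return req;
}

/* On completion the disk model hands the stored bytes back unchanged.
 * Returns the number of bytes written to out and sets *dest_gid. */
static size_t disk_complete(const struct disk_request *req,
                            unsigned long *dest_gid,
                            void *out, size_t out_cap)
{
    assert(req->return_size <= out_cap);
    *dest_gid = req->caller_gid;
    memcpy(out, req->return_msg, req->return_size);
    return req->return_size;
}
```

Because disk_complete returns the bytes untouched, the sketch also shows drawback b): the disk model has no place to put a result unless the two sides agree on some part of the layout.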
>> 
>> Anyway, I'll putter around with this and share if I come up with
>> something cleaner, but I wanted to at least respond to this thread as a
>> warning in the meantime :)
>> 
>> thanks,
>> -Phil
>> 
>> On 04/08/2015 10:44 AM, Carns, Philip H. wrote:
>>> I've never tried to modularize steps like this in a single LP, but it is
>>> a little similar to some cross-LP scenarios I've run into before.  Here
>>> comes a long-winded example :)
>>> 
>>> Imagine you have two LPs (that run on the same simulated node), called
>>> thing1 and thing2.  You want the thing1 procedure to invoke thing2, and
>>> then for thing1 to continue once thing2 is done.
>>> 
>>> If I were to do this now I think I would set it up like this. Assume
>>> that thing1 and thing2 each have their own .c and .h files.
>>> 
>>> thing2.h has this prototype: void start_thing2(lp, gid,
>>> completion_fn_ptr);
>>> 
>>> This function will construct and send an event to the thing2 LP
>>> (identified by the gid argument) to kick off whatever it is going to
>>> do.  This function will be executed within a thing1 LP event handler,
>>> but the definition of the function is in thing2.c.  The purpose of this
>>> arrangement is to let thing1 send an event to thing2 without having
>>> access to thing2's event structure definition.  As long as you keep the
>>> API stable then thing2.c can do whatever it needs to do in terms of
>>> refactoring its event structure without perturbing thing1.
>>> 
>>> The lp argument is the lp pointer for the *calling* LP, and is probably
>>> needed to allocate a new event.
>>> 
>>> The completion_fn_ptr is a function pointer to a function defined by the
>>> caller (thing1 in this case) that sends an event back to the original
>>> LP.  This is the mechanism for thing2 to tell thing1 it is done so that
>>> thing1 can continue operation.  Since it is an opaque function pointer,
>>> thing2 remains oblivious to the event structure definition of thing1 (or
>>> even what kind of LP thing1 is).  You can make the completion_fn_ptr
>>> signature anything that you like, so if thing2 has some result to report
>>> back, then it can be passed as an argument to that function and stuffed
>>> into the event that heads back to thing1.
>>> 
>>> Of course you can also add whatever arbitrary arguments you want to
>>> start_thing2() to pass input parameters it needs to start whatever
>>> thing2 is doing.  Those would get stuffed into the event destined for
>>> thing2.
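
A rough single-process sketch of the start_thing2() arrangement described above. Everything besides the name start_thing2 is an assumption made for illustration: the lp_id type, the extra input argument, the pending array standing in for the event queue, and thing2_process_all standing in for thing2's event handler. As noted elsewhere in this thread, holding a function pointer like this is only safe when both LPs live in the same process.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long lp_id;   /* stand-in for a global LP id */

/* Completion callback defined by the caller; thing2 treats it as opaque. */
typedef void (*completion_fn)(lp_id caller, int result);

/* thing2's private event layout, visible only inside "thing2.c". */
struct thing2_event {
    lp_id caller;
    int input;
    completion_fn done;
};

#define THING2_MAX_PENDING 8
static struct thing2_event pending[THING2_MAX_PENDING];  /* toy event queue */
static size_t n_pending;

/* The "thing2.h" entry point: builds and queues an event for thing2. */
void start_thing2(lp_id caller, lp_id thing2_gid, int input, completion_fn done)
{
    (void)thing2_gid;  /* this toy has a single thing2 instance */
    assert(n_pending < THING2_MAX_PENDING);
    pending[n_pending++] = (struct thing2_event){ caller, input, done };
}

/* Stand-in for thing2's event handler: do the work, then call back. */
void thing2_process_all(void)
{
    for (size_t i = 0; i < n_pending; i++)
        pending[i].done(pending[i].caller, pending[i].input * 2);
    n_pending = 0;
}

/* thing1 side. A real callback would construct and send a thing1 event;
 * here it only records the result. */
static int thing1_last_result = -1;

static void thing1_on_thing2_done(lp_id caller, int result)
{
    (void)caller;
    thing1_last_result = result;
}
```

A caller would invoke start_thing2(1, 2, 21, thing1_on_thing2_done) from a thing1 handler, and thing1_last_result is updated only once thing2_process_all() runs.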
>>> 
>>> Unfortunately I haven't actually done this so I don't have an example.
>>> My thinking on how to organize this has evolved over time and I haven't
>>> gotten around to trying this particular permutation :)  I've done
>>> something very similar except for the function pointer; I have a few
>>> examples that have an explicitly defined function that jumps back from
>>> thing2 to thing1.  That works fine, but it is bad news if you want more
>>> than one type of LP to be able to call thing2.
>>> 
>>> A related theme here is that I'm a fan (possibly in the minority on
>>> this) of splitting up different complex procedures that happen on the
>>> same simulated node into separate LPs.  As long as you have them share
>>> node resources (NIC, storage, etc.) then the net result is the same in
>>> terms of the simulation outcome.  The positive is that you don't have
>>> event handler explosion (big switch statements to handle different
>>> cases).  The negative is that you have to jump between these LPs by
>>> sending events with small timestamps.  Once you start doing that, then
>>> parallel conservative execution is practically out of the question; you
>>> have to do serial execution or parallel optimistic execution to handle
>>> those extra small events in a reasonable way.
>>> 
>>> This might be overkill for your scenario, but I thought I would throw it
>>> out there.  I wanted to collect my thoughts on this anyway.  In general
>>> I don't think there is any particular way around this without juggling
>>> callbacks and such; really the best thing you can hope for right now is
>>> to minimize entanglement between different LP types in your code.
>>> 
>>> -Phil
>>> 
>>> On 04/08/2015 06:47 AM, Ross, Robert B. wrote:
>>>> I think the way to think about this is that the event payload carries
>>>> the state associated with the operation and gets passed around between
>>>> LPs as needed to simulate the steps. If you do this right, there
>>>> shouldn't be local LP state associated with the operation -- everything
>>>> is in the event payload that is needed.
>>>> 
>>>> There might be state at LPs associated with the LP itself, such as
>>>> what objects exist or what names are in the namespace. Or not, if you
>>>> are just assuming things exist for instance.
>>>> 
>>>> John, Phil, and/or Misbah may need to correct me...of course.
>>>> 
>>>> -- Rob
>>>> 
>>>>> On Apr 7, 2015, at 4:09 PM, Jenkins, Jonathan P.
>>>>> <jenkins at mcs.anl.gov> wrote:
>>>>> 
>>>>> Hi Joe,
>>>>> 
>>>>> So if I understand right, the context is you want to simulate some
>>>>> POSIX operations of interest, such as mkdir, and that the client is
>>>>> capable of performing the stat (the lookup) using purely local
>>>>> information (hence the self-event)?
>>>>> 
>>>>> In general, it's useful to think of events in these systems as RPC
>>>>> calls, where the event structure you are setting up contains the
>>>>> parameters, and there are corresponding return parameters that the
>>>>> calling LP expects to receive. In this frame of thought, both RPC
>>>>> calls and returns issue a discrete event. In your example, you could
>>>>> have your "mkdir" RPC event handler perform an additional
>>>>> "lookup/stat" RPC to check the path's existence, and upon receiving
>>>>> the "return" event, either make the directory or fail.
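
The mkdir/lookup exchange described above could be sketched roughly as follows. This is a toy, self-contained event loop and does not use any ROSS or CODES calls; the event names, struct event, struct lp, and the helper functions are all hypothetical. The path travels in the event payload, so the handler needs no per-operation state in the LP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event types for a single client LP. */
enum ev_type { EV_MKDIR, EV_LOOKUP, EV_LOOKUP_RETURN };

struct event {
    enum ev_type type;
    char path[64];   /* RPC parameter, carried through every step */
    bool found;      /* return parameter of the lookup */
};

#define QUEUE_MAX 16
struct lp {
    struct event queue[QUEUE_MAX];  /* toy FIFO in place of the event list */
    size_t head, tail;
    char existing[64];              /* the one path that already exists */
    int mkdir_ok, mkdir_failed;     /* outcome counters */
};

static void send_event(struct lp *lp, const struct event *ev)
{
    assert(lp->tail < QUEUE_MAX);
    lp->queue[lp->tail++] = *ev;
}

static void handle_event(struct lp *lp, const struct event *ev)
{
    struct event out = *ev;  /* copy the payload forward to the next step */
    switch (ev->type) {
    case EV_MKDIR:           /* "call" the lookup RPC with a self-event */
        out.type = EV_LOOKUP;
        send_event(lp, &out);
        break;
    case EV_LOOKUP:          /* perform the lookup, then "return" */
        out.type = EV_LOOKUP_RETURN;
        out.found = strcmp(ev->path, lp->existing) == 0;
        send_event(lp, &out);
        break;
    case EV_LOOKUP_RETURN:   /* resume mkdir: fail if the path exists */
        if (ev->found)
            lp->mkdir_failed++;
        else
            lp->mkdir_ok++;
        break;
    }
}

static void issue_mkdir(struct lp *lp, const char *path)
{
    struct event ev = { .type = EV_MKDIR, .found = false };
    strncpy(ev.path, path, sizeof(ev.path) - 1);  /* rest of path is zeroed */
    send_event(lp, &ev);
}

static void run(struct lp *lp)
{
    while (lp->head < lp->tail) {
        struct event ev = lp->queue[lp->head++];
        handle_event(lp, &ev);
    }
}
```

Each mkdir turns into three events (call, lookup, return), which is the callback-style flow the RPC framing implies.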
>>>>> 
>>>>> Unfortunately, event-driven programming more-or-less necessitates
>>>>> callback-heavy code, which can be quite awkward in some contexts.
>>>>> We've talked in the past about ways to at least standardize the way
>>>>> we do this in the context of CODES, but nothing has materialized as
>>>>> of yet.
>>>>> 
>>>>> Hope that helps. Something tells me that the assumption about clients
>>>>> being capable of performing metadata operations without outside
>>>>> interaction with e.g. a storage server is not quite right, though I
>>>>> could be misunderstanding.
>>>>> 
>>>>> Thanks,
>>>>> John
>>>>> 
>>>>>> On 4/7/15, 3:43 PM, "Joe Scott" <tscott2 at g.clemson.edu> wrote:
>>>>>> 
>>>>>> Hello All,
>>>>>> 
>>>>>> I am having a hard time wrapping my head around the programming
>>>>>> paradigm here, and I wonder if you might offer some guidance on a
>>>>>> better way to use CODES.
>>>>>> 
>>>>>> So I am trying to process these higher level tasks (POSIX tasks like
>>>>>> mkdir) by launching the subevents as separate processes.
>>>>>> 
>>>>>> The specific case that is tying me in knots is a user issuing a
>>>>>> mkdir.  It launches the mkdir event handler, which needs to perform
>>>>>> a lookup on the path of the mkdir.
>>>>>> 
>>>>>> So I need to send an event from this client LP to itself to perform
>>>>>> the lookup.  But I also need the lookup, upon completion, to
>>>>>> relaunch the mkdir task.
>>>>>> 
>>>>>> Talking it over with some of my lab mates, they seem to think I am
>>>>>> either overthinking it or trying to use the wrong tool for the job.
>>>>>> 
>>>>>> Is this a use case you guys are familiar with?  Can you shed some
>>>>>> light on this situation?
>>>>>> 
>>>>>> I feel like there should be a way to do this without getting into
>>>>>> callback/completion function hell.
>>>>>> 
>>>>>> Thanks,
>>>>>> Joe Scott
>>>>>> Clemson University
>>>>>> _______________________________________________
>>>>>> codes-ross-users mailing list
>>>>>> codes-ross-users at lists.mcs.anl.gov
>>>>>> https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users