[codes-ross-users] CODES LP Design Question

Phil Carns carns at mcs.anl.gov
Wed Apr 8 09:44:42 CDT 2015


I've never tried to modularize steps like this in a single LP, but it is 
a little similar to some cross-LP scenarios I've run into before.  Here 
comes a long-winded example :)

Imagine you have two LPs (that run on the same simulated node), called 
thing1 and thing2.  You want the thing1 procedure to invoke thing2, and 
then for thing1 to continue once thing2 is done.

If I were to do this now I think I would set it up like this. Assume 
that thing1 and thing2 each have their own .c and .h files.

thing2.h has this prototype: void start_thing2(lp, gid, completion_fn_ptr);

This function will construct and send an event to the thing2 LP 
(identified by the gid argument) to kick off whatever it is going to 
do.  This function will be executed within a thing1 LP event handler, 
but the definition of the function is in thing2.c.  The purpose of this 
arrangement is to let thing1 send an event to thing2 without having 
access to thing2's event structure definition.  As long as you keep the 
API stable then thing2.c can do whatever it needs to do in terms of 
refactoring its event structure without perturbing thing1.

The lp argument is the lp pointer for the *calling* LP, and is probably 
needed to allocate a new event.

The completion_fn_ptr is a function pointer to a function defined by the 
caller (thing1 in this case) that sends an event back to the original 
LP.  This is the mechanism for thing2 to tell thing1 it is done so that 
thing1 can continue operation.  Since it is an opaque function pointer, 
thing2 remains oblivious to the event structure definition of thing1 (or 
even what kind of LP thing1 is).  You can make the completion_fn_ptr 
signature anything that you like, so if thing2 has some result to report 
back, then it can be passed as an argument to that function and stuffed 
into the event that heads back to thing1.

Of course you can also add whatever arbitrary arguments you want to 
start_thing2() to pass input parameters it needs to start whatever 
thing2 is doing.  Those would get stuffed into the event destined for 
thing2.
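
To make the arrangement above concrete, here is a rough, untested-against-ROSS 
sketch.  It does not use the real ROSS or CODES API: toy_lp, toy_event, 
send_event(), run_demo(), and the extra "input" argument are all made-up 
stand-ins for a tiny single-process event queue, and the three "files" are 
collapsed into one so it compiles on its own.  The only point is to show 
that thing1 never sees thing2_msg and thing2 never sees thing1_msg.

```c
#include <assert.h>
#include <string.h>

/* ---- toy simulator core (hypothetical stand-in, NOT the ROSS API) ---- */
#define MAX_EVENTS 16
#define PAYLOAD_BYTES 32

typedef struct { int gid; int state; } toy_lp;
typedef struct { int dest_gid; unsigned char payload[PAYLOAD_BYTES]; } toy_event;

static toy_event queue[MAX_EVENTS];
static int q_head, q_tail;

/* Copy an opaque message into the next free event slot. */
static void send_event(int dest_gid, const void *msg, size_t len)
{
    assert(q_tail < MAX_EVENTS && len <= PAYLOAD_BYTES);
    queue[q_tail].dest_gid = dest_gid;
    memcpy(queue[q_tail].payload, msg, len);
    q_tail++;
}

/* ---- thing2.h: the only part of thing2 that callers see ---- */
typedef void (*completion_fn)(toy_lp *sender, int dest_gid, int result);
void start_thing2(toy_lp *caller, int thing2_gid, completion_fn done, int input);

/* ---- thing2.c: event layout is private to thing2 ---- */
typedef struct { int caller_gid; completion_fn done; int input; } thing2_msg;

/* Runs inside the caller's event handler, but is defined by thing2. */
void start_thing2(toy_lp *caller, int thing2_gid, completion_fn done, int input)
{
    thing2_msg m = { caller->gid, done, input };
    send_event(thing2_gid, &m, sizeof m);
}

static void thing2_handler(toy_lp *self, const void *payload)
{
    thing2_msg m;
    memcpy(&m, payload, sizeof m);
    int result = m.input * 2;            /* placeholder for thing2's real work */
    m.done(self, m.caller_gid, result);  /* opaque jump back to the caller */
}

/* ---- thing1.c: event layout is private to thing1 ---- */
typedef struct { int result; } thing1_msg;

/* Completion function: runs in thing2's handler, builds a thing1 event. */
static void thing1_done(toy_lp *sender, int dest_gid, int result)
{
    (void)sender;  /* a real simulator would likely need it to allocate the event */
    thing1_msg m = { result };
    send_event(dest_gid, &m, sizeof m);
}

static void thing1_handler(toy_lp *self, const void *payload)
{
    thing1_msg m;
    memcpy(&m, payload, sizeof m);
    self->state = m.result;  /* thing1 continues with thing2's result */
}

/* ---- driver: thing1 invokes thing2, then the queue is drained ---- */
int run_demo(int input)
{
    toy_lp thing1 = { 0, -1 }, thing2 = { 1, 0 };
    q_head = q_tail = 0;
    start_thing2(&thing1, thing2.gid, thing1_done, input);
    while (q_head < q_tail) {
        toy_event *e = &queue[q_head++];
        if (e->dest_gid == thing1.gid)
            thing1_handler(&thing1, e->payload);
        else
            thing2_handler(&thing2, e->payload);
    }
    return thing1.state;
}
```

Calling run_demo(21) from a main() should leave thing1 holding whatever 
thing2 computed from that input.  A second LP type could call 
start_thing2() with its own completion function and thing2.c would not 
need to change.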

Unfortunately I haven't actually done this so I don't have an example.  
My thinking on how to organize this has evolved over time and I haven't 
gotten around to trying this particular permutation :)  I've done 
something very similar except for the function pointer; I have a few 
examples that have an explicitly defined function that jumps back from thing2 
to thing1.  That works fine, but it is bad news if you want more than 
one type of LP to be able to call thing2.

A related theme here is that I'm a fan (possibly in the minority on 
this) of splitting up different complex procedures that happen on the 
same simulated node into separate LPs.  As long as you have them share 
node resources (NIC, storage, etc.) then the net result is the same in 
terms of the simulation outcome.  The positive is that you don't have 
event handler explosion (big switch statements to handle different 
cases).  The negative is that you have to jump between these LPs by 
sending events with small timestamps.  Once you start doing that, then 
parallel conservative execution is practically out of the question; you 
have to do serial execution or parallel optimistic execution to handle 
those extra small events in a reasonable way.

This might be overkill for your scenario, but I thought I would throw it 
out there.  I wanted to collect my thoughts on this anyway.  In general 
I don't think there is any particular way around this without juggling 
callbacks and such; really the best thing you can hope for right now is 
to minimize entanglement between different LP types in your code.

-Phil

On 04/08/2015 06:47 AM, Ross, Robert B. wrote:
> I think the way to think about this is that the event payload carries the state associated with the operation and gets passed around between LPs as needed to simulate the steps. If you do this right, there shouldn't be local LP state associated with the operation -- everything is in the event payload that is needed.
>
> There might be state at LPs associated with the LP itself, such as what objects exist or what names are in the namespace. Or not, if you are just assuming things exist for instance.
>
> John, Phil, and/or Misbah may need to correct me...of course.
>
> -- Rob
>
>> On Apr 7, 2015, at 4:09 PM, Jenkins, Jonathan P. <jenkins at mcs.anl.gov> wrote:
>>
>> Hi Joe,
>>
>> So if I understand right, the context is you want to simulate some POSIX
>> operations of interest, such as mkdir, and that the client is capable of
>> performing the stat (the lookup) using purely local information (hence the
>> self-event)?
>>
>> In general, it's useful to think of events in these systems as RPC calls,
>> where the event structure you are setting up contains the parameters, and
>> there are corresponding return parameters that the calling LP expects to
>> receive. In this frame of thought, both RPC calls and returns issue a
>> discrete event. In your example, you could have your "mkdir" RPC event
>> handler perform an additional "lookup/stat" RPC to check the path's
>> existence, and upon receiving the "return" event, either make the
>> directory or fail.
>>
>> Unfortunately, event-driven programming more-or-less necessitates
>> callback-heavy code, which can be quite awkward in some contexts. We've
>> talked in the past about ways to at least standardize the way we do this
>> in the context of CODES, but nothing has materialized as of yet.
>>
>> Hope that helps. Something tells me that the assumption about clients
>> being capable of performing metadata operations without outside
>> interaction with e.g. a storage server is not quite right, though I could
>> be misunderstanding.
>>
>> Thanks,
>> John
>>
>>> On 4/7/15, 3:43 PM, "Joe Scott" <tscott2 at g.clemson.edu> wrote:
>>>
>>> Hello All,
>>>
>>> I am having a hard time wrapping my head around the programming paradigm
>>> here, and I wonder if you might offer some guidance on a better way to
>>> use CODES.
>>>
>>> So I am trying to process these higher level tasks (POSIX tasks like
>>> mkdir) by launching the subevents as separate processes.
>>>
>>> The specific case that is tying me in knots is a user issuing a mkdir.
>>> It launches the mkdir event handler, which needs to perform a lookup on
>>> the path of the mkdir.
>>>
>>> So I need to send an event from this client LP to itself to perform the
>>> lookup.  But I also need the lookup, upon completion, to relaunch the
>>> mkdir task.
>>>
>>> Speaking it over with some of my lab mates, they seem to think I am
>>> either overthinking it or trying to use the wrong tool for the job.
>>>
>>> Is this a use case you guys are familiar with?  Can you shed some light on
>>> this situation?
>>>
>>> I feel like there should be a way to do this without getting into
>>> callback/completion function hell.
>>>
>>> Thanks,
>>> Joe Scott
>>> Clemson University
>>> _______________________________________________
>>> codes-ross-users mailing list
>>> codes-ross-users at lists.mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users


