[codes-ross-users] CODES LP Design Question
Phil Carns
carns at mcs.anl.gov
Wed Apr 8 09:44:42 CDT 2015
I've never tried to modularize steps like this in a single LP, but it is
a little similar to some cross-LP scenarios I've run into before. Here
comes a long-winded example :)
Imagine you have two LPs (that run on the same simulated node), called
thing1 and thing2. You want the thing1 procedure to invoke thing2, and
then for thing1 to continue once thing2 is done.
If I were to do this now I think I would set it up like this. Assume
that thing1 and thing2 each have their own .c and .h files.
thing2.h has this prototype: void start_thing2(lp, gid, completion_fn_ptr);
This function will construct and send an event to the thing2 LP
(identified by the gid argument) to kick off whatever it is going to
do. This function will be executed within a thing1 LP event handler,
but the definition of the function is in thing2.c. The purpose of this
arrangement is to let thing1 send an event to thing2 without having
access to thing2's event structure definition. As long as you keep the
API stable then thing2.c can do whatever it needs to do in terms of
refactoring its event structure without perturbing thing1.
The lp argument is the lp pointer for the *calling* LP, and is probably
needed to allocate a new event.
The completion_fn_ptr is a function pointer to a function defined by the
caller (thing1 in this case) that sends an event back to the original
LP. This is the mechanism for thing2 to tell thing1 it is done so that
thing1 can continue operation. Since it is an opaque function pointer,
thing2 remains oblivious to the event structure definition of thing1 (or
even what kind of LP thing1 is). You can make the completion_fn_ptr
signature anything that you like, so if thing2 has some result to report
back, then it can be passed as an argument to that function and stuffed
into the event that heads back to thing1.
Of course you can also add whatever arbitrary arguments you want to
start_thing2() to pass input parameters it needs to start whatever
thing2 is doing. Those would get stuffed into the event destined for
thing2.
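A minimal sketch of the arrangement described above, with several
assumptions: the tiny event engine (lp_t, send_event, run_events) is a
hypothetical stand-in so the example runs on its own, and it is not the
ROSS API. The thing2_done_fn signature, the extra "input" argument, and
the "double the input" work done by thing2 are also made up for
illustration. The three commented sections stand in for thing2.h,
thing2.c and thing1.c.

```c
#include <assert.h>
#include <string.h>

/* ---- toy event engine: hypothetical stand-in, NOT the ROSS API ---- */
enum { MAX_LPS = 2, MAX_EVENTS = 16, PAYLOAD_MAX = 64 };

typedef struct lp lp_t;
typedef void (*handler_fn)(lp_t *self, const void *payload);
struct lp { int gid; handler_fn handler; void *state; };

typedef struct {
    int dest;
    double ts;
    unsigned char payload[PAYLOAD_MAX];
} event_t;

static lp_t lps[MAX_LPS];
static event_t queue[MAX_EVENTS];
static int n_queued;
static double now;

/* Queue an opaque payload for dest_gid at now + offset. */
static void send_event(lp_t *sender, int dest_gid, double offset,
                       const void *payload, size_t len)
{
    (void)sender; /* a real simulator would likely need the sending LP */
    assert(n_queued < MAX_EVENTS && len <= PAYLOAD_MAX);
    event_t *e = &queue[n_queued++];
    e->dest = dest_gid;
    e->ts = now + offset;
    memcpy(e->payload, payload, len);
}

/* Dispatch queued events in timestamp order until none remain. */
static void run_events(void)
{
    while (n_queued > 0) {
        int min = 0;
        for (int i = 1; i < n_queued; i++)
            if (queue[i].ts < queue[min].ts)
                min = i;
        event_t e = queue[min];        /* copy out before dispatching */
        queue[min] = queue[--n_queued];
        now = e.ts;
        lps[e.dest].handler(&lps[e.dest], e.payload);
    }
}

/* ---- thing2.h: the only part of thing2 that thing1 sees ---- */
typedef void (*thing2_done_fn)(lp_t *sender, int caller_gid, int result);
void start_thing2(lp_t *caller, int thing2_gid, int input,
                  thing2_done_fn done);

/* ---- thing2.c: event layout is private to this file ---- */
typedef struct { int caller_gid; int input; thing2_done_fn done; } thing2_msg;

void start_thing2(lp_t *caller, int thing2_gid, int input,
                  thing2_done_fn done)
{
    thing2_msg m = { caller->gid, input, done };
    send_event(caller, thing2_gid, 1e-9, &m, sizeof m); /* small timestamp */
}

static void thing2_handler(lp_t *self, const void *payload)
{
    thing2_msg m;
    memcpy(&m, payload, sizeof m);
    /* Made-up work: double the input, then report via the callback. */
    m.done(self, m.caller_gid, m.input * 2);
}

/* ---- thing1.c: owns its event layout and the completion function ---- */
typedef struct { int result; } thing1_msg;
typedef struct { int got_result; int result; } thing1_state;

/* Runs inside thing2's handler, but builds a thing1 event. */
static void thing1_on_thing2_done(lp_t *sender, int caller_gid, int result)
{
    thing1_msg m = { result };
    send_event(sender, caller_gid, 1e-9, &m, sizeof m);
}

static void thing1_handler(lp_t *self, const void *payload)
{
    thing1_msg m;
    memcpy(&m, payload, sizeof m);
    thing1_state *s = self->state;
    s->result = m.result;   /* thing1 would continue its procedure here */
    s->got_result = 1;
}

/* Driver: thing1 (gid 0) invokes thing2 (gid 1), then waits for the reply. */
int run_demo(int input)
{
    thing1_state s = { 0, 0 };
    n_queued = 0;
    now = 0.0;
    lps[0] = (lp_t){ 0, thing1_handler, &s };
    lps[1] = (lp_t){ 1, thing2_handler, NULL };
    start_thing2(&lps[0], 1, input, thing1_on_thing2_done);
    run_events();
    assert(s.got_result);
    return s.result;
}
```

Calling run_demo(21) sends one event to thing2 and one back to thing1.
Note that thing1 never touches thing2_msg and thing2 never touches
thing1_msg; each side only sees the other's function signature.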
Unfortunately I haven't actually done this so I don't have an example.
My thinking on how to organize this has evolved over time and I haven't
gotten around to trying this particular permutation :) I've done
something very similar except for the function pointer; I have a few
examples that have an explicitly defined function that jumps back from
thing2 to thing1. That works fine, but it is bad news if you want more
than one type of LP to be able to call thing2.
A related theme here is that I'm a fan (possibly in the minority on
this) of splitting up different complex procedures that happen on the
same simulated node into separate LPs. As long as you have them share
node resources (NIC, storage, etc.) then the net result is the same in
terms of the simulation outcome. The positive is that you don't have
event handler explosion (big switch statements to handle different
cases). The negative is that you have to jump between these LPs by
sending events with small timestamps. Once you start doing that, then
parallel conservative execution is practically out of the question; you
have to do serial execution or parallel optimistic execution to handle
those extra small events in a reasonable way.
This might be overkill for your scenario, but I thought I would throw it
out there. I wanted to collect my thoughts on this anyway. In general
I don't think there is any particular way around this without juggling
callbacks and such; really the best thing you can hope for right now is
to minimize entanglement between different LP types in your code.
-Phil
On 04/08/2015 06:47 AM, Ross, Robert B. wrote:
> I think the way to think about this is that the event payload carries the state associated with the operation and gets passed around between LPs as needed to simulate the steps. If you do this right, there shouldn't be local LP state associated with the operation -- everything is in the event payload that is needed.
>
> There might be state at LPs associated with the LP itself, such as what objects exist or what names are in the namespace. Or not, if you are just assuming things exist for instance.
>
> John, Phil, and/or Misbah may need to correct me...of course.
>
> -- Rob
>
>> On Apr 7, 2015, at 4:09 PM, Jenkins, Jonathan P. <jenkins at mcs.anl.gov> wrote:
>>
>> Hi Joe,
>>
>> So if I understand right, the context is you want to simulate some POSIX
>> operations of interest, such as mkdir, and that the client is capable of
>> performing the stat (the lookup) using purely local information (hence the
>> self-event)?
>>
>> In general, it's useful to think of events in these systems as RPC calls,
>> where the event structure you are setting up contains the parameters, and
>> there are corresponding return parameters that the calling LP expects to
>> receive. In this frame of thought, both RPC calls and returns issue a
>> discrete event. In your example, you could have your "mkdir" RPC event
>> handler perform an additional "lookup/stat" RPC to check the path's
>> existence, and upon receiving the "return" event, either make the
>> directory or fail.
>>
>> Unfortunately, event-driven programming more-or-less necessitates
>> callback-heavy code, which can be quite awkward in some contexts. We've
>> talked in the past about ways to at least standardize the way we do this
>> in the context of CODES, but nothing has materialized as of yet.
>>
>> Hope that helps. Something tells me that the assumption about clients
>> being capable of performing metadata operations without outside
>> interaction with e.g. a storage server is not quite right, though I could
>> be misunderstanding.
>>
>> Thanks,
>> John
>>
>>> On 4/7/15, 3:43 PM, "Joe Scott" <tscott2 at g.clemson.edu> wrote:
>>>
>>> Hello All,
>>>
>>> I am having a hard time wrapping my head around the programming paradigm
>>> here, and I wonder if you might offer some guidance on a better way to
>>> use CODES.
>>>
>>> So I am trying to process these higher level tasks (POSIX tasks like
>>> mkdir) by launching the subevents as separate processes.
>>>
>>> The specific case that is tying me in knots is a user issuing a mkdir.
>>> It launches the mkdir event handler, which needs to perform a lookup on
>>> the path of the mkdir.
>>>
>>> So I need to send an event from this client LP to itself to perform the
>>> lookup. But I also need the lookup, upon completion, to relaunch the
>>> mkdir task.
>>>
>>> Talking it over with some of my lab mates, they seem to think I am
>>> either overthinking it or trying to use the wrong tool for the job.
>>>
>>> Is this a use case you guys are familiar with? Can you shed some light on
>>> this situation?
>>>
>>> I feel like there should be a way to do this without getting into
>>> callback/completion function hell.
>>>
>>> Thanks,
>>> Joe Scott
>>> Clemson University
>>> _______________________________________________
>>> codes-ross-users mailing list
>>> codes-ross-users at lists.mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users