[Mochi-devel] An overview of application usage scenarios

Fri Jun 28 06:44:25 CDT 2019

Hi Srinivasan,

Glad that the tutorials were useful! Regarding your questions:

1.a: HEPnOS is intended to be deployed separately from the application. The physicists we are working with deploy it in a set of containers. We envision leaving it running for the duration of an experimental campaign (which may be weeks to months). FlameStore is supposed to act like a cache for a single workflow. It lives for the duration of the workflow. As for SDSDKV, I'm not too familiar with that one but I think it's supposed to be deployed for the duration of an application as well.

1.b: HEPnOS is not part of the application (though it could be deployed as part of it). FlameStore is part of the application (same MPI_COMM_WORLD). For SDSDKV I don't know.
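
To make the "same MPI_COMM_WORLD" case concrete, here is a minimal sketch of how ranks in one job could be divided between service and application roles. It is plain Python that only mimics the grouping an MPI_Comm_split call would produce. The function names, the color values, and the "first N ranks are servers" policy are all assumptions for illustration, not how FlameStore actually does it.

```python
# Hypothetical sketch: the names and the split policy are illustrative
# assumptions, not taken from FlameStore or any Mochi component.
from collections import defaultdict

SERVICE, APPLICATION = 0, 1  # assumed color values


def assign_roles(world_size, num_servers):
    """Return one color per world rank: the first num_servers ranks serve."""
    if not 0 < num_servers < world_size:
        raise ValueError("need at least one service rank and one application rank")
    return [SERVICE if rank < num_servers else APPLICATION
            for rank in range(world_size)]


def split(colors):
    """Group world ranks by color, mimicking the result of MPI_Comm_split."""
    groups = defaultdict(list)
    for rank, color in enumerate(colors):
        groups[color].append(rank)
    return dict(groups)


colors = assign_roles(world_size=8, num_servers=2)
groups = split(colors)
print("service ranks:", groups[SERVICE])
print("application ranks:", groups[APPLICATION])
```

In the separately deployed case (HEPnOS), there is no such split: the service runs as its own job and the application connects to it as a client.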

1.c: Yes, HEPnOS is intended to be long-running.

2: I wish we had such a code; our physicist colleagues are working on it right now.

Thanks,

Matthieu

On 27/06/2019, 17:31, "mochi-devel on behalf of Srinivasan Ramesh via mochi-devel" <mochi-devel-bounces at lists.mcs.anl.gov on behalf of mochi-devel at lists.mcs.anl.gov> wrote:

    Hi team,

    @Matthieu: Thanks for the wiki tutorial for Mochi. I found it extremely
    useful for my understanding and tried out the hands-on tutorials.

    I re-read the PDSW "Methodology for rapid development..." paper and 
    installed HEPnOS locally on my laptop. A few questions come to mind:

    1. For each of the popular data-services mentioned in the paper 
    (FlameStore, HEPnOS, SDSDKV), what is the model of usage/topology?
    Specifically:
        a. Are these services part of a workflow? Meaning, a node allocation 
    is managed, and the services are long-running for the duration of the 
    workflow. Jobs within the workflow come and go, and use the service 
    during their execution.
        b. Are these services part of the application itself? Meaning a 
    "regular" MPI job where the service is built into each MPI process and 
    loaded as a library local to the process.
        c. Is it possible that certain services are long-running on the 
    system "forever" (reduces to (a) I guess?)

    The methodology paper hints at the topology but doesn't really provide a 
    concrete description. With regard to performance measurement,
    I am fully aware that data-services can span the entire range of 
    possibilities. However, I think it may not be a bad idea to start with 
    specific scenarios in mind and then go from there onto more general 
    cases once we have a grasp on the problem.

    2. Can I get access to a high-energy physics code that actually uses the 
    HEPnOS service? Can I run this setup on my laptop?

    3. @Phil: I remember you mentioning that you had a branch where you had 
    developed a Dapper-like request tracing infrastructure. Could you kindly
    point me to this?

    Regards,
    -- 
    Srinivasan Ramesh
    _______________________________________________
    mochi-devel mailing list
    mochi-devel at lists.mcs.anl.gov
    https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel
    https://www.mcs.anl.gov/research/projects/mochi