[Mochi-devel] An overview of application usage scenarios
Dorier, Matthieu
mdorier at anl.gov
Fri Jun 28 06:44:25 CDT 2019
Hi Srinivasan,
Glad that the tutorials were useful! Regarding your questions:
1.a: HEPnOS is intended to be deployed separately from the application. The physicists we are working with deploy it in a set of containers. We envision leaving it running for the duration of an experimental campaign (which may be weeks to months). FlameStore is meant to act as a cache for a single workflow and lives only for the duration of that workflow. As for SDSKV, I'm not too familiar with that one, but I think it's supposed to be deployed for the duration of an application as well.
1.b: HEPnOS is not part of the application (though it could be deployed as part of it). FlameStore is part of the application (same MPI_COMM_WORLD). For SDSKV I don't know.
1.c: Yes, HEPnOS is intended to be long-running.
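To make the answers to 1.a through 1.c easier to compare, here is a minimal Python sketch that simply records the deployment models described above. The class, field, and function names are hypothetical and not part of any Mochi API, and the SDSKV entry is marked as unconfirmed because I am not sure about it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record of how a service is deployed (illustration only)."""
    service: str
    lifetime: str                   # how long the service is expected to run
    in_application: Optional[bool]  # None means "not known"


DEPLOYMENTS = [
    # HEPnOS runs separately by default, though it could be embedded.
    Deployment("HEPnOS", "experimental campaign (weeks to months)", False),
    # FlameStore shares MPI_COMM_WORLD with the application.
    Deployment("FlameStore", "single workflow", True),
    # Unconfirmed: lifetime is a guess, embedding is unknown.
    Deployment("SDSKV", "application (unconfirmed)", None),
]


def shares_application(name: str) -> Optional[bool]:
    """Return True/False if known, None if unknown, for the named service."""
    for d in DEPLOYMENTS:
        if d.service == name:
            return d.in_application
    raise KeyError(name)
```

For example, shares_application("FlameStore") returns True, while shares_application("SDSKV") returns None to signal that the answer is not known.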
2: I wish we had such a code; our physicist colleagues are working on it right now.
Thanks,
Matthieu
On 27/06/2019, 17:31, "mochi-devel on behalf of Srinivasan Ramesh via mochi-devel" <mochi-devel-bounces at lists.mcs.anl.gov on behalf of mochi-devel at lists.mcs.anl.gov> wrote:
Hi team,
@Matthieu: Thanks for the wiki tutorial for Mochi. I found it extremely
useful for my understanding and tried out the hands-on tutorials.
I re-read the PDSW "Methodology for rapid development..." paper and
installed HEPnOS locally on my laptop. A few questions come to mind:
1. For each of the popular data-services mentioned in the paper
(FlameStore, HEPnOS, SDSKV), what is the model of usage/topology?
Specifically:
a. Are these services part of a workflow? Meaning, a node allocation
is managed, and the services are long-running for the duration of the
workflow. Jobs within the workflow come and go, and use the service
during their execution.
b. Are these services part of the application itself? Meaning a
"regular" MPI job where the service is built into each MPI process and
loaded as a library local to the process.
c. Is it possible that certain services are long-running on the
system "forever"? (I guess this reduces to (a).)
The methodology paper hints at the topology but doesn't really provide a
concrete description. With regard to performance measurement,
I am fully aware that data-services can span the entire range of
possibilities. However, I think it may not be a bad idea to start with
specific scenarios in mind and then go from there onto more general
cases once we have a grasp on the problem.
2. Can I get access to a high-energy physics code that actually uses the
HEPnOS service? Can I run this setup on my laptop?
3. @Phil: I remember you mentioning that you had a branch where you had
developed a Dapper-like request tracing infrastructure. Could you kindly
point me to this?
Regards,
--
Srinivasan Ramesh
_______________________________________________
mochi-devel mailing list
mochi-devel at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel
https://www.mcs.anl.gov/research/projects/mochi