[Swift-devel] Approaches to running Swift off-head-node

Michael Wilde wilde at mcs.anl.gov
Fri Apr 29 08:52:50 CDT 2011


Ketan, for Beagle use, but also for general cluster use, can you draw a few diagrams to show the alternatives, and consider how to implement them?

1. swift cmd on head node
  a) cluster scheduler provider (PBS, SGE, etc)
  b) coaster provider over cluster scheduler provider

2. swift cmd on external host
  a) submits jobs via GRAM
     i) to cluster scheduler
     ii) to coasters over cluster scheduler
  b) submits jobs via SSH
  c) submits jobs to factory-managed coaster workers 

3. swift cmd on compute node
  a) submits jobs as if on head node
  b) submits jobs to factory-managed coaster workers 


In some of the configurations above, there are additional variants based on where the coaster service runs and how it's started.

Note that the desire to keep resource-intensive processes off the login nodes applies to all clusters, not just Beagle. (The swift command and even the coaster service can be resource-intensive when running highly parallel scripts with high task rates.)
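
To make the alternatives easier to compare, the list above can be written down as a small table and filtered. The following is only a rough Python sketch: the Config class, its field names, and the off_head_node helper are made up for illustration and are not part of Swift. It only tracks where the swift command runs, not where the coaster service runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One alternative from the list above (hypothetical structure)."""
    label: str       # numbering used in the list above
    swift_host: str  # where the swift command runs
    submission: str  # how jobs are submitted


# Transcribed from the list above; nothing here is read from Swift itself.
CONFIGS = [
    Config("1a", "head node", "cluster scheduler provider"),
    Config("1b", "head node", "coaster provider over cluster scheduler provider"),
    Config("2a-i", "external host", "GRAM to cluster scheduler"),
    Config("2a-ii", "external host", "GRAM to coasters over cluster scheduler"),
    Config("2b", "external host", "SSH"),
    Config("2c", "external host", "factory-managed coaster workers"),
    Config("3a", "compute node", "as if on head node"),
    Config("3b", "compute node", "factory-managed coaster workers"),
]


def off_head_node(configs):
    """Return the alternatives where the swift command is not on the head node."""
    return [c for c in configs if c.swift_host != "head node"]


if __name__ == "__main__":
    for c in off_head_node(CONFIGS):
        print(f"{c.label:6} swift on {c.swift_host}: {c.submission}")
```

A table like this could also carry a column for where the coaster service runs, once those variants are pinned down.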

We should select a small subset of the possible configs to implement, test, document and support for users.

For Beagle, we started with 1b. (1a is not viable, as it's unable to readily utilize multicore nodes.) The email thread from yesterday was about approach 2b. I'd suggest considering how to do 2c, as it's similar to what Allan has run on OSG and might result in a common "coaster factory" script with common logic for how to control the level of factory worker job submission.

For now, though, it seems best to continue with the "2b" approach (external-ssh-coasters) to see how well it works. It's got the current obstacle of 

- Mike



-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



