[Swift-devel] Fixing and testing Swift on local and common clusters

wilde at mcs.anl.gov
Tue Jun 1 09:59:22 CDT 2010


Hi David, Jon, Arjun, Dennis,

This week I'd like us all to coordinate on trying, fixing, documenting, and testing the behavior of Swift on several clusters used by many local users:

TeraPort, PADS, Fusion, Abe, QueenBee, Ranger, Godzilla, and SisBoomBah

I'll explain more later in the day and week, but I wanted to give you a heads-up on this. We'll do this in a way that ties in to each of your different project focus areas.

The first 5 of these clusters use PBS; the last 3 use SGE.

On the PBS clusters, the issue is that they vary in scheduling policy (allocating resources in cores vs nodes). On the latter 3, our SGE driver is fairly new and still has issues related to its interpretation of cores per node and of the local scheduler configuration.

The first exercise in this regard is to focus on the first 3 clusters on the list above (TeraPort, PADS, and Fusion) and to observe Swift's behavior both with and without coasters, using both the stable branch and a locally modified version that I will point you to.

David, how well are the exercises oriented to cluster usage at this point? Let's discuss what the new-user roadmap should be for trying Swift on a cluster, and how to provide the needed environment-specific info for CI and Argonne users vs the broader user community.

- Mike


