[Swift-devel] Issues in running 0.94-latest on Tukey and Vesta

Michael Wilde wilde at mcs.anl.gov
Thu Aug 1 08:22:34 CDT 2013


Hi Mihael,

We're encountering several issues in this environment, and I'm hoping you can jump in and fix or patch things so they work well enough for initial use.

Tukey is a rather vanilla Intel x86_64 cluster running Cobalt. It shares a GridFTP server with Mira and Cetus, the production BG/Q systems.

Vesta is a 2-rack development BG/Q. It has a PPC head node, IBM Java 6, and no GridFTP server. It runs the Cobalt scheduler with a unique BG/Q-specific "subjob" capability, which we are trying to exploit.

Both systems have cryptocard-only ssh access.

Here are the problems so far:

- Swift on Tukey can't access the Tukey GridFTP server. (I still need to try other places.) I sent you the error yesterday.

- Swift on Tukey can't run jobs on Vesta through automated, tunneled coasters (using the methods that work well on Orthros). It hits authentication problems, due to what look like incompatibilities with IBM Java.

- Swift on Tukey can run jobs to a persistent automatic coaster server on Vesta, but not reliably. Successive runs alternate (works/fails/works/fails), then fail hard. I think this was achieved with Oracle Java 1.7 on Tukey and IBM Java on Vesta.

This message is mainly a heads-up. I'll create tickets for each of these to associate logs and messages with each problem, but I need more hands to keep all this moving forward.

- Mike


