[Swift-devel] LQCD meeting at Fermi
Mihael Hategan
hategan at mcs.anl.gov
Sat Jun 2 03:33:31 CDT 2007
On Fri, 2007-06-01 at 16:59 -0500, Veronika Nefedova wrote:
> Hi,
>
> Yong and I met with Xian-He and his team today to talk over
> their current problems with the production Swift code.
>
> Some of the major issues we talked about:
>
> - Separation of concerns: SwiftScript could be made to just describe the
> abstract interfaces and data flows, and the app blocks could be pushed
> into some separate specifications (in a repository or something), in
> which other scripting languages can be used (e.g. Python) to specify how
> to invoke an actual application.
How's that different from application wrappers?
>
> - Dealing with absolute paths:
> LQCD uses dCache, which requires copying to/from some absolute path.
This, I think, is the same as the ability to have non-local input and
output files.
>
> - Run clean up jobs outside pbs (i.e. using the fork manager instead)
We've discussed this before, and there are two choices:
1. Use the file provider. This may be inefficient because most of them,
in particular GridFTP, don't have a recursive delete. The local one,
which they are using, does. This may imply another configuration option.
2. Make sure there's always a fork job manager there and use that. This
means that the local PBS provider needs to become a job manager to the
local provider rather than a stand-alone provider.
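To illustrate why option 1 can be slow: a provider that only offers
single-entry operations has to emulate a recursive delete client-side, with
one call per file and per directory. Below is a minimal Python sketch that
uses the local filesystem as a stand-in for such a provider. The helper name
`recursive_delete` is hypothetical and is not part of Swift or of any
provider API.

```python
import os
import tempfile


def recursive_delete(root):
    """Delete root and everything under it using only single-entry calls.

    Returns the number of individual delete operations issued, which is
    the cost a remote provider without recursive delete would pay.
    """
    calls = 0
    # Walk bottom-up so every directory is empty by the time it is removed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.remove(os.path.join(dirpath, name))
            calls += 1
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # Symlinked directories are not walked; remove the link itself.
            if os.path.islink(path):
                os.remove(path)
                calls += 1
        os.rmdir(dirpath)
        calls += 1
    return calls


if __name__ == "__main__":
    # Hypothetical scratch directory standing in for a job work directory.
    scratch = tempfile.mkdtemp()
    os.makedirs(os.path.join(scratch, "run", "out"))
    open(os.path.join(scratch, "run", "out", "result.dat"), "w").close()
    print(recursive_delete(scratch), "delete calls issued")
```

Every entry costs one round trip, so a work directory with thousands of
files means thousands of remote operations, whereas a fork job can remove
the whole tree in a single command.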
>
> - parameter problem: need to override things in tc.data, sites.xml, like
> number of nodes for MPI jobs
> possible solution: put profile specification back in. (but we do not
> have derivations, in which we were able to put some profiles).
Can you explain that? VDS != Swift. And we shouldn't talk about Swift
having some literal thing from VDS, but rather the bit that achieves
similar functionality.
> template-based sites.xml and tc.data (generate the actual config files
> using some templates and user-supplied values at runtime)
About sites.xml, we discussed in an email exchange the possibility of
doing that. Luckily, in Swift, sites.xml is a karajan script, so it can
do things like import("anothersites.xml") and so on.
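For the template idea itself, here is a minimal Python sketch of generating
a config fragment from a template plus user-supplied values at runtime. The
XML fragment and the placeholder names (`handle`, `nodes`, `workdir`) are
made up for illustration and do not reflect the real sites.xml or tc.data
formats. `render_site` is a hypothetical helper.

```python
from string import Template

# Illustrative template only; NOT the actual sites.xml schema.
SITE_TEMPLATE = Template(
    '<site name="$handle" nodes="$nodes" workdir="$workdir"/>\n'
)


def render_site(defaults, overrides):
    """Merge user overrides over site defaults and fill in the template.

    Raises KeyError if a placeholder has no value, so a missing setting
    fails at generation time instead of producing a broken config file.
    """
    values = dict(defaults)
    values.update(overrides)
    return SITE_TEMPLATE.substitute(values)


if __name__ == "__main__":
    defaults = {"handle": "lqcd", "nodes": "1", "workdir": "/tmp/swift"}
    # A user asks for more nodes for one MPI run without editing the defaults.
    print(render_site(defaults, {"nodes": "64"}), end="")
```

This keeps the per-run values (such as the number of nodes for MPI jobs)
out of the shared files, which is roughly what the profile mechanism was
being asked to do.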
>
> - DB-mapper: users have elaborate input data structures and keep them
> in the DB, so it would be nice to have a mapper that would read the
> input from the DB. This feature is in the works (?)
>
> - Intermediate results problem (the same as MolDyn): need the
> ability to specify which files to keep and which to discard.
>
> - quoting problem:
> mpirun does not correctly handle the "" passed to wrapper.sh.
> I remember there was also a quoting issue with Condor queues.
This is a problem with their mpirun. However, I guess the PBS provider
could have a flag to do extra quoting for certain job types.
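A rough Python sketch of what such a flag might do, assuming the launcher
passes its arguments through one extra layer of shell parsing before they
reach wrapper.sh. The names `build_command` and `extra_quoting` are
hypothetical and are not existing PBS provider options.

```python
import shlex


def build_command(launcher, executable, args, extra_quoting=False):
    """Build a shell command line for a wrapped job.

    With extra_quoting=True every argument is quoted twice, so it survives
    one additional round of shell parsing (the assumed mpirun behaviour).
    """
    quoted = [shlex.quote(a) for a in args]
    if extra_quoting:
        # Second layer of quoting, consumed by the extra parsing step.
        quoted = [shlex.quote(a) for a in quoted]
    return " ".join([launcher, shlex.quote(executable)] + quoted)


if __name__ == "__main__":
    args = ["input file.dat", ""]  # an empty "" argument is the fragile case
    print(build_command("mpirun", "wrapper.sh", args))
    print(build_command("mpirun", "wrapper.sh", args, extra_quoting=True))
```

Without the extra layer, an empty argument is silently dropped by the
second parse and every later argument shifts position by one, which would
match the kind of breakage described for wrapper.sh.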
>
> We also talked about using Falkon. But since LQCD uses dedicated
> resources (600 or more nodes) and pbs queue checking time is set to
> around 10s, it is not a big issue for them to run large number of jobs.
The last thing we want with them is to throw another thing that might
have problems into the stack.
>
> None of these except for the absolute path problem is a show-stopper.
> Next we'll try to get their SwiftScript running, and push some of the
> requests into 0.3 features.
>
> Yong and Nika