[Swift-devel] LQCD meeting at Fermi

Mihael Hategan hategan at mcs.anl.gov
Sat Jun 2 03:33:31 CDT 2007


On Fri, 2007-06-01 at 16:59 -0500, Veronika Nefedova wrote:
> Hi,
> 
> Yong and I have met with Xian-He and his team today to talk over  
> their current problems with the production swift code.
> 
> Some of the major issues we talked about:
> 
> - Separation of concerns: SwiftScript could be made to just describe the
> abstract interfaces and data flows, and the app blocks could be pushed
> into some separate specifications (in a repository or something), in
> which other scripting languages can be used (e.g. Python) to specify how
> to invoke an actual application.

How's that different from application wrappers?

> 
> - Dealing with absolute paths:
>    LQCD uses dcache, which requires copying to/from some absolute path.

This, I think, is the same as the ability to have non-local input and
output files.

> 
> - Run clean up jobs outside pbs (i.e. using the fork manager instead)

We've discussed this before, and there are two choices:
1. Use the file provider. This may be inefficient because most of them,
in particular GridFTP, don't have a recursive delete. The local one,
which they are using, does. This may imply another configuration option.
2. Make sure there's always a fork job manager there and use that. This
means that the local PBS provider needs to become a job manager to the
local provider rather than a stand-alone provider.
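
To illustrate the cost in option 1: without a recursive delete, the
client has to walk the tree and issue one primitive call per entry. The
sketch below is only an illustration and uses the local filesystem in
place of a remote file provider; the function names and the directory
layout are made up, and the operation count stands in for remote round
trips.

```python
import os
import tempfile


def recursive_delete(root):
    """Delete root and everything under it, one primitive call per entry.

    Returns the number of primitive operations (list, remove, rmdir)
    issued, which stands in for the number of remote round trips.
    """
    ops = 1  # one listing of this directory
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            ops += recursive_delete(path)
        else:
            os.remove(path)
            ops += 1
    os.rmdir(root)
    return ops + 1  # count the rmdir of this directory


def make_tree(base):
    """Build a small hypothetical work directory: 2 subdirs, 3 files each."""
    for sub in ("shared", "job-0001"):
        os.makedirs(os.path.join(base, sub))
        for i in range(3):
            with open(os.path.join(base, sub, "f%d.dat" % i), "w") as fh:
                fh.write("x")


workdir = tempfile.mkdtemp()
make_tree(workdir)
op_count = recursive_delete(workdir)
print(op_count)
```

Even this tiny tree needs a call per file and two per directory, which is
why a single cleanup job through a fork job manager may be cheaper.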

> 
> - parameter problem: need to override things in tc.data and sites.xml,
> like the number of nodes for MPI jobs
>    possible solution: put profile specification back in (but we do not
> have derivations, in which we were able to put some profiles).

Can you explain that? VDS != Swift. And we shouldn't talk about Swift
having some literal thing from VDS, but rather the bit that achieves
similar functionality.

>    template-based sites.xml and tc.data (generate the actual config files
> using some templates and user-supplied values at runtime)

About sites.xml, we discussed in an email exchange the possibility of
doing that. Luckily, in Swift, sites.xml is a karajan script, so it can
do things like import("anothersites.xml") and so on.
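
For the template idea in general, a minimal sketch of generating a config
fragment from a template plus user-supplied values could look like the
following. The element names, keys, and values are invented for
illustration and are not the actual sites.xml or tc.data format.

```python
from string import Template

# Hypothetical template; NOT the real sites.xml schema.
SITE_TEMPLATE = Template(
    '<site name="$name">\n'
    '  <setting key="nodes">$nodes</setting>\n'
    '  <setting key="workdir">$workdir</setting>\n'
    '</site>\n'
)


def render_site(values):
    """Fill the template with user-supplied values.

    substitute() raises KeyError if a placeholder has no value, so a
    missing setting is caught at generation time rather than at run time.
    """
    return SITE_TEMPLATE.substitute(values)


# Example values a user might supply at runtime (all made up).
rendered = render_site(
    {"name": "lqcd-cluster", "nodes": 64, "workdir": "/scratch/run"}
)
print(rendered)
```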

> 
> - DB-mapper: users have elaborate input data structures and keep them
> in the DB, so it would be nice to have a mapper that would read the
> input from the DB. This feature is in the works (?)
> 
> - intermediate results problem: the same as MolDyn; need to have the
> ability to specify which files to keep and which to discard.
> 
> - quoting problem:
>    mpirun does not deal correctly with "" arguments that are passed to
> wrapper.sh. I remember there was also a quoting issue with condor queues.

This is a problem with their mpirun. However, I guess the PBS provider
could have a flag to do extra quoting for certain job types.
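
As a rough illustration of what "extra quoting" would buy, assuming the
problem is that the command line gets re-parsed by a shell somewhere
along the way (I have not verified that this is what their mpirun does),
here is what happens to an empty argument and to an argument with a
space. The argument list is made up.

```python
import shlex


def join_naive(args):
    """Join arguments with spaces and no quoting."""
    return " ".join(args)


def join_quoted(args):
    """Quote each argument so that a shell re-parse returns it unchanged."""
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical argument list with an embedded space and an empty argument.
args = ["wrapper.sh", "-m", "hello world", ""]

# shlex.split stands in for the shell re-parsing the command line.
naive_roundtrip = shlex.split(join_naive(args))
quoted_roundtrip = shlex.split(join_quoted(args))

print(naive_roundtrip)
print(quoted_roundtrip)
```

Without quoting, the empty argument disappears and "hello world" is split
in two; with quoting, the list survives intact.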

> 
> We also talked about using Falkon. But since LQCD uses dedicated
> resources (600 or more nodes) and pbs queue checking time is set to
> around 10s, it is not a big issue for them to run large number of jobs.

The last thing we want with them is to throw in another thing that might
have problems in the stack.

> 
> None of these except for the absolute path problem is a show-stopper.
> Next, we'll try to get their SwiftScript running, and push some of the
> requests into 0.3 features.
> 
> Yong and Nika