[Swift-devel] Re: [Swift-user] Failed to link input file

Mihael Hategan hategan at mcs.anl.gov
Tue Jun 3 15:30:18 CDT 2008


On Tue, 2008-06-03 at 15:17 -0500, lixi at uchicago.edu wrote:
> >In this case, you already did. The fact that you get any message
> >whatsoever from wrapper.sh ("cannot link input file" is one of them)
> >means that it did work at the point the job was started.
> >
> 
> I don't mean checking it during the execution of the Swift
> workflow. I mean: can I run some extra scripts beforehand to
> check the availability of the shared file systems on the
> remote sites?

It's not so much "on remote sites" as it is "on each node of the remote
site".

What I was trying to point out, however, is that in the likely short
period of time between the wrapper being started and it trying to link
the input files, the node went from good to bad.

So I'm not sure how useful a check made even earlier would be. It only
lengthens the time between the check and the moment the files are
actually linked.

But no, I don't have such a script. I wouldn't even know how to do it,
since I'm not aware of how I could force queuing systems to send my jobs
to specific nodes. Well, besides sending a large number of jobs and
hoping that eventually each node will get at least one. Which is silly.
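
To put a rough number on the "large number of jobs" idea, here is a
minimal Python sketch, not something that exists in Swift. The names
probe_shared_dir and expected_jobs_to_cover are made up for
illustration. The probe only tries to create a file and a symlink in a
directory (an assumption about what "linking an input file" needs), and
the estimate assumes the scheduler picks nodes uniformly at random,
which real queuing systems do not promise.

```python
import os
import socket
import tempfile


def probe_shared_dir(path):
    """Check that a file and a symlink to it can be created in `path`.

    Returns (hostname, ok, detail). The result only describes this node
    at this instant; the node can still go bad right afterwards.
    """
    host = socket.gethostname()
    try:
        fd, src = tempfile.mkstemp(prefix="probe-", dir=path)
        os.close(fd)
    except OSError as exc:
        return host, False, "cannot create file: %s" % exc
    dst = src + ".lnk"
    try:
        os.symlink(src, dst)
        return host, True, "ok"
    except OSError as exc:
        return host, False, "cannot link file: %s" % exc
    finally:
        # Clean up whatever was created, ignoring files that are missing.
        for name in (dst, src):
            try:
                os.remove(name)
            except OSError:
                pass


def expected_jobs_to_cover(num_nodes):
    """Average number of jobs needed before every node has run one.

    Assumes each job lands on a uniformly random node (coupon collector):
    num_nodes * (1 + 1/2 + ... + 1/num_nodes).
    """
    return num_nodes * sum(1.0 / k for k in range(1, num_nodes + 1))
```

Under that uniform assumption a 100-node site needs about 519 probe
jobs on average, and even then nothing guarantees that every node was
actually visited.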

> 
> You know, currently, I'm running some calibration scripts
> which include globus-job-run and globus-url-copy tasks to
> learn the performance of remote sites. Based on the
> results, I can assign initial scores and filter out some
> sites before running the Swift workflow. Since shared
> file system problems also cause the workflow to fail, I
> think I need to add such a check to my current scoring
> method.

I have a suspicion Swift itself causes the problem. It's like looking at
things with a hammer.

Perhaps Ben can look at the performance data and get some insight.

> 
> Thanks,
> 
> Xi



