[Swift-devel] Cache error in Swift

Jonathan Monette jonmon at mcs.anl.gov
Thu Sep 15 10:54:53 CDT 2011


My mappings seem to be correct.  It does not look like I am trying to map two things to the same file.
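
For anyone wanting to repeat this kind of check, here is a minimal sketch in Python (not Swift) of one way to look for two inputs that map to the same output name. It assumes the mapping only prepends proj_ to each input name, as described later in this thread. The function names map_output and find_collisions and the sample file names are hypothetical and not part of Swift or of the SwiftMontage scripts.

```python
import re
from collections import defaultdict

def map_output(name):
    # Assumed mapping: match the whole name and prepend "proj_".
    # This only imitates the described mapping; it is not Swift's mapper.
    m = re.fullmatch(r"(.*)", name)
    return m.expand(r"proj_\1")

def find_collisions(input_names):
    # Group input names by the output name they map to and
    # keep only the outputs that have more than one source.
    targets = defaultdict(list)
    for name in input_names:
        targets[map_output(name)].append(name)
    return {out: srcs for out, srcs in targets.items() if len(srcs) > 1}

if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = ["a.fits", "b.fits", "a.fits"]
    for out, srcs in find_collisions(names).items():
        print(out, "<-", srcs)
```

An empty result means no two inputs share an output name under that assumed mapping. It says nothing about collisions introduced elsewhere, for example by replication.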

On Sep 15, 2011, at 5:57 AM, Michael Wilde wrote:

> Jon, can you log your mappings? Or can you verify them from the Swift logs?
> 
> - Mike
> 
> ----- Original Message -----
>> From: "Jonathan Monette" <jonmon at mcs.anl.gov>
>> To: "Mihael Hategan" <hategan at mcs.anl.gov>
>> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Michael Wilde" <wilde at mcs.anl.gov>
>> Sent: Wednesday, September 14, 2011 11:03:11 PM
>> Subject: Re: Cache error in Swift
>> I am pretty sure I am not. All the input files are unique and the
>> output files are all mapped using the regexp mapper and appending
>> proj_ to the front.
>> 
>> But I will check anyways. I seem to be having the error without
>> replications as well. Let me try a couple more runs to see if I added
>> something erroneously.
>> 
>> On Sep 14, 2011, at 10:43 PM, Mihael Hategan wrote:
>> 
>>> I suppose it might be possible to have a race condition when using
>>> replication such that two jobs in the same replication group
>>> complete.
>>> 
>>> But before I go and dig into that, can you double-check that you are
>>> not mapping two things to the same file?
>>> 
>>> On Wed, 2011-09-14 at 17:33 -0500, Jonathan Monette wrote:
>>>> Hello,
>>>>  I just ran my SwiftMontage scripts again with the most recent
>>>>  build of 0.93 source. I received this error:
>>>> Execution failed:
>>>> 	The cache already contains
>>>> 	pads:montage-20110914-1717-nrothtg3/shared/proj_dir/proj_2mass-atlas-000713s-j0870197.fits.
>>>> 
>>>> I received this error after 3652 tasks. All the files are located
>>>> in /home/jonmon/PADS/Swift/SwiftMontage/big/run.0013 on the CI
>>>> machines.
>>>> 
>>>> I wanted to try replications. I have noticed in my scripts that
>>>> when PADS is full of jobs Swift doesn't try to resubmit to Beagle
>>>> even though Beagle shows through showq that it has room for jobs. I
>>>> thought replications would use both sites more efficiently. What I
>>>> mean is I thought that replications would replicate a job onto
>>>> Beagle, since jobs on PADS spend so long just sitting in the queue.
>>>> 
>>>> Please educate me if this is not what I should be doing and if in
>>>> fact there is no workaround for this problem.
>>> 
>>> 
> 
> -- 
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 