[Swift-devel] Montage workload

Jonathan Monette jonmon at mcs.anl.gov
Thu Apr 12 09:12:10 CDT 2012


So this looks like a problem in the Swift code.  The hang checker is being activated at the start of the execution, which is not good.  Could you point me to where you ran this?  Was this on surveyor?  If it was not on surveyor, I can give it a try.  It looks like the projection phase is trying to project empty files.  This could be because the files actually are empty (i.e., I sent corrupted data) or because Swift cannot find the files but ran mProjectPP anyway.
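
A quick way to tell those two cases apart is to look at the input directory directly. The following is only a minimal sketch, not part of Swift or Montage: the function name find_suspect_inputs is hypothetical, the default directory name raw_dir is taken from the mProjectPP arguments in the log below, and the only format assumption is that a FITS file starts with the keyword SIMPLE. It flags missing directories, zero-byte files, and files without a FITS header; it does not detect images whose pixels are blank for other reasons (for example, no overlap with header.hdr).

```python
from pathlib import Path


def find_suspect_inputs(raw_dir, pattern="*.fits"):
    """Return (path, reason) pairs for inputs that look empty or corrupted.

    Hypothetical helper: checks only file size and the leading FITS keyword.
    """
    root = Path(raw_dir)
    if not root.is_dir():
        # Corresponds to the "Swift cannot find the files" case.
        return [(str(root), "directory not found")]

    suspects = []
    for path in sorted(root.glob(pattern)):
        if path.stat().st_size == 0:
            # Corresponds to the "files are actually empty" case.
            suspects.append((str(path), "zero bytes"))
            continue
        with open(path, "rb") as handle:
            magic = handle.read(6)
        # A FITS primary header begins with the keyword SIMPLE.
        if magic != b"SIMPLE":
            suspects.append((str(path), "no FITS header"))
    return suspects


if __name__ == "__main__":
    import sys

    # Directory defaults to raw_dir, as seen in the job arguments.
    target = sys.argv[1] if len(sys.argv) > 1 else "raw_dir"
    for path, reason in find_suspect_inputs(target):
        print(f"{path}: {reason}")
```

If this reports nothing for the directory the job actually ran against, the inputs are at least present and look like FITS files, which would point more toward the data content or the staging step than toward missing files.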

On Apr 12, 2012, at 12:44 AM, Emalayan Vairavanathan wrote:

> Hi Jon,
> 
> I tried to run the large Montage workload that I recently got from you on both PVFS and MosaStore. The workload failed on both systems (I copied the standard output messages below). I guess this is due to a problem with the workload, because the system works with the small workloads.
> Do you have any idea? Did this workload work for you?
> 
> Thank you
> Emalayan
> 
> 
> Swift trunk swift-r5704 (swift modified locally) cog-r3361 (cog modified locally)
> 
> RunID: 20120412-0530-vj96mfz5
> No events in 10s.
> 
> Registered futures:
> ----
> 
> Waiting threads:
> ----
> 
> No events in 10s.
> 
> Registered futures:
> ----
> 
> Waiting threads:
> ----
> 
> No events in 10s.
> 
> Registered futures:
> ----
> 
> Waiting threads:
> ----
> 
> No events in 10s.
> 
> Registered futures:
> ----
> 
> Waiting threads:
> ----
> 
>  (input): found 4116 files
> No events in 10s.
> 
> Registered futures:
> ----
> 
> Waiting threads:
> ----
> 
> Failed to acquire exclusive lock on log file.
> Progress:  time: Thu, 12 Apr 2012 05:31:02 +0000
> Progress:  time: Progress:  time: Thu, 12 Apr 2012 05:31:11 +0000Thu, 12 Apr 2012 05:31:11 +0000  Initializing:2  Initializing:2
> 
> Progress:  time: Thu, 12 Apr 2012 05:31:12 +0000  Initializing:1023  Selecting site:1
> Progress:  time: Thu, 12 Apr 2012 05:31:13 +0000  Selecting site:1020  Initializing site shared directory:1  Stage in:3
> Progress:  time: Thu, 12 Apr 2012 05:31:15 +0000  Selecting site:1018  Stage in:5  Submitting:1
> Find: http://172.17.3.12:12346
> Find:  keepalive(120), reconnect - http://172.17.3.12:12346
> Passive queue processor initialized. Callback URI is http://172.17.3.12:12345
> Progress:  time: Thu, 12 Apr 2012 05:31:16 +0000  Selecting site:1018  Active:6
> Progress:  time: Thu, 12 Apr 2012 05:31:24 +0000  Selecting site:1018  Active:5 Failed but can retry:1
> EXCEPTION Exception in mProjectPP_wrap:
> Arguments: [-X, raw_dir/2mass-atlas-991207s-j1130256.fits, proj_dir/proj_2mass-atlas-991207s-j1130256.fits, header.hdr]
> Host: persistent-coasters
> Directory: SwiftMontage-20120412-0530-vj96mfz5/jobs/e/mProjectPP_wrap-eozxvrpk
> stderr.txt: 
> stdout.txt: [struct stat="ERROR", msg="All pixels are blank."]
