[Swift-user] Montage+Swift+Coasters

Emalayan Vairavanathan svemalayan at yahoo.com
Thu Jan 19 19:55:20 CST 2012

Hi Ketan,

This was with swift-0.92.1. I have now downloaded the latest Swift 0.93 and am getting totally different error messages with it. I can ask Jon about these messages. (These scripts were working well with Swift alone.)

Please let me know if you have any idea. 



Swift 0.93 swift-r5501 cog-r3350

RunID: 20120119-1749-rjshh1r9
 (input): found 10 files
Progress:  time: Thu, 19 Jan 2012 17:49:20 -0800
Find: http://localhost:1984
Find:  keepalive(120), reconnect - http://localhost:1984
Progress:  time: Thu, 19 Jan 2012 17:49:22 -0800  Stage in:1  Submitted:9
Progress:  time: Thu, 19 Jan 2012 17:49:25 -0800  Active:9  Stage out:1
Progress:  time: Thu, 19 Jan 2012 17:49:26 -0800  Stage out:3  Finished successfully:7
Progress:  time: Thu, 19 Jan 2012 17:49:28 -0800  Active:1  Finished successfully:10
Progress:  time: Thu, 19 Jan 2012 17:49:29 -0800  Stage in:1  Submitting:11  Submitted:6  Finished successfully:12
Progress:  time: Thu, 19 Jan 2012 17:49:30 -0800  Stage in:4  Submitted:1  Active:6  Stage out:2  Finished successfully:17
Progress:  time: Thu, 19 Jan 2012 17:49:31 -0800  Active:1  Finished successfully:30
Exception in mConcatFit:
Arguments: [_concurrent/status_tbl-7a8340c2-045d-4039-a77c-00429b78d9c9-5, fits.tbl, stat_dir]
Host: localhost
Directory: SwiftMontage-20120119-1749-rjshh1r9/jobs/b/mConcatFit-b1sa4vlk
- - -

Caused by: null
Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 520
Execution failed:
    back_list:Table = org.griphyn.vdl.mapping.DataDependentException - Closed not derived due to errors in data dependencies

 From: Ketan Maheshwari <ketancmaheshwari at gmail.com>
To: Emalayan Vairavanathan <svemalayan at yahoo.com> 
Cc: swift user <swift-user at ci.uchicago.edu> 
Sent: Thursday, 19 January 2012 4:49 PM
Subject: Re: [Swift-user] Montage+Swift+Coasters


From your symptoms, it seems you are facing the same issue I have been seeing. Could you tell me more about the amount of data that needs to be staged for the Montage stages during which these warnings turn up? How long after the start of your workflow do these messages first appear?

Also, what version of Swift is this?


On Thu, Jan 19, 2012 at 5:51 PM, Emalayan Vairavanathan <svemalayan at yahoo.com> wrote:

Dear All,
>I have a problem running Montage with Coasters (on our local cluster, with no batch scheduler). After a few stages the Swift runtime continuously prints the warnings below. Any ideas? Should I increase the heartbeat count?
>Everything works fine when I try to run the same montage-scripts with swift on a single machine.
>Thank you
>2012-01-19 15:38:09,207-0800 WARN  Command Command(119, HEARTBEAT): handling reply timeout; sendReqTime=120119-153609.206, sendTime=120119-153609.206, 
>2012-01-19 15:38:09,207-0800 INFO  Command Command(119, HEARTBEAT): re-sending
>2012-01-19 15:38:09,209-0800 WARN  Command Command(119, HEARTBEAT)fault was: Reply timeout
>        at org.globus.cog.karajan.workflow.service.commands.Command.handleReplyTimeout(Command.java:288)
>        at org.globus.cog.karajan.workflow.service.commands.Command$Timeout.run(Command.java:293)
>        at java.util.TimerThread.mainLoop(Timer.java:534)
>        at java.util.TimerThread.run(Timer.java:484)
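
For reference, the gap between the send time and the timeout in the WARN line above can be checked directly. This is only a minimal Python sketch: it assumes sendReqTime is formatted as yyMMdd-HHmmss.SSS (inferred from the log line itself, not from Swift documentation), and the helper name elapsed_seconds is made up for illustration.

```python
from datetime import datetime

def elapsed_seconds(logged_at: str, send_req_time: str) -> float:
    """Seconds between a sendReqTime value and the log line's own timestamp."""
    # Log4j-style timestamp at the start of the WARN line (timezone suffix dropped).
    logged = datetime.strptime(logged_at, "%Y-%m-%d %H:%M:%S,%f")
    # Assumption: sendReqTime is yyMMdd-HHmmss.SSS in the same timezone.
    sent = datetime.strptime(send_req_time, "%y%m%d-%H%M%S.%f")
    return (logged - sent).total_seconds()

# Values copied from the WARN line quoted above.
gap = elapsed_seconds("2012-01-19 15:38:09,207", "120119-153609.206")
print(f"{gap:.3f} s between sending the HEARTBEAT and the reply timeout")
```

Under that assumption the gap comes out at almost exactly two minutes, which would suggest the heartbeat reply never arrived within a 120-second timeout, rather than arriving slightly late. Whether that timeout is the same setting as the keepalive(120) value printed by the 0.93 run is a guess.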
