[Swift-user] coaster reply timeout heartbeats

Mihael Hategan hategan at mcs.anl.gov
Thu Oct 21 19:09:34 CDT 2010


Heartbeats are a way to establish (within a certain interval of time)
that the connection is still alive.

Only one side needs to send heartbeats, and they should be replied to
without question.

The client, which sends the heartbeats, detects a bad connection when no
reply to a heartbeat is received. The "passive" side detects a bad
connection when no heartbeat has been received within twice the heartbeat
interval.
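
As a rough sketch of those two detection rules (this is not the actual
coaster/CoG code; the class names, method names and timings are made up for
illustration), something like the following. In the client log quoted below,
sendTime and now are about two minutes apart, so 120 seconds is used as the
reply timeout in the example; the 60 second interval is purely an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActiveSide:
    """Hypothetical model of the side that sends heartbeats."""
    reply_timeout: float             # seconds to wait for a reply (assumed value)
    sent_at: Optional[float] = None  # send time of the outstanding heartbeat

    def send_heartbeat(self, now: float) -> None:
        # Remember when the heartbeat went out so the reply can be timed.
        self.sent_at = now

    def on_reply(self) -> None:
        # A reply arrived, so nothing is outstanding any more.
        self.sent_at = None

    def connection_bad(self, now: float) -> bool:
        # Bad if a heartbeat is still unanswered after the reply timeout.
        return self.sent_at is not None and now - self.sent_at > self.reply_timeout


@dataclass
class PassiveSide:
    """Hypothetical model of the side that only answers heartbeats."""
    interval: float              # expected heartbeat interval in seconds (assumed)
    last_heartbeat: float = 0.0  # arrival time of the most recent heartbeat

    def on_heartbeat(self, now: float) -> str:
        # Every heartbeat is answered unconditionally.
        self.last_heartbeat = now
        return "reply"

    def connection_bad(self, now: float) -> bool:
        # Bad if nothing has arrived within twice the heartbeat interval.
        return now - self.last_heartbeat > 2 * self.interval


if __name__ == "__main__":
    active = ActiveSide(reply_timeout=120.0)
    active.send_heartbeat(now=0.0)
    print(active.connection_bad(now=60.0))    # still within the timeout
    print(active.connection_bad(now=120.016)) # no reply after the timeout

    passive = PassiveSide(interval=60.0)
    passive.on_heartbeat(now=10.0)
    print(passive.connection_bad(now=130.5))  # more than 2 * interval of silence
```

Time is passed in explicitly instead of using real timers, which keeps the
sketch deterministic. A real channel would run these checks from a timer
thread.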

On Thu, 2010-10-21 at 17:51 -0500, Allan Espinosa wrote:
> Hi,
> 
> Do the HEARTBEAT timeouts occur because the swift client is expecting
> them by default but the coaster service has it disabled by default?
> 
> Below are the logs that gave me the idea.
> 
> Swift client:
> 2010-10-21 10:49:56,404-0500 WARN  Command Command(6761, HEARTBEAT):
> handling reply timeout; sendReqTime=101021-104756.388,
> sendTime=101021-104756.388, now=101021-104956.404
> 2010-10-21 10:49:56,404-0500 INFO  Command Command(6761, HEARTBEAT): re-sending
> 2010-10-21 10:49:56,404-0500 WARN  Command Command(6761,
> HEARTBEAT)fault was: Reply timeout
> org.globus.cog.karajan.workflow.service.ReplyTimeoutException
>         at org.globus.cog.karajan.workflow.service.commands.Command.handleReplyTimeout(Command.java:280)
>         at org.globus.cog.karajan.workflow.service.commands.Command$Timeout.run(Command.java:285)
>         at java.util.TimerThread.mainLoop(Timer.java:512)
>         at java.util.TimerThread.run(Timer.java:462)
> 2010-10-21 10:49:56,588-0500 INFO  GSSChannel Connected to
> https://communicado.ci.uchicago.edu:61999
> 2010-10-21 10:49:56,588-0500 INFO
> AbstractStreamKarajanChannel$Multiplexer (1) Scheduling
> GSSCChannel-https://communicado.ci.uchicago.edu:61999(6)[1543987498:
> {}] for addition
> 
> Persistent service (passive workers):
> Local contacts: [http://128.135.125.17:60999]
> Started local service: http://128.135.125.17:60999
> Started coaster service: https://128.135.125.17:61999
> Started coaster service: https://128.135.125.17:61999
> GSSSChannel-null(0)[1205215856: {}]: Disabling heartbeats (config is null)
> Multiplexer 0 started
> (1) Scheduling GSSSChannel-null(1)[1205215856: {}] for addition
> nullChannel started
> Multiplexer 1 started
> Channel id: u-5410312-12bd0f72d5e--8000-u1d283670-12bd0f72d69--8000
> MetaChannel: 409971196[1205215856: {}] -> null: Disabling heartbeats
> (disabled in config)
> MetaChannel: 409971196[1205215856: {}] -> null.bind ->
> GSSSChannel-null(1)[1205215856: {}]
> Sending Command(1, RLOG) on GSSSChannel-null(1)[1205215856: {}]
> Plan time: 1
> Plan time: 1
> Plan time: 1
> Plan time: 1
> 
> 




