[Swift-devel] execution.retries

lixi at uchicago.edu lixi at uchicago.edu
Wed Jun 11 16:02:04 CDT 2008


>Hmm?
>2008-06-10 10:47:27,429-0500 DEBUG WeightedHostScoreScheduler Releasing contact 7
>2008-06-10 10:47:27,430-0500 INFO  WeightedHostScoreScheduler Sorted: [OSG_LIGO_MIT:21.822(51.667):2/4]
>2008-06-10 10:47:27,430-0500 DEBUG WeightedHostScoreScheduler Rand: 15.78147400479908, sum: 100.07797485652034
>2008-06-10 10:47:27,431-0500 DEBUG WeightedHostScoreScheduler Next contact: OSG_LIGO_MIT:21.822(51.667):2/4
>
>
>That seems to be your only contact. Running
>cat /home/lixi/newswift/latest/score/3500/workflowtest-20080610-1045-58kc7p6f.log|grep "Next contact: OSG_LIGO_MIT"|wc
>
>produces: 4376   30632  469760
>
>So there's 4376 site selections there.
>
>If you remove the |wc you can see the evolution of the score.
>
>You only seem to have one site there. Re-trying means full re-scheduling
>(so maybe another site if there is one).
>
>There isn't much marking the start of a try besides the scheduler
>allocating a site. The successful end of a try is represented by
>"JOB_END". Failed -> "APPLICATION_EXCEPTION".
>
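
For reference, the same counting can be done in a few lines of Python instead of grep and wc. This is only a sketch: count_markers and score_history are hypothetical helper names, not part of Swift, and the patterns assume the log lines are not wrapped and look like the ones quoted above. The sample contains only the four lines quoted above, so the JOB_END and APPLICATION_EXCEPTION counts are zero here.

```python
import re

# Hypothetical helpers, not part of Swift. The marker strings are the ones
# named in the quoted reply; the line layout is assumed from the log excerpt.
MARKERS = {
    "site_selections": "Next contact: ",
    "successful_tries": "JOB_END",
    "failed_tries": "APPLICATION_EXCEPTION",
}

# Matches e.g. "Next contact: OSG_LIGO_MIT:21.822(51.667):2/4"
NEXT_CONTACT_RE = re.compile(r"Next contact: ([^:]+):(-?\d+(?:\.\d+)?)\(")


def count_markers(lines):
    """Count how many lines contain each marker string."""
    counts = {name: 0 for name in MARKERS}
    for line in lines:
        for name, marker in MARKERS.items():
            if marker in line:
                counts[name] += 1
    return counts


def score_history(lines):
    """Return (site, score) for every 'Next contact' line, in log order."""
    history = []
    for line in lines:
        match = NEXT_CONTACT_RE.search(line)
        if match:
            history.append((match.group(1), float(match.group(2))))
    return history


# The four log lines quoted above, joined back onto single lines.
sample = [
    "2008-06-10 10:47:27,429-0500 DEBUG WeightedHostScoreScheduler Releasing contact 7",
    "2008-06-10 10:47:27,430-0500 INFO  WeightedHostScoreScheduler Sorted: [OSG_LIGO_MIT:21.822(51.667):2/4]",
    "2008-06-10 10:47:27,430-0500 DEBUG WeightedHostScoreScheduler Rand: 15.78147400479908, sum: 100.07797485652034",
    "2008-06-10 10:47:27,431-0500 DEBUG WeightedHostScoreScheduler Next contact: OSG_LIGO_MIT:21.822(51.667):2/4",
]

print(count_markers(sample))
print(score_history(sample))
```

To run it against a real log, pass an open file object instead of the sample list; both functions only iterate over lines.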

Thanks, I see. :)

However, I have another question. Please check another log file of 
mine on CI: /home/lixi/newswift/test1/workflowtest-20080611-0956-z09bzjs5.log. 
In that file:

2008-06-11 09:56:33,899-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Submitting
2008-06-11 09:56:35,823-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Submitted
2008-06-11 09:56:35,823-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Active
2008-06-11 09:56:35,877-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Completed
2008-06-11 09:56:35,877-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:-0.010(0.994):1/2, 0.01)
2008-06-11 09:56:35,878-0500 DEBUG WeightedHostScoreScheduler Old score: -0.010, new score: 0.000
2008-06-11 09:56:35,878-0500 INFO  LateBindingScheduler Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) Completed. Waiting: 0, Running: 0. Heap size: 12M, Heap free: 1M, Max heap: 63M
2008-06-11 09:56:35,885-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:0.000(1.000):1/2, -0.2)
2008-06-11 09:56:35,885-0500 DEBUG WeightedHostScoreScheduler Old score: 0.000, new score: -0.200
2008-06-11 09:56:35,896-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Submitting
2008-06-11 09:56:35,896-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Submitted
2008-06-11 09:56:35,897-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Active
2008-06-11 09:56:36,129-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Completed
2008-06-11 09:56:36,129-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:-0.200(0.889):1/2, 0.2)
2008-06-11 09:56:36,129-0500 DEBUG WeightedHostScoreScheduler Old score: -0.200, new score: 0.000
2008-06-11 09:56:36,129-0500 INFO  LateBindingScheduler Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) Completed. Waiting: 0, Running: 0. Heap size: 12M, Heap free: 2M, Max heap: 63M
2008-06-11 09:56:36,130-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:0.000(1.000):1/2, -0.2)
2008-06-11 09:56:36,130-0500 DEBUG WeightedHostScoreScheduler Old score: 0.000, new score: -0.200
2008-06-11 09:56:36,131-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193773) setting status to Submitting
2008-06-11 09:56:36,131-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193773) setting status to Submitted
2008-06-11 09:56:36,131-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193773) setting status to Active
2008-06-11 09:56:36,345-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193773) setting status to Completed
2008-06-11 09:56:36,345-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:-0.200(0.889):1/2, 0.2)
2008-06-11 09:56:36,345-0500 DEBUG WeightedHostScoreScheduler Old score: -0.200, new score: 0.000
2008-06-11 09:56:36,346-0500 INFO  LateBindingScheduler Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193773) Completed. Waiting: 0, Running: 0. Heap size: 12M, Heap free: 1M, Max heap: 63M
2008-06-11 09:56:36,347-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:0.000(1.000):1/2, -0.01)
2008-06-11 09:56:36,347-0500 DEBUG WeightedHostScoreScheduler Old score: 0.000, new score: -0.010
2008-06-11 09:56:36,348-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193775) setting status to Submitting
2008-06-11 09:56:36,348-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193775) setting status to Submitted
2008-06-11 09:56:36,348-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193775) setting status to Active
2008-06-11 09:56:36,378-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193775) setting status to Completed
2008-06-11 09:56:36,378-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:-0.010(0.994):1/2, 0.01)
2008-06-11 09:56:36,378-0500 DEBUG WeightedHostScoreScheduler Old score: -0.010, new score: 0.000
2008-06-11 09:56:36,378-0500 INFO  LateBindingScheduler Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193775) Completed. Waiting: 0, Running: 0. Heap size: 12M, Heap free: 1M, Max heap: 63M
2008-06-11 09:56:36,379-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:0.000(1.000):1/2, -0.01)
2008-06-11 09:56:36,380-0500 DEBUG WeightedHostScoreScheduler Old score: 0.000, new score: -0.010
2008-06-11 09:56:36,380-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193777) setting status to Submitting
2008-06-11 09:56:36,380-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193777) setting status to Submitted
2008-06-11 09:56:36,380-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193777) setting status to Active
2008-06-11 09:56:36,408-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193777) setting status to Completed
2008-06-11 09:56:36,408-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:-0.010(0.994):1/2, 0.01)
2008-06-11 09:56:36,408-0500 DEBUG WeightedHostScoreScheduler Old score: -0.010, new score: 0.000
2008-06-11 09:56:36,409-0500 INFO  LateBindingScheduler Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193777) Completed. Waiting: 0, Running: 0. Heap size: 12M, Heap free: 1M, Max heap: 63M
2008-06-11 09:56:36,410-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:0.000(1.000):1/2, -0.01)
2008-06-11 09:56:36,410-0500 DEBUG WeightedHostScoreScheduler Old score: 0.000, new score: -0.010
2008-06-11 09:56:36,410-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193779) setting status to Submitting
2008-06-11 09:56:36,410-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193779) setting status to Submitted
2008-06-11 09:56:36,410-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193779) setting status to Active
2008-06-11 09:56:36,437-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193779) setting status to Completed
2008-06-11 09:56:36,437-0500 DEBUG WeightedHostScoreScheduler multiplyScore(GLOW:-0.010(0.994):1/2, 0.01)
2008-06-11 09:56:36,437-0500 DEBUG WeightedHostScoreScheduler Old score: -0.010, new score: 0.000
2008-06-11 09:56:36,437-0500 INFO  LateBindingScheduler Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193779) Completed. Waiting: 0, Running: 0. Heap size: 12M, Heap free: 1M, Max heap: 63M
2008-06-11 09:56:36,441-0500 INFO  vdl:initshareddir END host=GLOW - Done initializing shared directory

It seems that there are multiple FILE_OPERATION and 
FILE_TRANSFER tasks for the same job when initializing the 
shared directory. What does this mean?
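
One way to check whether these are repeats of one task or several distinct tasks is to group the TaskImpl lines by their identity field. The sketch below does that; group_tasks is a hypothetical helper name, not a Swift tool, and the pattern assumes each log entry sits on a single unwrapped line. The sample holds the status lines for the first two identities in the excerpt above.

```python
import re

# Hypothetical helper, not part of Swift. The layout of the TaskImpl lines
# is assumed from the log excerpt above.
TASK_RE = re.compile(
    r"TaskImpl Task\(type=(\w+), identity=(\S+?)\) setting status to (\w+)"
)


def group_tasks(lines):
    """Map each task identity to its type and the statuses it went through."""
    tasks = {}
    for line in lines:
        match = TASK_RE.search(line)
        if match:
            task_type, identity, status = match.groups()
            entry = tasks.setdefault(identity, {"type": task_type, "statuses": []})
            entry["statuses"].append(status)
    return tasks


# Status lines for the first two task identities in the excerpt.
sample = [
    "2008-06-11 09:56:33,899-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Submitting",
    "2008-06-11 09:56:35,823-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Submitted",
    "2008-06-11 09:56:35,823-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Active",
    "2008-06-11 09:56:35,877-0500 DEBUG TaskImpl Task(type=FILE_OPERATION, identity=urn:0-0-1-1213196193768) setting status to Completed",
    "2008-06-11 09:56:35,896-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Submitting",
    "2008-06-11 09:56:35,896-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Submitted",
    "2008-06-11 09:56:35,897-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Active",
    "2008-06-11 09:56:36,129-0500 DEBUG TaskImpl Task(type=FILE_TRANSFER, identity=urn:0-0-1-1213196193771) setting status to Completed",
]

for identity, info in group_tasks(sample).items():
    print(identity, info["type"], " -> ".join(info["statuses"]))
```

In the excerpt above, every task has its own identity (four FILE_OPERATION and two FILE_TRANSFER tasks), and each one goes from Submitting to Completed exactly once.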


