<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
This message was held by our mailing list. Is this still an issue?
<br>
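If it is, one way to confirm the hang from the console output is to check whether the last few progress lines show jobs in "Submitted", nothing in "Active", and no change in "Finished successfully", as at the tail of your output. Below is a minimal Python sketch of that check. It is not part of Swift: the function names are made up for illustration, and it assumes each "Progress:" entry sits on a single line as Swift prints it (not re-wrapped by a mail client).
<br>
<pre>
```python
import re

# Matches "Name:count" pairs such as "Submitted:330" or "Finished successfully:590".
STATE_RE = re.compile(r"([A-Z][A-Za-z ]*?):(\d+)")


def parse_progress(line):
    """Return the state counts from one 'Progress:' line as a dict."""
    # Skip the timestamp: keep only the text after the UTC offset (e.g. "-0500").
    parts = re.split(r"\s[+-]\d{4}(?:\s|$)", line, maxsplit=1)
    states = parts[1] if len(parts) == 2 else ""
    return {name.strip(): int(count) for name, count in STATE_RE.findall(states)}


def is_stalled(lines, window=3):
    """True if the last `window` progress lines show queued jobs,
    no active jobs, and no new completions."""
    progress = [parse_progress(l) for l in lines if l.startswith("Progress:")]
    recent = progress[-window:]
    if len(recent) != window:
        return False  # not enough history to decide
    finished = {counts.get("Finished successfully", 0) for counts in recent}
    return (
        len(finished) == 1
        and all(counts.get("Active", 0) == 0 for counts in recent)
        and all(counts.get("Submitted", 0) != 0 for counts in recent)
    )


if __name__ == "__main__":
    # Last three entries of the quoted output, joined onto single lines.
    tail = [
        "Progress: time: Tue, 24 Jun 2014 12:03:00 -0500 Submitted:330 Finished successfully:590",
        "Progress: time: Tue, 24 Jun 2014 12:03:30 -0500 Submitted:330 Finished successfully:590",
        "Progress: time: Tue, 24 Jun 2014 12:04:00 -0500 Submitted:330 Finished successfully:590",
    ]
    print("stalled" if is_stalled(tail) else "still making progress")
```
</pre>
The window of three lines is an arbitrary choice; with progress entries printed every 30 seconds once the run goes quiet, a larger window would make a false alarm less likely.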
<br>
On 6/24/2014 7:32 PM, Hemant Sharma wrote:<br>
</div>
<blockquote cite="mid:53A9B631.70706@anl.gov" type="cite">Hi guys,
<br>
<br>
I'm having a problem with job queuing in Swift. I have a Swift
script that should execute about 27,000 iterations. To limit the
initial memory footprint, I created the following config
file:
<br>
<br>
use.provider.staging=false
<br>
provider.staging.pin.swiftfiles=false
<br>
use.wrapper.staging=false
<br>
status.mode=provider
<br>
wrapperlog.always.transfer=true
<br>
execution.retries=0
<br>
lazy.errors=true
<br>
sitedir.keep=true
<br>
file.gc.enabled=false
<br>
wrapper.parameter.mode=files
<br>
foreach.max.threads=330
<br>
<br>
Sometimes, when I execute the script, it starts with 320 active
jobs (on 320 processors), but after some time it gets stuck
with 330 submitted jobs and none of them active. Example
screen output:
<br>
<br>
Swift 0.94 swift-r6637 cog-r3742
<br>
<br>
RunID: 20140624-1200-qke4z0v8
<br>
Progress: time: Tue, 24 Jun 2014 12:00:30 -0500
<br>
Progress: time: Tue, 24 Jun 2014 12:00:31 -0500 Selecting
site:328 Initializing site shared directory:1 Stage in:1
<br>
Progress: time: Tue, 24 Jun 2014 12:00:32 -0500 Selecting
site:10 Stage in:277 Submitting:3 Submitted:40
<br>
Progress: time: Tue, 24 Jun 2014 12:00:38 -0500 Selecting
site:10 Submitted:319 Active:1
<br>
Progress: time: Tue, 24 Jun 2014 12:00:43 -0500 Selecting
site:10 Active:319 Checking status:1
<br>
Progress: time: Tue, 24 Jun 2014 12:00:44 -0500 Selecting site:1
Stage in:20 Active:208 Checking status:31 Stage out:70
Finished successfully:21
<br>
Progress: time: Tue, 24 Jun 2014 12:00:45 -0500 Stage in:11
Active:120 Checking status:10 Stage out:189 Finished
successfully:31
<br>
Progress: time: Tue, 24 Jun 2014 12:00:46 -0500 Stage in:25
Active:117 Stage out:188 Finished successfully:54
<br>
Progress: time: Tue, 24 Jun 2014 12:00:47 -0500 Initializing:1
Selecting site:1 Stage in:46 Active:118 Stage out:164 Finished
successfully:86
<br>
Progress: time: Tue, 24 Jun 2014 12:00:48 -0500 Selecting site:2
Stage in:102 Submitting:1 Submitted:2 Active:165 Checking
status:1 Stage out:57 Finished successfully:199
<br>
Progress: time: Tue, 24 Jun 2014 12:00:49 -0500 Submitted:5
Active:324 Checking status:1 Finished successfully:265
<br>
Progress: time: Tue, 24 Jun 2014 12:00:50 -0500 Submitted:12
Active:317 Checking status:1 Finished successfully:272
<br>
Progress: time: Tue, 24 Jun 2014 12:00:51 -0500 Submitted:22
Active:307 Finished successfully:283
<br>
Progress: time: Tue, 24 Jun 2014 12:00:52 -0500 Selecting site:1
Stage in:13 Submitted:47 Active:223 Stage out:46 Finished
successfully:321
<br>
Progress: time: Tue, 24 Jun 2014 12:00:53 -0500 Stage in:28
Submitted:73 Active:153 Stage out:75 Finished successfully:362
<br>
Progress: time: Tue, 24 Jun 2014 12:00:55 -0500 Submitted:182
Active:147 Checking status:1 Finished successfully:442
<br>
Progress: time: Tue, 24 Jun 2014 12:00:57 -0500 Submitted:183
Active:146 Checking status:1 Finished successfully:443
<br>
Progress: time: Tue, 24 Jun 2014 12:01:00 -0500 Submitted:185
Active:144 Checking status:1 Finished successfully:445
<br>
Progress: time: Tue, 24 Jun 2014 12:01:01 -0500 Submitted:186
Active:143 Checking status:1 Finished successfully:446
<br>
Progress: time: Tue, 24 Jun 2014 12:01:02 -0500 Submitted:190
Active:139 Checking status:1 Finished successfully:450
<br>
Progress: time: Tue, 24 Jun 2014 12:01:05 -0500 Submitted:193
Active:136 Checking status:1 Finished successfully:453
<br>
Progress: time: Tue, 24 Jun 2014 12:01:07 -0500 Submitted:196
Active:133 Checking status:1 Finished successfully:456
<br>
Progress: time: Tue, 24 Jun 2014 12:01:09 -0500 Submitted:198
Active:131 Checking status:1 Finished successfully:458
<br>
Progress: time: Tue, 24 Jun 2014 12:01:10 -0500 Stage in:5
Submitted:202 Active:63 Stage out:60 Finished successfully:467
<br>
Progress: time: Tue, 24 Jun 2014 12:01:11 -0500 Submitted:273
Active:56 Checking status:1 Finished successfully:533
<br>
Progress: time: Tue, 24 Jun 2014 12:01:13 -0500 Submitted:282
Active:47 Checking status:1 Finished successfully:542
<br>
Progress: time: Tue, 24 Jun 2014 12:01:14 -0500 Submitting:1
Submitted:292 Active:37 Finished successfully:553
<br>
Progress: time: Tue, 24 Jun 2014 12:01:15 -0500 Submitted:298
Active:31 Checking status:1 Finished successfully:558
<br>
Progress: time: Tue, 24 Jun 2014 12:01:16 -0500 Submitted:305
Active:24 Checking status:1 Finished successfully:565
<br>
Progress: time: Tue, 24 Jun 2014 12:01:17 -0500 Submitted:307
Active:22 Checking status:1 Finished successfully:567
<br>
Progress: time: Tue, 24 Jun 2014 12:01:18 -0500 Submitted:313
Active:16 Checking status:1 Finished successfully:573
<br>
Progress: time: Tue, 24 Jun 2014 12:01:20 -0500 Submitted:315
Active:14 Checking status:1 Finished successfully:575
<br>
Progress: time: Tue, 24 Jun 2014 12:01:21 -0500 Submitted:317
Active:12 Checking status:1 Finished successfully:577
<br>
Progress: time: Tue, 24 Jun 2014 12:01:22 -0500 Submitted:319
Active:10 Checking status:1 Finished successfully:579
<br>
Progress: time: Tue, 24 Jun 2014 12:01:23 -0500 Submitted:320
Active:9 Checking status:1 Finished successfully:580
<br>
Progress: time: Tue, 24 Jun 2014 12:01:25 -0500 Submitted:323
Active:6 Checking status:1 Finished successfully:583
<br>
Progress: time: Tue, 24 Jun 2014 12:01:26 -0500 Submitted:324
Active:5 Checking status:1 Finished successfully:584
<br>
Progress: time: Tue, 24 Jun 2014 12:01:27 -0500 Submitted:325
Active:4 Checking status:1 Finished successfully:585
<br>
Progress: time: Tue, 24 Jun 2014 12:01:29 -0500 Submitted:326
Active:3 Checking status:1 Finished successfully:586
<br>
Progress: time: Tue, 24 Jun 2014 12:01:36 -0500 Submitted:327
Active:2 Checking status:1 Finished successfully:587
<br>
Progress: time: Tue, 24 Jun 2014 12:01:39 -0500 Submitted:328
Active:1 Checking status:1 Finished successfully:588
<br>
Progress: time: Tue, 24 Jun 2014 12:01:50 -0500 Submitted:329
Checking status:1 Finished successfully:589
<br>
Progress: time: Tue, 24 Jun 2014 12:02:00 -0500 Submitted:330
Finished successfully:590
<br>
Progress: time: Tue, 24 Jun 2014 12:02:30 -0500 Submitted:330
Finished successfully:590
<br>
Progress: time: Tue, 24 Jun 2014 12:03:00 -0500 Submitted:330
Finished successfully:590
<br>
Progress: time: Tue, 24 Jun 2014 12:03:30 -0500 Submitted:330
Finished successfully:590
<br>
Progress: time: Tue, 24 Jun 2014 12:04:00 -0500 Submitted:330
Finished successfully:590
<br>
<br>
The issue does not occur on every run, and the number of jobs that
finish successfully before the hang varies. Any ideas on how to solve
this problem? I'm attaching the log file.
<br>
<br>
Thanks,
<br>
Hemant
<br>
<br>
Hemant Sharma
<br>
Post-doctoral Researcher
<br>
Advanced Photon Source
<br>
Argonne National Laboratory
<br>
Lemont IL 60429
<br>
USA
<br>
<br>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Justin M Wozniak
</pre>
</body>
</html>