<div dir="ltr"><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Justin bbt</b> <span dir="ltr"><<a href="mailto:justinbbt@gmail.com">justinbbt@gmail.com</a>></span><br>Date: Fri, Sep 19, 2014 at 9:48 PM<br>Subject: problem with running coaster<br>To: Swift User <<a href="mailto:swift-user@ci.uchicago.edu">swift-user@ci.uchicago.edu</a>><br>Cc: Yadu Nand <<a href="mailto:yadudoc1729@gmail.com">yadudoc1729@gmail.com</a>><br><br><br><div dir="ltr"><div><div><div>Hi all,</div><div><br></div><div>I have a problem running Swift on the Microsoft Azure cloud. I have two nodes in the cloud. I installed Swift on one of them, and I can run part01-03 locally without errors. I am trying to use the coaster service to run part04-06 on the other node. Here is the output I get for "swift p4.swift":</div><div><br></div><div><br></div><div>Swift 0.95 RC6 swift-r7900 cog-r3908</div><div>RunID: run005</div><div>Warning: The @ syntax for function invocation is deprecated</div><div>Progress: Sat, 20 Sep 2014 01:36:02+0000</div><div>Exception in thread "Scheduler" java.lang.NullPointerException</div><div><span style="white-space:pre-wrap"> </span>at org.globus.cog.abstraction.impl.common.task.TaskImpl.hashCode(TaskImpl.java:364)</div><div><span style="white-space:pre-wrap"> </span>at java.util.HashMap.hash(HashMap.java:338)</div><div><span style="white-space:pre-wrap"> </span>at java.util.HashMap.get(HashMap.java:556)</div><div><span style="white-space:pre-wrap"> </span>at org.griphyn.vdl.karajan.VDSAdaptiveScheduler.failTask(VDSAdaptiveScheduler.java:400)</div><div><span style="white-space:pre-wrap"> </span>at org.globus.cog.karajan.scheduler.LateBindingScheduler.run(LateBindingScheduler.java:266)</div><div>Progress: Sat, 20 Sep 2014 01:36:03+0000 Selecting site:10</div><div>No events in 1s.</div><div>Finding dependency loops...</div><div><br></div><div>Waiting threads:</div><div>Thread: R-6, waiting on sims (declared on line 
21)</div><div><span style="white-space:pre-wrap"> </span>swift:execute, p4, line 96</div><div><span style="white-space:pre-wrap"> </span>analyze, p4, line 211</div><div><br></div><div>Thread: R-5-2-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-7-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-8-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-9-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-3x2, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-0-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-1-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-6-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5-4-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>Thread: R-5x2-3, waiting on simout (declared on line 24)</div><div><span style="white-space:pre-wrap"> </span>assignment, p4, line 28</div><div><br></div><div>----</div><div>No dependency loops found.</div><div><br></div><div>The following threads are independently hung:</div><div>Thread: R-6, waiting on sims (declared on line 21)</div><div><span style="white-space:pre-wrap"> 
</span>swift:execute, p4, line 96</div><div><span style="white-space:pre-wrap"> </span>analyze, p4, line 211</div><div><br></div><div>----</div><div><br></div><div>Irrecoverable error found. Exiting.</div></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div>This is the content of my coaster-service.conf </div><div><br></div><div>export WORKER_LOCATION=.</div><div>export WORKER_HOSTS="191.238.1.187"</div><div>export WORKER_MODE=ssh</div><div>export WORKER_USERNAME=azureuser</div><div>export IPADDR=191.238.1.33</div><div>export WORKER_LOG_DIR=.</div><div>export WORK=/home/ubuntu/work</div><div>export JOBSPERNODE=1</div><div>export JOBTHROTTLE=10</div><div>export SSH_TUNNELING=yes</div></div><div><br></div><div><br></div><div>And here is the content of start-coaster-service.log </div><br><div><br></div><div><div>Running /home/azureuser/swift-0.95-RC6/bin/coaster-service -nosec -portfile /tmp/tmp.8WkG0mKKzo -localportfile /tmp/tmp.hZvsxQd5J1 -passive</div><div>Switching log to: cps-2014-09-20_01-21-41.log</div><div>2014-09-20 01:21:41,305+0000 WARN CoasterPersistentService Switching log to: cps-2014-09-20_01-21-41.log</div><div>Local contacts: [<a href="http://100.74.60.3:55071" target="_blank">http://100.74.60.3:55071</a>]</div><div>2014-09-20 01:21:41,338+0000 INFO Settings Local contacts: [<a href="http://100.74.60.3:55071" target="_blank">http://100.74.60.3:55071</a>]</div><div>Starting... id=0920-2101410</div><div>2014-09-20 01:21:41,350+0000 INFO BlockQueueProcessor Starting... 
id=0920-2101410</div><div>Started local service: <a href="http://100.74.60.3:55071" target="_blank">http://100.74.60.3:55071</a></div><div>2014-09-20 01:21:41,350+0000 INFO CoasterService Started local service: <a href="http://100.74.60.3:55071" target="_blank">http://100.74.60.3:55071</a></div><div>Started coaster service: <a href="http://100.74.60.3:42972" target="_blank">http://100.74.60.3:42972</a></div><div>2014-09-20 01:21:41,351+0000 INFO CoasterService Started coaster service: <a href="http://100.74.60.3:42972" target="_blank">http://100.74.60.3:42972</a></div><div>Started coaster service: <a href="http://100.74.60.3:42972" target="_blank">http://100.74.60.3:42972</a></div><div>Worker connection URL: <a href="http://100.74.60.3:55071" target="_blank">http://100.74.60.3:55071</a></div><div>Running ssh -N -T -R *:55071:localhost:55071 azureuser@191.238.1.187</div><div>Running ssh azureuser@191.238.1.187 mkdir -p . && mkdir -p .</div><div>Running scp /home/azureuser/swift-0.95-RC6/bin/worker.pl azureuser@191.238.1.187:.</div><div>Running ssh azureuser@191.238.1.187 WORKER_LOGGING_LEVEL= nohup ./worker.pl <a href="http://191.238.1.33:55071" target="_blank">http://191.238.1.33:55071</a> 191.238.1.187 . 
&> /dev/null &</div><div>HeapMax: 13134987264, CrtHeap: 886571008, UsedHeap: 41712768</div><div>2014-09-20 01:21:51,356+0000 INFO CoasterService HeapMax: 13134987264, CrtHeap: 886571008, UsedHeap: 41712768</div><div>HeapMax: 13134987264, CrtHeap: 886571008, UsedHeap: 41712768</div><div>2014-09-20 01:22:01,364+0000 INFO CoasterService HeapMax: 13134987264, CrtHeap: 886571008, UsedHeap: 41712768</div></div><div><br></div><div><br></div><div>I had this problem before when connecting from my laptop to a remote server, and <span style="font-family:arial,sans-serif;font-size:12.7272720336914px;white-space:nowrap">Yadu Nand Babuji</span><span style="font-family:arial,sans-serif;font-size:12.7272720336914px;white-space:nowrap"> told me it was because I did not have a public IP. </span></div><div><font face="arial, sans-serif"><span style="font-size:12.7272720336914px;white-space:nowrap">But now I have the problem </span><span style="white-space:nowrap">again, even though my nodes in the cloud have public IPs. 
</span></font></div><div><span style="font-family:arial,sans-serif;white-space:nowrap">Though, as </span><span style="font-family:arial,sans-serif;font-size:12.7272720336914px;white-space:nowrap">can be seen in the log, the coaster service uses the private IP during the connection </span><br></div><div><span style="font-family:arial,sans-serif;font-size:12.7272720336914px;white-space:nowrap">(</span>100.74.60.3<span style="font-family:arial,sans-serif;font-size:12.7272720336914px;white-space:nowrap">)</span></div><div><span style="font-family:arial,sans-serif;font-size:12.7272720336914px;white-space:nowrap"><br></span></div><div><font face="arial, sans-serif"><span style="white-space:nowrap">Is this really a problem with the public IP? </span></font></div><div><font face="arial, sans-serif"><span style="white-space:nowrap">Does anybody know how to solve this problem? </span></font></div><div><font face="arial, sans-serif"><span style="white-space:nowrap"><br></span></font></div></div>
</div><br></div>