[Swift-user] Need help debugging strange problem...
Andriy Fedorov
fedorov at cs.wm.edu
Thu Aug 7 09:47:52 CDT 2008
Hi,
I have a Swift script that runs fine on the UC TG site. I am now
trying to add NCSA to the set of execution sites, but I am running into
strange problems that I am not sure how to debug.
First, I submit a simple script (below) to NCSA Mercury with the GT4 Fork
jobmanager, and it works. When I change the jobmanager from "fork" to
"PBS", the Swift execution does not finish after the PBS job
completes. I see the job submitted, queued in PBS, running, and
completing, and I see the output file produced in the scratch
directory, but on the submission site Swift still reports
"Progress: Executing:1". The submission site is the same as in the
working "fork" example, so I don't see how a firewall can be the issue,
and I can telnet to the submission site from NCSA.
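The telnet check above can also be scripted so it is easy to repeat from an NCSA login node. This is only a minimal sketch in Python, assuming the check is a plain TCP connect. The function name can_connect and the host and port in the usage comment are hypothetical placeholders, not values from this setup. A successful connect shows only that the port is reachable, not that job notifications are actually delivered.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP connect;
        # the context manager closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Usage (hypothetical host and port; substitute the submission host and
# a port that the Swift client is listening on):
#     can_connect("submit.example.org", 50000)
```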
Note that I was able to run the same simple test with both the fork and
PBS jobmanagers on the SDSC TG site.
How can I figure out what is wrong with NCSA Mercury?
sites.xml: (as in http://www.teragrid.org/userinfo/jobs/gram.php)
<pool handle="NCSA-GT4">
  <gridftp url="gsiftp://gridftp-hg.ncsa.teragrid.org:2811/" />
  <!-- jobmanager below is the value I change between "PBS" and "fork" -->
  <execution provider="gt4" jobmanager="PBS"
    url="https://grid-hg.ncsa.teragrid.org:8443/wsrf/services/ManagedJobFactoryService"/>
  <workdirectory>/home/ac/fedorov/scratch</workdirectory>
</pool>
tc.data:
NCSA-GT4 NCSA_hostname /sbin/ifconfig INSTALLED INTEL32::LINUX null
hello.swift:
type messagefile{}

(messagefile uc_hostname) hostname2(){
  app{
    NCSA_hostname stdout=@filename(uc_hostname);
  }
}

messagefile uc_hostname<"uc_hostname.txt">;
messagefile ncsa_hostname<"ncsa_hostname.txt">;
ncsa_hostname = hostname2();
--
Andrey Fedorov
Center for Real-Time Computing
College of William and Mary
http://www.cs.wm.edu/~fedorov