[Swift-user] Need help debugging strange problem...
Andriy Fedorov
fedorov at cs.wm.edu
Thu Aug 7 11:23:53 CDT 2008
Ben,
I tried what you suggested. globusrun-ws works when submitting from UC
to NCSA with the Fork factory type:
[fedorov at TG/UC:tg-login1 ~/swiftBiofem] globusrun-ws -submit -F
https://grid-hg.ncsa.teragrid.org:8443/wsrf/services/ManagedJobFactoryService
-Ft Fork -job-command /bin/hostname
Submitting job...Done.
Job ID: uuid:3b8f1662-649c-11dd-9347-0007e9d811ce
Termination time: 08/08/2008 16:16 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
But it fails when I use the PBS factory type: globusrun-ws does not exit,
even though I can see that the job has finished at NCSA.
[fedorov at TG/UC:tg-login1 ~/swiftBiofem] globusrun-ws -submit -F
https://grid-hg.ncsa.teragrid.org:8443/wsrf/services/ManagedJobFactoryService
-Ft PBS -job-command /bin/hostname
Submitting job...Done.
Job ID: uuid:dc23433c-649c-11dd-9671-0007e9d811ce
Termination time: 08/08/2008 16:21 GMT
Current job state: Unsubmitted
I am going to report this to TG help.
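
For anyone who wants to repeat the Fork vs. PBS comparison without a
terminal stuck on the hanging case, here is a rough sketch. It only wraps
the two globusrun-ws commands shown above in a timeout. The helper names
(build_command, try_factory), the 300 second timeout, and the assumption
that a zero exit code means success are mine, not part of Globus or Swift.

```python
import subprocess

# Factory URL copied from the commands in this thread.
FACTORY = ("https://grid-hg.ncsa.teragrid.org:8443"
           "/wsrf/services/ManagedJobFactoryService")


def build_command(factory_type, job_command="/bin/hostname"):
    """Return the globusrun-ws argument list for one factory type."""
    return ["globusrun-ws", "-submit", "-F", FACTORY,
            "-Ft", factory_type, "-job-command", job_command]


def try_factory(factory_type, timeout_s=300):
    """Run one submission and classify the outcome.

    timeout_s is an arbitrary illustrative value; a busy PBS queue may
    legitimately need longer before the job even starts.
    """
    try:
        result = subprocess.run(build_command(factory_type),
                                capture_output=True, text=True,
                                timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The client never exited, as in the PBS case above.
        return "hung"
    except FileNotFoundError:
        # globusrun-ws is not on PATH on this machine.
        return "missing"
    # Assumption: exit code 0 means the job reached Done.
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    for factory_type in ("Fork", "PBS"):
        print(factory_type, try_factory(factory_type))
```

If Fork reports "ok" and PBS reports "hung" while the job shows as
finished on the remote side, that matches what I am seeing here.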
--
Andrey Fedorov
Center for Real-Time Computing
College of William and Mary
http://www.cs.wm.edu/~fedorov
On Thu, Aug 7, 2008 at 11:27 AM, Ben Clifford <benc at hawaga.org.uk> wrote:
> there is a somewhat common misconfiguration of gram4 on the server side
> where it is wired into the local queueing system incorrectly so that
> completion notifications do not find their way back. this matches the
> symptoms you describe - that fork works but that pbs doesn't, but that the
> job appears to have run.
>
> I just tried a submission using the GT4 command line job submission
> command:
>
> $ globusrun-ws -submit -F
> https://grid-hg.ncsa.teragrid.org:8443/wsrf/services/ManagedJobFactoryService
> -Ft Fork -job-command /bin/hostname
> Submitting job...
>
>
>
> but it appears to hang without submitting. not sure what is happening with
> that site...
>
> Aside from that, my advice for diagnosis would be to try the above command
> with both Fork and PBS and see if you get the same difference in behaviour
> between the two.
>
> --
>