[Swift-user] Running first.swift remotely on NCSA

Andriy Fedorov fedorov at cs.wm.edu
Fri Jun 13 08:27:58 CDT 2008


Michael,

Thank you for the reply. Unfortunately, your suggestions didn't help.

>> <pool handle="Mercury" >
>>     <gridftp  url="gsiftp://gridftp-hg.ncsa.teragrid.org" />
>>     <jobmanager universe="vanilla"
>> url="grid-hg.ncsa.teragrid.org/jobmanager" major="2" />
>
> That doesn't look right. You need a specific job manager, such as "fork"
> or "pbs". I'd recommend trying "fork" for simple testing.
>

I got "grid-hg.ncsa.teragrid.org/jobmanager" from here:
http://www.teragrid.org/userinfo/jobs/gram.php and I think it is just an
alias for "fork". I replaced "jobmanager" with "jobmanager-fork",
but the behavior is exactly the same.

I also tried "jobmanager-pbs", and I could see my job in the
queue, but the result was the same: "Progress:  Executing:1" on the
client host.

>> [fedorov at ri vdsk] swift first.swift
>> Swift 0.5 swift-r1783 cog-r1962
>>
>> RunID: 20080612-2101-yvp36l3c
>> Progress:
>> echo started
>> Progress:  Executing:1
>> Progress:  Executing:1
>> Progress:  Executing:1
>> Progress:  Executing:1
>> Progress:  Executing:1
>> Progress:  Executing:1
>
> This may happen if the callback address for your submit host is unknown
> to the GRAM service or if you're behind a firewall or NAT. If you're
> not, try setting $GLOBUS_HOSTNAME to your DNS address or IP.
>

No, I have a valid IP. I do have $GLOBUS_HOSTNAME set now, but this
doesn't help.
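
One way to double-check the "valid IP" part is sketched below in Python. It
only tests whether an address is globally routable (not private, loopback, or
link-local); it cannot tell whether a firewall sits in between. The function
names are made up for illustration and are not part of Swift or Globus.

```python
import ipaddress
import socket


def is_publicly_routable(address):
    """Return True if the IP address string is globally routable.

    Private (e.g. 192.168.x.x), loopback, and link-local addresses
    return False; these would not work as a GRAM callback address.
    """
    return ipaddress.ip_address(address).is_global


def resolve(hostname):
    """Resolve a hostname to an IPv4 address string (hypothetical helper)."""
    return socket.gethostbyname(hostname)
```

For example, is_publicly_routable(resolve(name)) could be called with the
value that $GLOBUS_HOSTNAME is set to.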

I looked around some more and found directories named
"first-<date>-<time>-<RunID>" in the scratch directory on Mercury. The
log in the "info" directory looks good to me:

Progress  2008-06-12 20:18:06.%N-0500  LOG_START

_____________________________________________________________________________

        Wrapper
_____________________________________________________________________________

DIR=jobs/5/echo-5zu081ui
EXEC=/bin/echo
STDIN=
STDOUT=hello.txt
STDERR=stderr.txt
DIRS=
INF=
OUTF=hello.txt
KICKSTART=
ARGS=Hello, world!
Progress  2008-06-12 20:18:06.%N-0500  CREATE_JOBDIR
Created job directory: jobs/5/echo-5zu081ui
Progress  2008-06-12 20:18:06.%N-0500  CREATE_INPUTDIR
Progress  2008-06-12 20:18:06.%N-0500  LINK_INPUTS
Progress  2008-06-12 20:18:06.%N-0500  EXECUTE
Progress  2008-06-12 20:18:06.%N-0500  EXECUTE_DONE
Job ran successfully
Progress  2008-06-12 20:18:06.%N-0500  COPYING_OUTPUTS
Progress  2008-06-12 20:18:06.%N-0500  RM_JOBDIR
Progress  2008-06-12 20:18:06.%N-0500  TOUCH_SUCCESS
Progress  2008-06-12 20:18:06.%N-0500  END

Note that I got the same apparently successful (?) run when I used plain "jobmanager".
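
A log like the one above can also be checked mechanically. The following
Python sketch extracts the stage names from the "Progress" lines and reports
any that are missing. The list of expected stages is taken only from this one
log, so it is an assumption rather than a documented list, and the function
names are made up for illustration.

```python
import re

# Stage names as they appear in the wrapper log quoted above (assumed
# to be the full set for a successful run; taken from this log only).
EXPECTED_STAGES = [
    "LOG_START", "CREATE_JOBDIR", "CREATE_INPUTDIR", "LINK_INPUTS",
    "EXECUTE", "EXECUTE_DONE", "COPYING_OUTPUTS", "RM_JOBDIR",
    "TOUCH_SUCCESS", "END",
]

# Matches lines such as "Progress  2008-06-12 20:18:06.%N-0500  EXECUTE".
# The client-side "Progress:  Executing:1" lines do not match.
PROGRESS_RE = re.compile(r"^Progress\s+\S+\s+\S+\s+([A-Z_]+)\s*$")


def stages_in_log(text):
    """Return the stage names from 'Progress' lines, in order of appearance."""
    stages = []
    for line in text.splitlines():
        match = PROGRESS_RE.match(line.strip())
        if match:
            stages.append(match.group(1))
    return stages


def missing_stages(text, expected=EXPECTED_STAGES):
    """Return the expected stages that never appear in the log text."""
    seen = set(stages_in_log(text))
    return [stage for stage in expected if stage not in seen]
```

An empty result from missing_stages() would mean the wrapper reached END on
the remote side, which points at the return path rather than the job itself.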

It seems that the problem is in returning the result to the client. I
tried setting GLOBUS_TCP_PORT_RANGE to 45000,45100, as suggested here
http://www-128.ibm.com/developerworks/cn/grid/gr-gsi4intro/index_eng.html,
but this did not help either.
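
As a local sanity check for such a port range, the Python sketch below parses
a "min,max" value like the one above and tries to open a listening TCP socket
on each port. It only shows which ports are free on the submit host; it says
nothing about whether a firewall lets the remote side connect back to them.
The function names are made up for illustration.

```python
import socket


def parse_port_range(value):
    """Parse a 'min,max' string such as '45000,45100' into two ints."""
    low_text, high_text = value.split(",")
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError("invalid port range: %r" % value)
    return low, high


def can_listen(port, host=""):
    """Return True if a TCP listening socket can be bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return True
    except OSError:
        # Port is in use or binding is not permitted.
        return False
    finally:
        sock.close()


def free_ports(value, host=""):
    """Return the ports in the 'min,max' range that can be listened on."""
    low, high = parse_port_range(value)
    return [port for port in range(low, high + 1) if can_listen(port, host)]
```

For example, free_ports("45000,45100") lists the ports in that range that are
currently available on the local machine.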

Does anybody know what is wrong?

Fedorov


