[Swift-devel] Strange Problem with TG-UCANL

Ian Foster foster at mcs.anl.gov
Thu Oct 25 13:03:00 CDT 2007


Can we decide that we always use kickstart?

Andrew Robert Jamieson wrote:
> Thanks for the suggestion; unfortunately, I am not using kickstart.
>
> On Thu, 25 Oct 2007, Veronika Nefedova wrote:
>
>> If you are using kickstart, try this setting (on TG-UC):
>> gridlaunch="/home/nefedova/pegasus/src/tools/kickstart/kickstart" in 
>> your sites.xml file (replace the one you have with this one)
>>
>> Nika
>>
>> On Oct 25, 2007, at 10:50 AM, Andrew Robert Jamieson wrote:
>>
>>> Any thoughts on why this would happen on a simple "hello world"
>>> (see below)?
>>> Thanks,
>>> Andrew
>>>
>>>
>>> ********************
>>> andrewj at tg-viz-login1:~/CADGrid/Swifty/vdsk-0.3-dev/examples/vdsk> 
>>> swift -debug -tc.file ~/CADGrid/Swifty/UCANL-tc.data -sites.file 
>>> ~/.swift/sites.xml first.swift
>>> Recompilation suppressed.
>>> Using sites  /home/andrewj/.swift/sites.xml
>>> Using tc.data: /home/andrewj/CADGrid/Swifty/UCANL-tc.data
>>> Swift v0.3-dev r1339
>>>
>>> Swift v0.3-dev r1339
>>>
>>> RunID: 20071025-1044-zo4kzfjg
>>> RunID: 20071025-1044-zo4kzfjg
>>> echo started
>>> START thread=0 tr=echo
>>> START host=UCANL - Initializing shared directory
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080146) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080146) setting 
>>> status to Completed
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080149) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080149) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080149) setting 
>>> status to Completed
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080153) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080153) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080153) setting 
>>> status to Completed
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080156) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080156) setting 
>>> status to Completed
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080158) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080158) setting 
>>> status to Completed
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080160) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080160) setting 
>>> status to Completed
>>> END host=UCANL - Done initializing shared directory
>>> THREAD_ASSOCIATION jobid=echo-0gj1k5ji thread=0 host=UCANL
>>> START jobid=echo-0gj1k5ji host=UCANL - Initializing directory structure
>>> START path= dir=first-20071025-1044-zo4kzfjg/shared - Creating 
>>> directory structure
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080162) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080162) setting 
>>> status to Completed
>>> END jobid=echo-0gj1k5ji - Done initializing directory structure
>>> START jobid=echo-0gj1k5ji - Staging in files
>>> END jobid=echo-0gj1k5ji - Staging in finished
>>> JOB_START jobid=echo-0gj1k5ji tr=echo arguments=[Hello, world!] 
>>> tmpdir=first-20071025-1044-zo4kzfjg/echo-0gj1k5ji host=UCANL
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080164) setting 
>>> status to Submitted
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080164) setting 
>>> status to Active
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080164) setting 
>>> status to Completed
>>> START jobid=echo-0gj1k5ji
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080166) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080166) setting 
>>> status to Failed 
>>> org.globus.cog.abstraction.impl.file.FileResourceException: Cannot 
>>> delete /disks/scratchgpfs1/andrewj/first-20071025-1044-zo4kzfjg/ 
>>> status/echo-0gj1k5ji-success
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080168) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080168) setting 
>>> status to Completed
>>> NO_STATUS_FILE jobid=echo-0gj1k5ji - Both status files are missing
>>> APPLICATION_EXCEPTION jobid=echo-0gj1k5ji - Application exception: 
>>> No status file was found. Check the shared filesystem on UCANL
>>>         sys:throw @ vdl-int.k, line: 96
>>>         sys:else @ vdl-int.k, line: 94
>>>         sys:if @ vdl-int.k, line: 82
>>>         sys:try @ vdl-int.k, line: 70
>>>         vdl:checkjobstatus @ vdl-int.k, line: 379
>>>         sys:sequential @ vdl-int.k, line: 355
>>>         sys:try @ vdl-int.k, line: 354
>>>         task:allocatehost @ vdl-int.k, line: 336
>>>         vdl:execute2 @ execute-default.k, line: 23
>>>         sys:restartonerror @ execute-default.k, line: 21
>>>         sys:sequential @ execute-default.k, line: 19
>>>         sys:try @ execute-default.k, line: 18
>>>         sys:if @ execute-default.k, line: 17
>>>         sys:then @ execute-default.k, line: 16
>>>         sys:if @ execute-default.k, line: 15
>>>         vdl:execute @ first.kml, line: 16
>>>         greeting @ first.kml, line: 43
>>>         vdl:mainp @ first.kml, line: 42
>>>         mainp @ vdl.k, line: 148
>>>         vdl:mains @ first.kml, line: 41
>>>         vdl:mains @ first.kml, line: 41
>>>         rlog:restartlog @ first.kml, line: 39
>>>         kernel:project @ first.kml, line: 2
>>>         first-20071025-1044-zo4kzfjg
>>>
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080170) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080170) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080170) setting 
>>> status to Failed Exception in getFile
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080173) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080173) setting 
>>> status to Completed
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080176) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080176) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080176) setting 
>>> status to Failed Exception in getFile
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080179) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080179) setting 
>>> status to Completed
>>> THREAD_ASSOCIATION jobid=echo-1gj1k5ji thread=0 host=UCANL
>>> START jobid=echo-1gj1k5ji host=UCANL - Initializing directory structure
>>> END jobid=echo-1gj1k5ji - Done initializing directory structure
>>> START jobid=echo-1gj1k5ji - Staging in files
>>> END jobid=echo-1gj1k5ji - Staging in finished
>>> JOB_START jobid=echo-1gj1k5ji tr=echo arguments=[Hello, world!] 
>>> tmpdir=first-20071025-1044-zo4kzfjg/echo-1gj1k5ji host=UCANL
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080183) setting 
>>> status to Submitted
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080183) setting 
>>> status to Active
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080183) setting 
>>> status to Completed
>>> START jobid=echo-1gj1k5ji
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080185) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080185) setting 
>>> status to Failed 
>>> org.globus.cog.abstraction.impl.file.FileResourceException: Cannot 
>>> delete /disks/scratchgpfs1/andrewj/first-20071025-1044-zo4kzfjg/ 
>>> status/echo-1gj1k5ji-success
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080187) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080187) setting 
>>> status to Completed
>>> NO_STATUS_FILE jobid=echo-1gj1k5ji - Both status files are missing
>>> APPLICATION_EXCEPTION jobid=echo-1gj1k5ji - Application exception: 
>>> No status file was found. Check the shared filesystem on UCANL
>>>         sys:throw @ vdl-int.k, line: 96
>>>         sys:else @ vdl-int.k, line: 94
>>>         sys:if @ vdl-int.k, line: 82
>>>         sys:try @ vdl-int.k, line: 70
>>>         vdl:checkjobstatus @ vdl-int.k, line: 379
>>>         sys:sequential @ vdl-int.k, line: 355
>>>         sys:try @ vdl-int.k, line: 354
>>>         task:allocatehost @ vdl-int.k, line: 336
>>>         vdl:execute2 @ execute-default.k, line: 23
>>>         sys:restartonerror @ execute-default.k, line: 21
>>>         sys:sequential @ execute-default.k, line: 19
>>>         sys:try @ execute-default.k, line: 18
>>>         sys:if @ execute-default.k, line: 17
>>>         sys:then @ execute-default.k, line: 16
>>>         sys:if @ execute-default.k, line: 15
>>>         vdl:execute @ first.kml, line: 16
>>>         greeting @ first.kml, line: 43
>>>         vdl:mainp @ first.kml, line: 42
>>>         mainp @ vdl.k, line: 148
>>>         vdl:mains @ first.kml, line: 41
>>>         vdl:mains @ first.kml, line: 41
>>>         rlog:restartlog @ first.kml, line: 39
>>>         kernel:project @ first.kml, line: 2
>>>         first-20071025-1044-zo4kzfjg
>>>
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080189) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080189) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080189) setting 
>>> status to Failed Exception in getFile
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080192) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080192) setting 
>>> status to Completed
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080195) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080195) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1193327080195) setting 
>>> status to Failed Exception in getFile
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080198) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080198) setting 
>>> status to Completed
>>> THREAD_ASSOCIATION jobid=echo-2gj1k5ji thread=0 host=UCANL
>>> START jobid=echo-2gj1k5ji host=UCANL - Initializing directory structure
>>> END jobid=echo-2gj1k5ji - Done initializing directory structure
>>> START jobid=echo-2gj1k5ji - Staging in files
>>> END jobid=echo-2gj1k5ji - Staging in finished
>>> JOB_START jobid=echo-2gj1k5ji tr=echo arguments=[Hello, world!] 
>>> tmpdir=first-20071025-1044-zo4kzfjg/echo-2gj1k5ji host=UCANL
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080202) setting 
>>> status to Submitted
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080202) setting 
>>> status to Active
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1193327080202) setting 
>>> status to Completed
>>> START jobid=echo-2gj1k5ji
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080204) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1193327080204) setting 
>>> status to Completed
>>> SUCCESS jobid=echo-2gj1k5ji - Success file found
>>> JOB_END jobid=echo-2gj1k5ji
>>> START jobid=echo-2gj1k5ji - Staging out files
>>> FILE_STAGE_OUT_START srcname=hello.txt srcdir=first-20071025-1044- 
>>> zo4kzfjg/shared/ srchost=UCANL destdir= desthost=localhost 
>>> provider=file
>>> Task(type=FILE_OPERATION, identity=urn:0-1-1193327080206) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1-1193327080206) setting 
>>> status to Completed
>>> Task(type=FILE_TRANSFER, identity=urn:0-1-1193327080209) setting 
>>> status to Submitted
>>> Task(type=FILE_TRANSFER, identity=urn:0-1-1193327080209) setting 
>>> status to Active
>>> Task(type=FILE_TRANSFER, identity=urn:0-1-1193327080209) setting 
>>> status to Completed
>>> FILE_STAGE_OUT_END srcname=hello.txt srcdir=first-20071025-1044- 
>>> zo4kzfjg/shared/ srchost=UCANL destdir= desthost=localhost 
>>> provider=file
>>> Task(type=FILE_OPERATION, identity=urn:0-1-1193327080213) setting 
>>> status to Active
>>> Task(type=FILE_OPERATION, identity=urn:0-1-1193327080213) setting 
>>> status to Completed
>>> END jobid=echo-2gj1k5ji - Staging out finished
>>> echo completed
>>> END_SUCCESS thread=0 tr=echo
>>> START cleanups=[[first-20071025-1044-zo4kzfjg, UCANL]]
>>> START dir=first-20071025-1044-zo4kzfjg host=UCANL
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1-1193327080216) setting 
>>> status to Submitted
>>> Task(type=JOB_SUBMISSION, identity=urn:0-1-1193327080216) setting 
>>> status to Completed
>>> END dir=first-20071025-1044-zo4kzfjg host=UCANL
>>> Swift finished - workflow had no errors
>>>
>>> _______________________________________________
>>> Swift-devel mailing list
>>> Swift-devel at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>>
>>
>>
>
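
The quoted debug log is long, and the interesting part is easy to miss: the echo job is attempted three times, the first two attempts end with NO_STATUS_FILE, and the third finds its success file. Below is a minimal Python sketch for pulling that per-attempt summary out of such a log. It assumes only the event names and the jobid= field visible in the log above; the function name summarize_attempts and the outcome labels are made up for illustration and are not part of Swift.

```python
import re

# The jobid= field and the event names (NO_STATUS_FILE, SUCCESS) are taken
# from the debug output quoted above. Everything else here is illustrative.
JOBID = re.compile(r"jobid=(\S+)")


def summarize_attempts(log_text):
    """Return (jobid, outcome) pairs in order of first appearance."""
    outcomes = {}
    for raw in log_text.splitlines():
        # Drop mail quoting such as ">>> " before looking at the event name.
        line = raw.lstrip("> ").strip()
        match = JOBID.search(line)
        if not match:
            continue
        jobid = match.group(1)
        outcomes.setdefault(jobid, "unknown")
        if line.startswith("NO_STATUS_FILE"):
            outcomes[jobid] = "no status file"
        elif line.startswith("SUCCESS"):
            outcomes[jobid] = "success"
    return list(outcomes.items())


# A few lines copied from the quoted log, one per attempt.
sample = """\
>>> THREAD_ASSOCIATION jobid=echo-0gj1k5ji thread=0 host=UCANL
>>> NO_STATUS_FILE jobid=echo-0gj1k5ji - Both status files are missing
>>> THREAD_ASSOCIATION jobid=echo-1gj1k5ji thread=0 host=UCANL
>>> NO_STATUS_FILE jobid=echo-1gj1k5ji - Both status files are missing
>>> THREAD_ASSOCIATION jobid=echo-2gj1k5ji thread=0 host=UCANL
>>> SUCCESS jobid=echo-2gj1k5ji - Success file found
"""

for jobid, outcome in summarize_attempts(sample):
    print(jobid, outcome)
```

Run against the full log, this only shows which attempts failed and which succeeded. It does not explain why the status files were missing on the first two attempts.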

-- 

   Ian Foster, Director, Computation Institute
Argonne National Laboratory & University of Chicago
Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
      Globus Alliance: www.globus.org.



