[Swift-devel] feature request

Michael Wilde wilde at mcs.anl.gov
Wed Apr 29 11:05:21 CDT 2009


Separately, to test on both OSG and TG, you need to have a good 
understanding of where data and apps live, and how various directories 
are intended to be used.

OSG has a set of generic storage locations (tmp, app, data, etc), and TG 
has basically home and scratch for each user, with some provision for 
group directories.

We should document the relationships between what the grids specify, how 
Swift users should use the various dirs, and how the tests use those 
dirs (the tests should exercise exactly what we tell the users to do).

The site tools and the tests need to be (and I think already are) 
cognizant of this. But there are a few variations possible, like putting 
apps under $APP vs $DATA, how writability depends on your VO, and 
how and when subdirectories should be used.

We should locate the relevant Grid doc pages on these topics, and link 
to them from the "local details" section of the Swift user guide.

This isn't very complicated once the following are documented:

- each grid's conventions for directories
- permissions and multi-user issues
- transience and space management issues
- group sharing of various directories

In the current simple Swift model where permanent data lives on the 
submit host or a specified server and the workdirectory is always 
transient, this is not much of an issue.

In the future, Swift might support the caching of files on sites between 
workflows, and then we'll need a somewhat more complex model.

What I'm saying here is perhaps not precise enough, but I believe it's 
important enough, and confusing enough to most users, that we should 
document it clearly and provide pointers and examples.

- Mike



On 4/29/09 10:52 AM, Michael Wilde wrote:
> The message:
> 
> Could not find any valid host for task "Task(type=UNKNOWN, 
> identity=urn:cog-1241018496704)" with constraints 
> {filenames=[Ljava.lang.String;@7f38f3d1, trfqn=cat, 
> filecache=org.griphyn.vdl.karajan.lib.cache.CacheMapAdapter@740f5f97, 
> tr=cat}
> 
> means there was no tc.data entry for cat on any site.
> 
> So I suspect this is due to your pool handles not matching your tc.data 
> site names.
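
A quick way to check for this kind of mismatch is to compare the pool 
handles in a sites.xml against the site names that have a tc.data entry 
for the transformation. The following is only a minimal sketch: the 
tc.data line is hypothetical (the real tc.data from this thread is not 
shown), it assumes the whitespace-separated layout with the site name in 
the first column and the transformation in the second, and the helper 
names are made up for illustration. Note that the two sites.xml files 
quoted below differ in their pool handle ("renci-engage" vs 
"RENCI-Engagement") as well as in <workdirectory>.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the second (generated) sites.xml from this thread.
SITES_XML = """<config>
  <pool handle="RENCI-Engagement">
    <gridftp url="gsiftp://belhaven-1.renci.org/" />
    <execution provider="condor" />
    <workdirectory>/nfs/osg-data/engage/tmp/RENCI-Engagement</workdirectory>
  </pool>
</config>"""

# Hypothetical tc.data content: site, transformation, path, status,
# platform, profiles. Only the first two columns are used here.
TC_DATA = """# sitename  transformation  path  status  platform  profiles
renci-engage  cat  /bin/cat  INSTALLED  INTEL32::LINUX  null
"""


def pool_handles(sites_xml_text):
    """Return the handle of every <pool> element, ignoring XML namespaces."""
    root = ET.fromstring(sites_xml_text)
    handles = []
    for element in root.iter():
        # Strip a possible "{namespace}" prefix before comparing tag names.
        if isinstance(element.tag, str) and element.tag.rsplit("}", 1)[-1] == "pool":
            handles.append(element.get("handle"))
    return handles


def tc_sites_for(tc_data_text, transformation):
    """Return the set of site names with a tc.data entry for the transformation."""
    sites = set()
    for line in tc_data_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[1] == transformation:
            sites.add(fields[0])
    return sites


def unmatched_handles(sites_xml_text, tc_data_text, transformation):
    """Return pool handles that have no tc.data entry for the transformation."""
    known = tc_sites_for(tc_data_text, transformation)
    return [h for h in pool_handles(sites_xml_text) if h not in known]


print(unmatched_handles(SITES_XML, TC_DATA, "cat"))
```

With this sample data the generated handle is reported as unmatched, 
because the comparison is case-sensitive and exact. Either renaming the 
pool handle or adding a tc.data line for the new site name would make 
the two agree.
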
> 
> On 4/29/09 10:35 AM, Zhao Zhang wrote:
>> Hi, again.
>>
>> Now I have two sites.xml for renci-engagement. The first one is 
>> working, while the second is not. The second sites.xml is generated by 
>> the script.
>> Since the only difference between the first and the second is the 
>> <workdirectory> value, I am assuming the error is caused by the 
>> <workdirectory>.
>> If this is true, how could I find all valid <workdirectory> for all 
>> sites on OSG?
>>
>> zhao
>>
>> First:
>> <config>
>>   <pool handle="renci-engage">
>>      <gridftp  url="gsiftp://belhaven-1.renci.org/" />
>>      <execution provider="condor" />
>>      <workdirectory >/nfs/home/osgedu/benc</workdirectory>
>>      <profile namespace="globus" key="jobType">grid</profile>
>>      <profile namespace="globus" key="gridResource">gt2 
>> belhaven-1.renci.org/jobmanager-fork</profile>
>>   </pool>
>> </config>
>>
>> Second:
>> <config>
>>  <!-- RENCI-Engagement -->
>>  <pool handle="RENCI-Engagement" >
>>    <gridftp  url="gsiftp://belhaven-1.renci.org/" />
>>    <execution  provider="condor" />
>>    <workdirectory 
>>  >/nfs/osg-data/engage/tmp/RENCI-Engagement</workdirectory>
>>    <profile namespace="globus" key="jobType">grid</profile>
>>    <profile namespace="globus" key="gridResource">gt2 
>> belhaven-1.renci.org/jobmanager-fork</profile>
>>  </pool>
>> </config>
>>
>> The error message is:
>> [zzhang@tp-grid1 sites]$ ./run-site condor-g/RENCI-Engagement.xml
>> testing site configuration: condor-g/RENCI-Engagement.xml
>> Removing files from previous runs
>> Running test 061-cattwo at Wed Apr 29 10:21:31 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090429-1021-8ayf7v75
>> Progress:
>> Execution failed:
>>        Could not find any valid host for task "Task(type=UNKNOWN, 
>> identity=urn:cog-1241018496704)" with constraints 
>> {filenames=[Ljava.lang.String;@7f38f3d1, trfqn=cat, 
>> filecache=org.griphyn.vdl.karajan.lib.cache.CacheMapAdapter@740f5f97, 
>> tr=cat}
>> SWIFT RETURN CODE NON-ZERO - test 061-cattwo
>>
>>
>> Ben Clifford wrote:
>>> On Wed, 29 Apr 2009, Zhao Zhang wrote:
>>>
>>>  
>>>> I modified the existing swift-osg-ress-site-catalog and generated a 
>>>> sample sites.xml on the CI network:
>>>> /home/zzhang/swift_coaster/cog/modules/swift/tests/sites/condor-g/sample-sites.xml 
>>>>
>>>>
>>>> Could you help me check if there is any known error in there? thanks.
>>>>     
>>>
>>> I looked briefly and it seems ok
>>>
>>>   
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel


