[Swift-devel] feature request

Zhao Zhang zhaozhang at uchicago.edu
Tue Apr 21 11:22:04 CDT 2009


Also, I found my grid proxy expired.

[zzhang at communicado ~]$ grid-proxy-init
Your identity: /DC=org/DC=doegrids/OU=People/CN=Zhao Zhang 385894
Enter GRID pass phrase for this identity:
Creating proxy ........................................... Done


ERROR: Your certificate has expired: Thu Feb 26 12:47:51 2009
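
A check like the one below, run before starting a test, might catch this earlier. It is only a sketch: it assumes the Globus `grid-proxy-info` tool is on the PATH and that `-timeleft` prints the remaining proxy lifetime in seconds. The `proxy_status` helper and the one-hour threshold are made up for illustration. It reports on the proxy only, so an expired underlying certificate like the one above still has to be renewed separately.

```shell
#!/bin/sh
# Hypothetical helper: classify a remaining proxy lifetime (in seconds)
# against a minimum lifetime we want before starting a test run.
proxy_status() {
    _left="$1"
    _min="$2"
    # Treat empty or non-numeric input as "no usable proxy".
    case "$_left" in
        ''|*[!0-9-]*) _left=0 ;;
    esac
    if [ "$_left" -le 0 ]; then
        echo expired
    elif [ "$_left" -lt "$_min" ]; then
        echo short
    else
        echo ok
    fi
}

# Query the real tool only if it is installed on this machine.
if command -v grid-proxy-info >/dev/null 2>&1; then
    left=$(grid-proxy-info -timeleft 2>/dev/null) || left=0
    # 3600 seconds is an arbitrary threshold chosen for this sketch.
    echo "proxy: $(proxy_status "$left" 3600)"
fi
```

If the status is not "ok", run grid-proxy-init again before ./run-site.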




Michael Wilde wrote:
> [meant to cc this to swift-devel]
>
> Two separate questions here:
>
> 1) what to do:
>
> Test coasters on a set of agreed upon (and growing list of) sites:
>
> Start with these:
>
> localhost:
> TG: teraport, ucanl, mercury, sdsc, abe, queenbee, ranger
> OSG: red.unl.edu, some wisc.edu site (pick 3 for now)
> Local: HNL cluster (gwynn)
>
> That's a good list to start for 0.9.  We can easily grow this once you
> get that far.
>
> If I missed a high prio target please let me know.
>
> Others in the wings: Jazz, MCS kBT cluster, TG bigred and purdue condor;
> many more OSG sites.
>
>
> 2) what failed
>
> I didn't look in your log yet, but be aware that you need a proxy to run
> coasters even on localhost (for secure messaging with GSI)
>
> Best to start with localhost testing on communicado, not surveyor
>
>
> On 4/21/09 10:54 AM, Zhao Zhang wrote:
>> Dear All
>>
>> I am trying to run Swift on the local site. I checked out the latest 
>> Swift code and built it. I started the test as shown below, and it failed.
>> I am not clear about my test goals here: am I making sure coasters 
>> work on all the sites we have, or am I testing whether the existing
>> coaster setup can be used by any user?
>>
>> zhao
>>
>> zzhang at login6.surveyor:/home/falkon/swift_coaster/cog/modules/swift/tests/sites> 
>> ./run-site coaster/coaster-local.xml
>> testing site configuration: coaster/coaster-local.xml
>> Removing files from previous runs
>> Running test 061-cattwo at Tue Apr 21 10:51:21 CDT 2009
>> Swift svn swift-r2865 cog-r2388
>>
>> RunID: 20090421-1051-er55uyr8
>> Progress:
>> Multiple entries found for cat on localhost. Using the first one
>> Progress:  Submitted:1
>> Failed to transfer wrapper log from 
>> 061-cattwo-20090421-1051-er55uyr8/info/i on localhost
>> Progress:  Submitted:1
>> Failed to transfer wrapper log from 
>> 061-cattwo-20090421-1051-er55uyr8/info/k on localhost
>> Progress:  Submitted:1
>> Failed to transfer wrapper log from 
>> 061-cattwo-20090421-1051-er55uyr8/info/m on localhost
>> Execution failed:
>>        Exception in cat:
>> Arguments: [061-cattwo.1.in, 061-cattwo.2.in]
>> Host: localhost
>> Directory: 061-cattwo-20090421-1051-er55uyr8/jobs/m/cat-mt8ngp9j
>> stderr.txt:
>>
>> stdout.txt:
>>
>> ----
>>
>> Caused by:
>>        Could not submit job
>> Caused by:
>>        Could not start coaster service
>> Caused by:
>>        Task ended before registration was received.
>> STDOUT:
>> STDERR: which: no gmd5sum in 
>> (/home/falkon/swift_coaster/cog/modules/swift/dist/swift-svn/bin:/home/zzhang/ruby-1.8.7-p72/bin/bin:/home/zzhang/chirp/bin:/home/zzhang/gridftp/bin:/home/zzhang/gridftp/sbin:/home/zzhang/xar/bin:/home/falkon/swift_scratch/cog/modules/swift/dist/swift-svn/bin:/home/falkon/falkon/bin:/home/falkon/falkon/service:/home/falkon/falkon/worker:/home/falkon/falkon/client:/home/falkon/falkon/monitor:/home/falkon/falkon/webserver:/home/falkon/falkon/ploticus/src:/home/falkon/falkon/apache-ant-1.7.0:/home/falkon/falkon/apache-ant-1.7.0/bin:/usr/lib/jvm/java:/usr/lib/jvm/java/bin:/home/falkon/falkon/container:/home/falkon/falkon/container/bin:/bin:/usr/sbin:/etc:/usr/X11R6/bin:/usr/bin:/sbin:/usr/local/bin:/bgsys/drivers/ppcfloor/bin:/bgsys/drivers/ppcfloor/comm/bin:/dbhome/bgpdb2c/sqllib/lib:/opt/ibmcmp/vac/bg/9.0/bin:/opt/ibmcmp/vacpp/bg/9.0/bin:/opt/ibmcmp/xlf/bg/11.1/bin:/software/common/apps/mpiscripts:/software/common/apps/projects-list/bin:/software/common/adm/softenv/bin:/home/zzhang/bin/linux-sles10-ppc64:/home/zzhang/bin:.:/software/common/apps/misc-scripts:/bgsys/drivers/ppcfloor/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin)
>>
>> Caused by:
>>        Job failed with an exit code of 1
>> Cleaning up...
>> Done
>> SWIFT RETURN CODE NON-ZERO - test 061-cattwo
>>
>>
>> Michael Wilde wrote:
>>> Zhao, based on prior discussion of coaster testing on this list, we 
>>> all agree it's high priority.
>>>
>>> Can you set aside everything else and focus on it, and let us know 
>>> at the end of the day if you've been able to run the existing site 
>>> tests as a starting point?
>>>
>>> Then you need to test running coasters manually, then (I assume, Ben 
>>> and Mihael) create additional site tests that exercise coasters?
>>>
>>> Mihael, you should provide a list of coaster-specific aspects to test.
>>> The job-time management aspects come to mind, as do coaster cleanup 
>>> and termination.
>>>
>>>
>>> On 4/21/09 3:15 AM, Ben Clifford wrote:
>>>> At the weekend, I put out a release candidate for Swift 0.9 with a 
>>>> 7-day test period.
>>>>
>>>> Performing coaster testing on that release candidate using the 
>>>> existing Swift site tests, and with additional sites, is something 
>>>> that would be useful to the release process and has a clearly 
>>>> defined short timescale - it is something that should happen before 
>>>> the weekend.
>>>>
>>>> On Thu, 16 Apr 2009, Ben Clifford wrote:
>>>>
>>>>> On Wed, 15 Apr 2009, Michael Wilde wrote:
>>>>>
>>>>>> Zhao, based on Ben's suggestion and our earlier discussion, can 
>>>>>> you locate and
>>>>>> try the Swift test suite, and then look in detail at, and try, 
>>>>>> the per-site
>>>>>> suite.
>>>>>>
>>>>>> I'm looking for an assessment of what it takes to create a branch 
>>>>>> of the suite
>>>>>> to test coasters in the same manner that the sites are tested.
>>>>> Put swift on your path and get a proxy.
>>>>>
>>>>> Then:
>>>>>  cd tests/sites
>>>>>  ./run-all coaster/
>>>>>
>>>>> This will start running the tests in the coaster/ subdirectory.
>>>>>
>>>>> Each of the files in there is a site definition. One site test is 
>>>>> run for each of those. When/if all of them have exited, the list 
>>>>> of sites that worked and the list of sites that did not work is 
>>>>> output.
>>>>>
>>>>> If you want to run a single site, say ./run-site 
>>>>> coaster/coaster-local.xml
>>>>> (for example)
>>>>>
>>>>> To add new sites, put a file in the coaster/ subdirectory 
>>>>> containing an appropriate site definition.
>>>>>
>>>>> In order to make site tests that many people can run, I usually 
>>>>> make a remote work directory and chmod a+rwxt on that remote work 
>>>>> directory so no matter who runs, Swift will not encounter 
>>>>> permission problems.
>>>>>
>>>>>> And what it takes to run that on a regular basis to find problems 
>>>>>> in sites and
>>>>>> Swift *before* our users find them.
>>>>> The main problem with the site tests is that they generally need 
>>>>> credentials, and it's unclear what the right way to handle long 
>>>>> term testing credentials is.
>>>>>
>>>>>
>>>
>>
>
>
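
The shared work directory convention Ben describes in the quoted thread (a work directory with chmod a+rwxt, so that whoever runs the site tests does not hit permission problems) could be set up roughly as follows. This is a sketch only: the `make_shared_workdir` helper and the example path are hypothetical, and on a real site the commands would be run on the remote machine that hosts the work directory.

```shell
#!/bin/sh
# Hypothetical helper: create a work directory that any user can write
# into. The sticky bit (t) means users can only delete their own files.
make_shared_workdir() {
    _dir="$1"
    mkdir -p "$_dir" && chmod a+rwxt "$_dir"
}

# Example (the path is an assumption, not taken from the thread):
#   make_shared_workdir /tmp/swift-sitetest-work
#   ls -ld /tmp/swift-sitetest-work
```

The site definition file in coaster/ would then point its work directory at that path.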
