[Swift-devel] feature request
Michael Wilde
wilde at mcs.anl.gov
Tue Apr 21 11:15:07 CDT 2009
[meant to cc this to swift-devel]
Two separate questions here:
1) what to do:
Test coasters on an agreed-upon (and growing) list of sites:
Start with these:
localhost:
TG: teraport, ucanl, mercury, sdsc, abe, queenbee, ranger
OSG: red.unl.edu, some wisc.edu site (pick 3 for now)
Local: HNL cluster (gwynn)
That's a good list to start with for 0.9. We can easily grow it once you
get that far.
If I missed a high-priority target, please let me know.
Others in the wings: Jazz, MCS kBT cluster, TG bigred and purdue condor;
many more OSG sites.
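A minimal sketch of working through a list like this one site at a time, assuming one site definition file per site and a per-site test command that exits non-zero on failure. The exit behaviour of ./run-site is an assumption here, and RUN_SITE and LOG_DIR are hypothetical override names used only for illustration:

```shell
#!/bin/sh
# Sketch only: run a per-site test command over a set of site definition
# files and summarise which sites worked and which did not.
# Assumption: the runner exits non-zero when the site test fails.

run_sites() {
    runner=${RUN_SITE:-./run-site}   # hypothetical override; default is the script in tests/sites
    logdir=${LOG_DIR:-.}             # hypothetical override; where per-site output is kept
    passed=""
    failed=""
    for site in "$@"; do
        # Keep each site's output in its own log so failures can be read later.
        if "$runner" "$site" >"$logdir/$(basename "$site").log" 2>&1; then
            passed="$passed $site"
        else
            failed="$failed $site"
        fi
    done
    echo "worked:$passed"
    echo "failed:$failed"
    # Succeed only if no site failed.
    [ -z "$failed" ]
}
```

From tests/sites this could be called as `run_sites coaster/coaster-local.xml`, adding further site files as they are written.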
2) what failed
I haven't looked at your log yet, but be aware that you need a proxy to run
coasters even on localhost (for secure messaging with GSI).
Best to start with localhost testing on communicado, not surveyor.
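A quick pre-flight check along these lines may save a failed run. This is only a sketch, assuming the Globus client tool grid-proxy-info is installed and that its -timeleft option prints the remaining proxy lifetime in seconds. PROXY_INFO is a hypothetical override, not a Swift or Globus setting:

```shell
#!/bin/sh
# Sketch only: check for a usable proxy before starting a coaster run.

# proxy_ok MIN_SECONDS
# Returns 0 if a proxy exists with at least MIN_SECONDS left,
# 2 if the proxy tool cannot be found, and 1 otherwise.
proxy_ok() {
    min_left=$1
    info_cmd=${PROXY_INFO:-grid-proxy-info}   # hypothetical override for testing
    command -v "$info_cmd" >/dev/null 2>&1 || return 2
    left=$("$info_cmd" -timeleft 2>/dev/null) || return 1
    # A non-numeric answer is treated as "no usable proxy".
    [ "${left:-0}" -ge "$min_left" ] 2>/dev/null
}

rc=0
proxy_ok 3600 || rc=$?
case $rc in
    0) echo "proxy ok" ;;
    2) echo "grid-proxy-info not found on PATH" ;;
    *) echo "no proxy with an hour left; run grid-proxy-init first" ;;
esac
```

The one-hour threshold is arbitrary; pick whatever comfortably covers the length of the test run.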
On 4/21/09 10:54 AM, Zhao Zhang wrote:
> Dear All
>
> I am trying to run Swift on the local site. I checked out the latest Swift
> code and built it. I started the test as shown below, and it failed.
> I am not clear about my test goals here: am I making sure coasters are
> working on all the sites we have, or am I testing that the existing
> coasters can be used by any user?
>
> zhao
>
> zzhang at login6.surveyor:/home/falkon/swift_coaster/cog/modules/swift/tests/sites>
> ./run-site coaster/coaster-local.xml
> testing site configuration: coaster/coaster-local.xml
> Removing files from previous runs
> Running test 061-cattwo at Tue Apr 21 10:51:21 CDT 2009
> Swift svn swift-r2865 cog-r2388
>
> RunID: 20090421-1051-er55uyr8
> Progress:
> Multiple entries found for cat on localhost. Using the first one
> Progress: Submitted:1
> Failed to transfer wrapper log from
> 061-cattwo-20090421-1051-er55uyr8/info/i on localhost
> Progress: Submitted:1
> Failed to transfer wrapper log from
> 061-cattwo-20090421-1051-er55uyr8/info/k on localhost
> Progress: Submitted:1
> Failed to transfer wrapper log from
> 061-cattwo-20090421-1051-er55uyr8/info/m on localhost
> Execution failed:
> Exception in cat:
> Arguments: [061-cattwo.1.in, 061-cattwo.2.in]
> Host: localhost
> Directory: 061-cattwo-20090421-1051-er55uyr8/jobs/m/cat-mt8ngp9j
> stderr.txt:
>
> stdout.txt:
>
> ----
>
> Caused by:
> Could not submit job
> Caused by:
> Could not start coaster service
> Caused by:
> Task ended before registration was received.
> STDOUT:
> STDERR: which: no gmd5sum in
> (/home/falkon/swift_coaster/cog/modules/swift/dist/swift-svn/bin:/home/zzhang/ruby-1.8.7-p72/bin/bin:/home/zzhang/chirp/bin:/home/zzhang/gridftp/bin:/home/zzhang/gridftp/sbin:/home/zzhang/xar/bin:/home/falkon/swift_scratch/cog/modules/swift/dist/swift-svn/bin:/home/falkon/falkon/bin:/home/falkon/falkon/service:/home/falkon/falkon/worker:/home/falkon/falkon/client:/home/falkon/falkon/monitor:/home/falkon/falkon/webserver:/home/falkon/falkon/ploticus/src:/home/falkon/falkon/apache-ant-1.7.0:/home/falkon/falkon/apache-ant-1.7.0/bin:/usr/lib/jvm/java:/usr/lib/jvm/java/bin:/home/falkon/falkon/container:/home/falkon/falkon/container/bin:/bin:/usr/sbin:/etc:/usr/X11R6/bin:/usr/bin:/sbin:/usr/local/bin:/bgsys/drivers/ppcfloor/bin:/bgsys/drivers/ppcfloor/comm/bin:/dbhome/bgpdb2c/sqllib/lib:/opt/ibmcmp/vac/bg/9.0/bin:/opt/ibmcmp/vacpp/bg/9.0/bin:/opt/ibmcmp/xlf/bg/11.1/bin:/software/common/apps/mpiscripts:/software/common/apps/projects-list/bin:/software/common/adm/softenv/bin:/home/zzhang/bin/linux-sles10-ppc64:/home/zzhang/bin:.:/software/common/apps/misc-scripts:/bgsys/drivers/ppcfloor/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin)
>
>
>
> Caused by:
> Job failed with an exit code of 1
> Cleaning up...
> Done
> SWIFT RETURN CODE NON-ZERO - test 061-cattwo
>
>
> Michael Wilde wrote:
>> Zhao, based on prior discussion of coaster testing on this list, we
>> all agree it's high priority.
>>
>> Can you set aside everything else and focus on it, and let us know at
>> the end of the day if you've been able to run the existing site tests
>> as a starting point?
>>
>> Then you need to test running coasters manually, then (I assume, Ben
>> and Mihael) create additional site tests that exercise coasters?
>>
>> Mihael, you should provide a list of coaster-specific aspects to test.
>> The job-time management aspects come to mind, as do coaster cleanup
>> and termination.
>>
>>
>> On 4/21/09 3:15 AM, Ben Clifford wrote:
>>> At the weekend, I put out a release candidate for Swift 0.9 with a
>>> 7-day test period.
>>>
>>> Performing coaster testing on that release candidate using the
>>> existing Swift site tests, and with additional sites, is something
>>> that would be useful to the release process and has a clearly defined
>>> short timescale - it is something that should happen before the weekend.
>>>
>>> On Thu, 16 Apr 2009, Ben Clifford wrote:
>>>
>>>> On Wed, 15 Apr 2009, Michael Wilde wrote:
>>>>
>>>>> Zhao, based on Ben's suggestion and our earlier discussion, can you
>>>>> locate and try the Swift test suite, and then look in detail at,
>>>>> and try, the per-site suite.
>>>>>
>>>>> I'm looking for an assessment of what it takes to create a branch of
>>>>> the suite to test coasters in the same manner that the sites are tested.
>>>> Put swift on your path and get a proxy.
>>>>
>>>> Then:
>>>> cd tests/sites
>>>> ./run-all coaster/
>>>>
>>>> This will start running the tests in the coaster/ subdirectory.
>>>>
>>>> Each of the files in there is a site definition. One site test is
>>>> run for each of those. When/if all of them have exited, the list of
>>>> sites that worked and the list of sites that did not work is output.
>>>>
>>>> If you want to run a single site, say ./run-site
>>>> coaster/coaster-local.xml
>>>> (for example)
>>>>
>>>> To add new sites, put a file in the coaster/ subdirectory
>>>> containing an appropriate site definition.
>>>>
>>>> In order to make site tests that many people can run, I usually make
>>>> a remote work directory and chmod a+rwxt on that remote work
>>>> directory so no matter who runs, Swift will not encounter permission
>>>> problems.
>>>>
>>>>> And what it takes to run that on a regular basis to find problems
>>>>> in sites and Swift *before* our users find them.
>>>> The main problem with the site tests is that they generally need
>>>> credentials, and it's unclear what the right way to handle long-term
>>>> testing credentials is.
>>>>
>>>>
>>
>