[Swift-devel] feature request

Michael Wilde wilde at mcs.anl.gov
Fri Apr 24 14:30:49 CDT 2009


Great, Zhao.

What's next?

Testing of the new condor provider features is important.

For failures, it would be good if you could go a step further and at least 
extract the errors, so that developers can tell you whether each one is an 
error in the testing (which you should fix) or an error in the code (which 
they should fix). You can help by getting them the information they need 
and by identifying sooner which errors need more immediate attention.

Can you suggest a methodical approach to testing, in terms of:

- which tests you need and plan to run, and on which systems
- how the reports are organized
- how errors are listed and diagnosed

I want this to be a more interactive process between you and the 
developers, not just "it broke, see dir X".
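
One possible starting point for lifting errors out of the run-site output is sketched below. It assumes only the log markers that appear in the logs quoted in this thread ("Running test ...", "Test passed at ...", "SWIFT RETURN CODE NON-ZERO", "Caused by:"). The names Outcome, summarize and report are hypothetical and are not part of the Swift test suite, and the sample text is abridged from the quoted logs.

```python
import re
from dataclasses import dataclass, field

# Markers taken from the run-site logs quoted in this thread.
RUNNING = re.compile(r"^Running test (\S+) at ")
PASSED = re.compile(r"^Test passed at ")
FAILED = re.compile(r"^SWIFT RETURN CODE NON-ZERO - test (\S+)")


@dataclass
class Outcome:
    """Hypothetical per-test record: name, status, and any 'Caused by' text."""
    name: str
    status: str = "unknown"
    cause: list = field(default_factory=list)


def summarize(log_text):
    """Split a run-site log into one Outcome per 'Running test' section."""
    results = []
    current = None
    in_cause = False
    for raw in log_text.splitlines():
        line = raw.strip()
        started = RUNNING.match(line)
        if started:
            current = Outcome(name=started.group(1))
            results.append(current)
            in_cause = False
            continue
        if current is None:
            continue  # preamble before the first test
        if PASSED.match(line):
            current.status = "passed"
            in_cause = False
        elif FAILED.match(line):
            current.status = "failed"
            in_cause = False
        elif line == "Caused by:":
            in_cause = True
        elif in_cause:
            # Assumption: the cause text ends where cleanup begins.
            if line.startswith("Cleaning up"):
                in_cause = False
            elif line:
                current.cause.append(line)
    return results


def report(results):
    """One line per test: status, name, and the cause if one was found."""
    lines = []
    for r in results:
        detail = " ".join(r.cause)
        suffix = f" :: {detail}" if detail else ""
        lines.append(f"{r.status.upper():7} {r.name}{suffix}")
    return "\n".join(lines)


# Abridged lines from the logs in this thread, used only as sample input.
SAMPLE = """\
Running test 061-cattwo at Thu Apr 23 11:12:09 CDT 2009
Progress:  Failed:1
Execution failed:
Caused by:
       Job cannot be run with the given max walltime worker constraint (task: 600, maxwalltime: 300s)
Cleaning up...
SWIFT RETURN CODE NON-ZERO - test 061-cattwo
Running test 103-quote at Thu Apr 23 12:04:47 CDT 2009
Final status:  Finished successfully:1
Test passed at Thu Apr 23 12:05:05 CDT 2009
"""

if __name__ == "__main__":
    print(report(summarize(SAMPLE)))
```

A per-site summary like this, posted alongside the path to the full logs, would let a developer see at a glance which tests failed and why, without opening the log directory first.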

Thanks,

Mike


On 4/24/09 12:52 PM, Zhao Zhang wrote:
> Hi, All
> 
> Now that I am mapped on gwynn, I reran the tests. The results are:
> 
> All language behaviour tests passed
> These sites failed: coaster/tgncsa-hg-coaster-pbs-gram2.xml 
> coaster/tgncsa-hg-coaster-pbs-gram4.xml
> These sites worked: coaster/coaster-local.xml 
> coaster/gwynn-coaster-gram2-gram2-condor.xml 
> coaster/gwynn-coaster-gram2-gram2-fork.xml 
> coaster/renci-engage-coaster.xml coaster/teraport-gt2-gt2-pbs.xml 
> coaster/uj-pbs-gram2.xml
> 
> Logs can be found at 
> /home/zzhang/swift_coaster/cog/modules/swift/tests/sites/log_all on the CI 
> network.
> 
> zhao
> 
> Zhao Zhang wrote:
>> Hi, again
>>
>> The test on teraport was successful; here is the log:
>>
>> zhao
>>
>> testing site configuration: coaster/teraport-gt2-gt2-pbs.xml
>> Removing files from previous runs
>> Running test 061-cattwo at Thu Apr 23 11:27:19 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1127-aluxx4m9
>> Progress:
>> Progress:  Stage in:1
>> Progress:  Submitted:1
>> Progress:  Submitted:1
>> ...
>> Progress:  Active:1
>> Progress:  Finished successfully:1
>> Final status:  Finished successfully:1
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:57080
>> Got channel MetaChannel: 2129305 -> GSSSChannel-null(1)
>> - Done
>> expecting 061-cattwo.out.expected
>> checking 061-cattwo.out.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 11:57:39 CDT 2009
>> ----------===========================----------
>> Running test 130-fmri at Thu Apr 23 11:57:39 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1157-r8sarc77
>> Progress:
>> Progress:  Selecting site:2  Initializing site shared directory:1  
>> Stage in:1
>> Progress:  Selecting site:2  Stage in:1  Submitting:1
>> Progress:  Selecting site:2  Submitting:1  Submitted:1
>> ...
>> Progress:  Selecting site:2  Submitted:2
>> Progress:  Selecting site:2  Submitted:2
>> Progress:  Selecting site:2  Submitted:2
>> Progress:  Selecting site:2  Submitted:1  Active:1
>> Progress:  Selecting site:2  Active:1  Stage out:1
>> Progress:  Selecting site:1  Stage in:1  Stage out:1  Finished 
>> successfully:1
>> Progress:  Submitted:1  Stage out:1  Finished successfully:2
>> Progress:  Active:1  Finished successfully:4
>> Progress:  Submitting:2  Submitted:1  Finished successfully:5
>> Progress:  Active:2  Stage out:1  Finished successfully:5
>> Progress:  Submitted:1  Stage out:2  Finished successfully:8
>> Final status:  Finished successfully:11
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:52773
>> Got channel MetaChannel: 28761475 -> GSSSChannel-null(1)
>> - Done
>> expecting 130-fmri.0000.jpeg.expected 130-fmri.0001.jpeg.expected 
>> 130-fmri.0002.jpeg.expected
>> checking 130-fmri.0000.jpeg.expected
>> Skipping exception test due to test configuration
>> checking 130-fmri.0001.jpeg.expected
>> Skipping exception test due to test configuration
>> checking 130-fmri.0002.jpeg.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:04:47 CDT 2009
>> ----------===========================----------
>> Running test 103-quote at Thu Apr 23 12:04:47 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1204-sjzpkfd3
>> Progress:
>> Progress:  Stage in:1
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Finished successfully:1
>> Final status:  Finished successfully:1
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:40813
>> Got channel MetaChannel: 28500325 -> GSSSChannel-null(1)
>> - Done
>> expecting 103-quote.out.expected
>> checking 103-quote.out.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:05:05 CDT 2009
>> ----------===========================----------
>> Running test 1032-singlequote at Thu Apr 23 12:05:05 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1205-x2d55af3
>> Progress:
>> Progress:  Stage in:1
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Finished successfully:1
>> Final status:  Finished successfully:1
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:44126
>> Got channel MetaChannel: 18100302 -> GSSSChannel-null(1)
>> - Done
>> expecting 1032-singlequote.out.expected
>> checking 1032-singlequote.out.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:05:22 CDT 2009
>> ----------===========================----------
>> Running test 1031-quote at Thu Apr 23 12:05:22 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1205-5aa1ko4e
>> Progress:
>> Progress:  Stage in:1
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Final status:  Finished successfully:1
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:43759
>> Got channel MetaChannel: 19002607 -> GSSSChannel-null(1)
>> - Done
>> expecting 1031-quote.*.expected
>> No expected output files specified for this test case - not checking 
>> output.
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:05:38 CDT 2009
>> ----------===========================----------
>> Running test 1033-singlequote at Thu Apr 23 12:05:38 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1205-8nopyujc
>> Progress:
>> Progress:  Stage in:1
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Finished successfully:1
>> Final status:  Finished successfully:1
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:39924
>> Got channel MetaChannel: 31196317 -> GSSSChannel-null(1)
>> - Done
>> expecting 1033-singlequote.out.expected
>> checking 1033-singlequote.out.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:05:56 CDT 2009
>> ----------===========================----------
>> Running test 141-space-in-filename at Thu Apr 23 12:05:56 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1205-aalqz1c4
>> Progress:
>> Progress:  Stage in:1
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Finished successfully:1
>> Final status:  Finished successfully:1
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:60177
>> Got channel MetaChannel: 4728458 -> GSSSChannel-null(1)
>> - Done
>> expecting 141-space-in-filename.space here.out.expected
>> checking 141-space-in-filename.space here.out.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:06:15 CDT 2009
>> ----------===========================----------
>> Running test 142-space-and-quotes at Thu Apr 23 12:06:15 CDT 2009
>> Swift 0.9rc2 swift-r2860 cog-r2388
>>
>> RunID: 20090423-1206-8617gag1
>> Progress:
>> Progress:  Selecting site:2  Initializing site shared directory:1  
>> Stage in:1
>> Progress:  Selecting site:2  Submitting:1  Submitted:1
>> Progress:  Selecting site:2  Submitted:1  Active:1
>> Progress:  Selecting site:2  Active:1  Finished successfully:1
>> Progress:  Stage out:1  Finished successfully:3
>> Final status:  Finished successfully:4
>> Cleaning up...
>> Shutting down service at https://128.135.125.118:57945
>> Got channel MetaChannel: 16387060 -> GSSSChannel-null(1)
>> - Done
>> expecting 142-space-and-quotes.2" space ".out.expected 
>> 142-space-and-quotes.3' space '.out.expected 
>> 142-space-and-quotes.out.expected 142-space-and-quotes. space 
>> .out.expected
>> checking 142-space-and-quotes.2" space ".out.expected
>> Skipping exception test due to test configuration
>> checking 142-space-and-quotes.3' space '.out.expected
>> Skipping exception test due to test configuration
>> checking 142-space-and-quotes.out.expected
>> Skipping exception test due to test configuration
>> checking 142-space-and-quotes. space .out.expected
>> Skipping exception test due to test configuration
>> Test passed at Thu Apr 23 12:06:35 CDT 2009
>> ----------===========================----------
>> All language behaviour tests passed
>>
>>
>>
>> Zhao Zhang wrote:
>>> Hi, Ben
>>>
>>> Ben Clifford wrote:
>>>> On Thu, 23 Apr 2009, Zhao Zhang wrote:
>>>>
>>>>  
>>>>> Error 1: This is related to CI network setting,
>>>>> /etc/grid-security/hostcert.pem. Could anyone help on this? Who 
>>>>> should I
>>>>> contact?
>>>>>     
>>>>
>>>> fletch is broken. But try changing those sites files to use 
>>>> gwynn.bsd.uchicago.edu instead.
>>>>
>>>>  
>>>>> Error 2: My certificate is not enabled on teraport. As Mike and I 
>>>>> discussed last night, the "certificate revocation list" on the CI 
>>>>> network is misconfigured.
>>>>>     
>>>>
>>>> This looks more like a permissions problem - the directory being 
>>>> used in the sites.xml file for that test does not exist and you do 
>>>> not have permission to create it.
>>>>
>>>> In r2874 I have changed tests/sites/coaster/teraport-gt2-gt2-pbs.xml 
>>>> to use a different path that should work for you now.
>>>>   
>>> I tried this out and it failed, so I increased the wall-time to 15 
>>> minutes in the coaster/teraport-gt2-gt2-pbs.xml file.
>>> I am waiting on the result now.
>>>
>>> zhao
>>>
>>> [zzhang at communicado sites]$ ./run-site coaster/teraport-gt2-gt2-pbs.xml
>>> testing site configuration: coaster/teraport-gt2-gt2-pbs.xml
>>> Removing files from previous runs
>>> Running test 061-cattwo at Thu Apr 23 11:12:09 CDT 2009
>>> Swift 0.9rc2 swift-r2860 cog-r2388
>>>
>>> RunID: 20090423-1112-6jqlxfcf
>>> Progress:
>>> Progress:  Stage in:1
>>> Progress:  Submitted:1
>>> Failed to transfer wrapper log from 
>>> 061-cattwo-20090423-1112-6jqlxfcf/info/q on teraport
>>> Failed to transfer wrapper log from 
>>> 061-cattwo-20090423-1112-6jqlxfcf/info/s on teraport
>>> Progress:  Stage in:1
>>> Failed to transfer wrapper log from 
>>> 061-cattwo-20090423-1112-6jqlxfcf/info/u on teraport
>>> Progress:  Failed:1
>>> Execution failed:
>>>        Exception in cat:
>>> Arguments: [061-cattwo.1.in, 061-cattwo.2.in]
>>> Host: teraport
>>> Directory: 061-cattwo-20090423-1112-6jqlxfcf/jobs/u/cat-umlmrs9j
>>> stderr.txt:
>>>
>>> stdout.txt:
>>>
>>> ----
>>>
>>> Caused by:
>>>        Job cannot be run with the given max walltime worker 
>>> constraint (task: 600, maxwalltime: 300s)
>>> Cleaning up...
>>> Shutting down service at https://128.135.125.118:58204
>>> Got channel MetaChannel: 1297642 -> GSSSChannel-null(1)
>>> - Done
>>> SWIFT RETURN CODE NON-ZERO - test 061-cattwo
>>>
>>>>  
>>>>> Error 3 & Error 4: I am not active on the tgncsa site. Mike said he 
>>>>> needed to add
>>>>> me to another group.
>>>>>     
>>>>
>>>> yes.
>>>>
>>>> Do you have the list from the end of your test run about which sites 
>>>> worked and which did not?
>>>>
>>>>   
>>> _______________________________________________
>>> Swift-devel mailing list
>>> Swift-devel at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>>
