[Swift-devel] feature request

Mihael Hategan hategan at mcs.anl.gov
Fri Apr 24 15:08:27 CDT 2009


On Fri, 2009-04-24 at 14:30 -0500, Michael Wilde wrote:
> Great, Zhao.
> 
> What's next?
> 
> Testing of the new condor provider features is important.

The block allocator and coasters+condor-g are mutually useless.

So if we were otherwise planning to stabilize both at the same time,
it would probably be better to focus on just one.

> 
> For failures, it would be good if you could go further and at least 
> extract the errors, so that developers can tell you whether there 
> was an error in the testing (which you should fix) or an error in the 
> code (which they should fix). You can help by getting them the info they 
> need and by identifying faster which errors need more immediate attention.
> 
> Can you suggest a methodical approach to testing, in terms of:
> 
> - what tests you need to and plan to run on what systems?
> - how the reports are organized
> - how errors are listed and diagnosed
> 
> I want this to be a more interactive process between you and the 
> developers, not just "it broke, see dir X"
> 
> Thanks,
> 
> Mike
> 
> 
> On 4/24/09 12:52 PM, Zhao Zhang wrote:
> > Hi, All
> > 
> > Now that I am mapped on gwynn, I reran the tests. The results are:
> > 
> > All language behaviour tests passed
> > These sites failed: coaster/tgncsa-hg-coaster-pbs-gram2.xml 
> > coaster/tgncsa-hg-coaster-pbs-gram4.xml
> > These sites worked: coaster/coaster-local.xml 
> > coaster/gwynn-coaster-gram2-gram2-condor.xml 
> > coaster/gwynn-coaster-gram2-gram2-fork.xml 
> > coaster/renci-engage-coaster.xml coaster/teraport-gt2-gt2-pbs.xml 
> > coaster/uj-pbs-gram2.xml
> > 
> > Logs can be found at 
> > /home/zzhang/swift_coaster/cog/modules/swift/tests/sites/log_all on the CI 
> > network.
> > 
> > zhao
> > 
> > Zhao Zhang wrote:
> >> Hi, again
> >>
> >> the test on teraport is successful, here is the log
> >>
> >> zhao
> >>
> >> testing site configuration: coaster/teraport-gt2-gt2-pbs.xml
> >> Removing files from previous runs
> >> Running test 061-cattwo at Thu Apr 23 11:27:19 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1127-aluxx4m9
> >> Progress:
> >> Progress:  Stage in:1
> >> Progress:  Submitted:1
> >> Progress:  Submitted:1
> >> ...
> >> Progress:  Active:1
> >> Progress:  Finished successfully:1
> >> Final status:  Finished successfully:1
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:57080
> >> Got channel MetaChannel: 2129305 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 061-cattwo.out.expected
> >> checking 061-cattwo.out.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 11:57:39 CDT 2009
> >> ----------===========================----------
> >> Running test 130-fmri at Thu Apr 23 11:57:39 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1157-r8sarc77
> >> Progress:
> >> Progress:  Selecting site:2  Initializing site shared directory:1  
> >> Stage in:1
> >> Progress:  Selecting site:2  Stage in:1  Submitting:1
> >> Progress:  Selecting site:2  Submitting:1  Submitted:1
> >> ...
> >> Progress:  Selecting site:2  Submitted:2
> >> Progress:  Selecting site:2  Submitted:2
> >> Progress:  Selecting site:2  Submitted:2
> >> Progress:  Selecting site:2  Submitted:1  Active:1
> >> Progress:  Selecting site:2  Active:1  Stage out:1
> >> Progress:  Selecting site:1  Stage in:1  Stage out:1  Finished 
> >> successfully:1
> >> Progress:  Submitted:1  Stage out:1  Finished successfully:2
> >> Progress:  Active:1  Finished successfully:4
> >> Progress:  Submitting:2  Submitted:1  Finished successfully:5
> >> Progress:  Active:2  Stage out:1  Finished successfully:5
> >> Progress:  Submitted:1  Stage out:2  Finished successfully:8
> >> Final status:  Finished successfully:11
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:52773
> >> Got channel MetaChannel: 28761475 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 130-fmri.0000.jpeg.expected 130-fmri.0001.jpeg.expected 
> >> 130-fmri.0002.jpeg.expected
> >> checking 130-fmri.0000.jpeg.expected
> >> Skipping exception test due to test configuration
> >> checking 130-fmri.0001.jpeg.expected
> >> Skipping exception test due to test configuration
> >> checking 130-fmri.0002.jpeg.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:04:47 CDT 2009
> >> ----------===========================----------
> >> Running test 103-quote at Thu Apr 23 12:04:47 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1204-sjzpkfd3
> >> Progress:
> >> Progress:  Stage in:1
> >> Progress:  Submitted:1
> >> Progress:  Active:1
> >> Progress:  Finished successfully:1
> >> Final status:  Finished successfully:1
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:40813
> >> Got channel MetaChannel: 28500325 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 103-quote.out.expected
> >> checking 103-quote.out.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:05:05 CDT 2009
> >> ----------===========================----------
> >> Running test 1032-singlequote at Thu Apr 23 12:05:05 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1205-x2d55af3
> >> Progress:
> >> Progress:  Stage in:1
> >> Progress:  Submitted:1
> >> Progress:  Active:1
> >> Progress:  Finished successfully:1
> >> Final status:  Finished successfully:1
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:44126
> >> Got channel MetaChannel: 18100302 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 1032-singlequote.out.expected
> >> checking 1032-singlequote.out.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:05:22 CDT 2009
> >> ----------===========================----------
> >> Running test 1031-quote at Thu Apr 23 12:05:22 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1205-5aa1ko4e
> >> Progress:
> >> Progress:  Stage in:1
> >> Progress:  Submitted:1
> >> Progress:  Active:1
> >> Final status:  Finished successfully:1
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:43759
> >> Got channel MetaChannel: 19002607 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 1031-quote.*.expected
> >> No expected output files specified for this test case - not checking 
> >> output.
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:05:38 CDT 2009
> >> ----------===========================----------
> >> Running test 1033-singlequote at Thu Apr 23 12:05:38 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1205-8nopyujc
> >> Progress:
> >> Progress:  Stage in:1
> >> Progress:  Submitted:1
> >> Progress:  Active:1
> >> Progress:  Finished successfully:1
> >> Final status:  Finished successfully:1
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:39924
> >> Got channel MetaChannel: 31196317 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 1033-singlequote.out.expected
> >> checking 1033-singlequote.out.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:05:56 CDT 2009
> >> ----------===========================----------
> >> Running test 141-space-in-filename at Thu Apr 23 12:05:56 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1205-aalqz1c4
> >> Progress:
> >> Progress:  Stage in:1
> >> Progress:  Submitted:1
> >> Progress:  Active:1
> >> Progress:  Finished successfully:1
> >> Final status:  Finished successfully:1
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:60177
> >> Got channel MetaChannel: 4728458 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 141-space-in-filename.space here.out.expected
> >> checking 141-space-in-filename.space here.out.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:06:15 CDT 2009
> >> ----------===========================----------
> >> Running test 142-space-and-quotes at Thu Apr 23 12:06:15 CDT 2009
> >> Swift 0.9rc2 swift-r2860 cog-r2388
> >>
> >> RunID: 20090423-1206-8617gag1
> >> Progress:
> >> Progress:  Selecting site:2  Initializing site shared directory:1  
> >> Stage in:1
> >> Progress:  Selecting site:2  Submitting:1  Submitted:1
> >> Progress:  Selecting site:2  Submitted:1  Active:1
> >> Progress:  Selecting site:2  Active:1  Finished successfully:1
> >> Progress:  Stage out:1  Finished successfully:3
> >> Final status:  Finished successfully:4
> >> Cleaning up...
> >> Shutting down service at https://128.135.125.118:57945
> >> Got channel MetaChannel: 16387060 -> GSSSChannel-null(1)
> >> - Done
> >> expecting 142-space-and-quotes.2" space ".out.expected 
> >> 142-space-and-quotes.3' space '.out.expected 
> >> 142-space-and-quotes.out.expected 142-space-and-quotes. space 
> >> .out.expected
> >> checking 142-space-and-quotes.2" space ".out.expected
> >> Skipping exception test due to test configuration
> >> checking 142-space-and-quotes.3' space '.out.expected
> >> Skipping exception test due to test configuration
> >> checking 142-space-and-quotes.out.expected
> >> Skipping exception test due to test configuration
> >> checking 142-space-and-quotes. space .out.expected
> >> Skipping exception test due to test configuration
> >> Test passed at Thu Apr 23 12:06:35 CDT 2009
> >> ----------===========================----------
> >> All language behaviour tests passed
> >>
> >>
> >>
> >> Zhao Zhang wrote:
> >>> Hi, Ben
> >>>
> >>> Ben Clifford wrote:
> >>>> On Thu, 23 Apr 2009, Zhao Zhang wrote:
> >>>>
> >>>>  
> >>>>> Error 1: This is related to CI network setting,
> >>>>> /etc/grid-security/hostcert.pem. Could anyone help on this? Who 
> >>>>> should I
> >>>>> contact?
> >>>>>     
> >>>>
> >>>> fletch is broken. But try changing those sites files to use 
> >>>> gwynn.bsd.uchicago.edu instead.
> >>>>
> >>>>  
> >>>>> Error 2: My certificate is not enabled on teraport. As Mike and I 
> >>>>> discussed last night, the "certificate revocation list" on the CI 
> >>>>> network is misconfigured.
> >>>>>     
> >>>>
> >>>> This looks more like a permissions problem - the directory being 
> >>>> used in the sites.xml file for that test does not exist and you do 
> >>>> not have permission to create it.
> >>>>
> >>>> In r2874 I have changed tests/sites/coaster/teraport-gt2-gt2-pbs.xml 
> >>>> to use a different path that should work for you now.
> >>>>   
> >>> I tried this out and it failed, so I increased the wall-time to 15 
> >>> minutes in the coaster/teraport-gt2-gt2-pbs.xml file.
> >>> I am waiting on the result now.
> >>>
> >>> zhao
> >>>
> >>> [zzhang@communicado sites]$ ./run-site coaster/teraport-gt2-gt2-pbs.xml
> >>> testing site configuration: coaster/teraport-gt2-gt2-pbs.xml
> >>> Removing files from previous runs
> >>> Running test 061-cattwo at Thu Apr 23 11:12:09 CDT 2009
> >>> Swift 0.9rc2 swift-r2860 cog-r2388
> >>>
> >>> RunID: 20090423-1112-6jqlxfcf
> >>> Progress:
> >>> Progress:  Stage in:1
> >>> Progress:  Submitted:1
> >>> Failed to transfer wrapper log from 
> >>> 061-cattwo-20090423-1112-6jqlxfcf/info/q on teraport
> >>> Failed to transfer wrapper log from 
> >>> 061-cattwo-20090423-1112-6jqlxfcf/info/s on teraport
> >>> Progress:  Stage in:1
> >>> Failed to transfer wrapper log from 
> >>> 061-cattwo-20090423-1112-6jqlxfcf/info/u on teraport
> >>> Progress:  Failed:1
> >>> Execution failed:
> >>>        Exception in cat:
> >>> Arguments: [061-cattwo.1.in, 061-cattwo.2.in]
> >>> Host: teraport
> >>> Directory: 061-cattwo-20090423-1112-6jqlxfcf/jobs/u/cat-umlmrs9j
> >>> stderr.txt:
> >>>
> >>> stdout.txt:
> >>>
> >>> ----
> >>>
> >>> Caused by:
> >>>        Job cannot be run with the given max walltime worker 
> >>> constraint (task: 600, maxwalltime: 300s)
> >>> Cleaning up...
> >>> Shutting down service at https://128.135.125.118:58204
> >>> Got channel MetaChannel: 1297642 -> GSSSChannel-null(1)
> >>> - Done
> >>> SWIFT RETURN CODE NON-ZERO - test 061-cattwo
> >>>
> >>>>  
> >>>>> Error 3 & Error 4: I am not active on tgncsa site. Mike said he 
> >>>>> needed to add
> >>>>> me to another group.
> >>>>>     
> >>>>
> >>>> yes.
> >>>>
> >>>> Do you have the list from the end of your test run about which sites 
> >>>> worked and which did not?
> >>>>
> >>>>   
> >>> _______________________________________________
> >>> Swift-devel mailing list
> >>> Swift-devel at ci.uchicago.edu
> >>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >>>

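One possible way to "lift out the errors" from a run-site log, as requested above, is sketched below. It is only an illustration, not part of the Swift test suite. It keys on the marker lines that appear in the logs quoted in this thread ("Running test ... at", "Test passed at", "SWIFT RETURN CODE NON-ZERO - test ...", "Caused by:"). The names Outcome, summarize and SAMPLE are hypothetical, and SAMPLE is stitched together from lines of two different runs quoted above, with the mail-wrapped "Caused by" message rejoined onto one line on the assumption that the raw log is not wrapped.

```python
import re
from dataclasses import dataclass, field

# Marker lines as they appear in the run-site output quoted in this thread.
RUNNING = re.compile(r"^Running test (\S+) at ")
PASSED = re.compile(r"^Test passed at ")
FAILED = re.compile(r"^SWIFT RETURN CODE NON-ZERO - test (\S+)")


@dataclass
class Outcome:
    """Hypothetical per-test record: name, status, and any 'Caused by' text."""
    name: str
    status: str = "unknown"  # "passed", "failed", or "unknown"
    cause: list = field(default_factory=list)


def summarize(log_text):
    """Return one Outcome per 'Running test' line found in log_text."""
    results = []
    current = None
    in_cause = False
    for raw in log_text.splitlines():
        line = raw.strip()
        started = RUNNING.match(line)
        if started:
            current = Outcome(started.group(1))
            results.append(current)
            in_cause = False
            continue
        if current is None:
            continue  # ignore anything before the first test starts
        if line == "Caused by:":
            in_cause = True
            continue
        if line.startswith("Cleaning up"):
            in_cause = False  # assumed end of the error description
        elif in_cause and line:
            current.cause.append(line)
        if PASSED.match(line):
            current.status = "passed"
        elif FAILED.match(line):
            current.status = "failed"
    return results


# Sample assembled from log lines quoted earlier in this thread.
SAMPLE = """\
Running test 061-cattwo at Thu Apr 23 11:12:09 CDT 2009
Progress:  Failed:1
Execution failed:
Caused by:
       Job cannot be run with the given max walltime worker constraint (task: 600, maxwalltime: 300s)
Cleaning up...
SWIFT RETURN CODE NON-ZERO - test 061-cattwo
Running test 103-quote at Thu Apr 23 12:04:47 CDT 2009
Final status:  Finished successfully:1
Test passed at Thu Apr 23 12:05:05 CDT 2009
"""

if __name__ == "__main__":
    for outcome in summarize(SAMPLE):
        print(outcome.name, outcome.status, " ".join(outcome.cause))
```

A report built this way would list each failed test next to its "Caused by" text, which may make it easier to tell a configuration problem (such as the walltime limit above) from a problem in the code.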