[petsc-dev] Fwd: Nightly tests quick summary page

Barry Smith bsmith at mcs.anl.gov
Thu Jan 24 12:48:52 CST 2013


  We do have the capability of running gcov on a number of systems, merging the results, and marking the source lines that are never tested.
On http://www.mcs.anl.gov/petsc/developers/index.html there is the sentence "The coverage (what lines of source code are tested in the nightly builds) can be found at http:/www.mcs.anl.gov/petsc/petsc-dev/index_gcov1.html"; sadly, that link is broken.
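
  Something along these lines (just a sketch, not what the nightly scripts actually do) could merge the per-system .gcov text output for one source file; it assumes the standard gcov format where "#####" marks an executable line that was never run and "-" a non-executable line:

// sketch_gcov_merge.cpp -- hypothetical merge of foo.c.gcov files collected
// from several test systems; a line counts as tested if any system ran it.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

int main(int argc, char **argv)
{
  std::map<int, bool> covered;   // line number -> executed on at least one system
  for (int i = 1; i < argc; ++i) {
    std::ifstream in(argv[i]);
    std::string line;
    while (std::getline(in, line)) {
      std::istringstream is(line);
      std::string count, lineno;
      if (!std::getline(is, count, ':') || !std::getline(is, lineno, ':')) continue;
      count.erase(0, count.find_first_not_of(" "));
      if (count == "-") continue;                     // not executable source
      int n = std::atoi(lineno.c_str());
      if (n <= 0) continue;                           // skip the "0:Source:" header lines
      covered[n] = covered[n] || (count != "#####");  // "#####" = never executed here
    }
  }
  for (std::map<int, bool>::const_iterator it = covered.begin(); it != covered.end(); ++it)
    if (!it->second) std::cout << "never tested: line " << it->first << "\n";
  return 0;
}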

   Barry



On Jan 24, 2013, at 10:22 AM, Karl Rupp <rupp at mcs.anl.gov> wrote:

> Hi,
> 
> On 01/24/2013 09:47 AM, Jed Brown wrote:
>> 
>> On Thu, Jan 24, 2013 at 9:39 AM, Karl Rupp <rupp at mcs.anl.gov
>> <mailto:rupp at mcs.anl.gov>> wrote:
>> 
>>    Testing for the same number of iterations is - as you mentioned - a
>>    terrible metric. I see this regularly on GPUs, where rounding modes
>>    differ slightly from CPUs. Running a fixed (low) number of
>>    iterations is certainly the better choice here, provided that the
>>    systems we use for the tests are neither too ill-conditioned nor too
>>    well-behaved so that we can eventually reuse the tests for some
>>    preconditioners.
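>> 
>>    As a minimal sketch (assuming the usual ksp, b, x setup; this is not
>>    taken from an actual PETSc test), such a check could run a fixed,
>>    small number of iterations and compare only the residual norm against
>>    a loose tolerance:
>> 
>>    PetscErrorCode ierr;
>>    PetscReal      rnorm;
>> 
>>    /* at most 5 iterations; the tight rtol keeps the iteration count
>>       effectively fixed rather than convergence-dependent */
>>    ierr = KSPSetTolerances(ksp, 1.e-12, PETSC_DEFAULT, PETSC_DEFAULT, 5);CHKERRQ(ierr);
>>    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
>>    /* judge the test by the residual norm, with a tolerance loose enough
>>       to absorb CPU/GPU rounding differences, not by iteration counts */
>>    ierr = KSPGetResidualNorm(ksp, &rnorm);CHKERRQ(ierr);
>>    if (rnorm > 1.e-3) SETERRQ1(PETSC_COMM_WORLD, PETSC_ERR_PLIB, "residual norm %g too large", (double)rnorm);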
>> 
>> 
>> That's something that certainly makes sense for tests of functionality,
>> but not for examples/tutorials that new users should encounter, lest
>> they get the impression that they should use such options.
> 
> I consider it good practice to keep tests and tutorials separate, just as our folder hierarchy suggests (even though there may be no strict adherence to it right now). Guiding the user towards using a certain functionality is fundamentally different from testing that functionality for various inputs and/or corner cases.
> 
> For example, in ViennaCL I have a test that checks the operation
> A = B * C
> for dense matrices A, B, and C. In a tutorial, I only show about three uses of this functionality. In the test, however, all combinations of transposes, submatrices, memory layouts, etc. are exercised, leading to about 200 variations of this operation. I can't think of any reasonable way of merging the two, so I simply optimize each for its particular purpose. Consequently, we can only check whether the tutorials execute properly (i.e. input files found, convergence achieved at all, etc.); they are inherently unsuitable for thorough testing.
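> 
> Just to give a flavor (this is not the actual ViennaCL test, and the names are recalled from the 1.x API, so check them against the real headers), a stripped-down variant enumeration could look like this:
> 
> // Rough sketch only: a few of the transpose/layout combinations for A = B * C.
> // The real test also covers submatrices (viennacl::matrix_range) and many
> // more combinations, which is how the ~200 variations arise.
> #include <cstddef>
> #include "viennacl/matrix.hpp"
> #include "viennacl/linalg/prod.hpp"
> 
> template <typename MatA, typename MatB, typename MatC>
> bool check(MatA & A, MatB const & B, MatC const & C)
> {
>   A = viennacl::linalg::prod(B, C);                    // plain product
>   A = viennacl::linalg::prod(viennacl::trans(B), C);   // B transposed
>   A = viennacl::linalg::prod(B, viennacl::trans(C));   // C transposed
>   // ... compare each result against a host reference, return false on mismatch
>   return true;
> }
> 
> int main()
> {
>   std::size_t n = 64;
>   viennacl::matrix<double>                         A(n, n), B(n, n), C(n, n);
>   viennacl::matrix<double, viennacl::column_major> Ac(n, n), Bc(n, n), Cc(n, n);
> 
>   bool ok = check(A, B, C)       // row-major operands
>          && check(Ac, Bc, Cc);   // column-major operands
>   return ok ? 0 : 1;
> }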
> 
> 
>> Do you have much experience with code coverage tools? It would be very
>> useful if we could automatically identify which tests were serving no
>> useful purpose. The amount of time taken by make alltests is currently
>> unreasonable, and though parallel testing will help, I suspect there are
>> also many tests that could be removed (and time-consuming tests that
>> could be made much faster without affecting their usefulness).
> 
> Similar to what Matt said, I tried gcov a long time ago when I was a programming greenhorn. I wasn't satisfied with the results I got at that time, so I designed 'implementation-aware' tests instead, which gave satisfactory results. Note to myself: reevaluate gcov.
> 
> Best regards,
> Karli
> 



