[petsc-dev] Thoughts on pushing current CI infrastructure to the next level

Balay, Satish balay at mcs.anl.gov
Thu Apr 25 13:05:13 CDT 2019

Some additional thoughts about Jenkins (issues I don't know how to deal with):

- have a single view of tests per PR. [jenkins provides a machine view - or machine/test view or individual test view]

- ability to restart all the tests for a given PR - or block/disable all tests for a given PR.


On Thu, 25 Apr 2019, Balay, Satish via petsc-dev wrote:

> On Thu, 25 Apr 2019, Karl Rupp via petsc-dev wrote:
> > Dear PETSc developers,
> > 
> > the current Jenkins server went live last summer. Since then, the stability of
> > master and next has indeed improved. Who would have thought three years ago
> > that `next` would be almost as stable as `master`?
> > 
> > However, over the weeks and months some weaknesses of our current continuous
> > integration infrastructure became apparent:
> > 
> > 1.) Still no Jenkins tests on Windows, because the remote execution of a Java
> > application has some issues with Cygwin (which we require for PETSc).
> [from discussions with Jed] - it appears that GitLab CI does not use
> Java. Also, 'ssh' is mentioned in its list of 'executors' - so that
> might work similarly to our current Windows setup.
> > 
> > 2.) Jenkins workers every once in a while hang on the target machine (this has
> > been independently observed in a different setting by Jed as well).
> > 
> Yes - this is bad. So one criterion when choosing alternatives: how does
> one debug problems with the CI tool?
> > 3.) Nonscalability of the current setup: The Jenkins server clones a separate
> > copy of the repository for each pull request and each test arch. Each clone of
> > the PETSc repository is 300 MB, so if we aim at 40 different arches (i.e. the
> > current coverage of the nightly tests) to test for each pull request, 300 MB *
> > 40 = 12 GB of memory is required *for each pull request* on the Jenkins
> > master.
> If there is a way to manage how clones are used in the Jenkins process
> - I'm guessing this requirement can go down considerably with local git
> clones [which would use hard links to save space].
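The arithmetic above, and the local-clone idea, can be sketched as follows. This is only a rough illustration, not what the Jenkins setup does: the mirror path, work directory, and arch names are hypothetical, and it assumes one reference clone already exists on the same filesystem as the per-arch clones (git only hard-links objects in that case).

```python
import subprocess
from pathlib import Path


def full_clone_storage_mb(clone_mb, n_arches):
    """Storage needed when every arch gets its own independent clone."""
    return clone_mb * n_arches


def local_clone_command(mirror, dest):
    """Build a `git clone --local` command.

    With --local and a local source path, git hard-links the files under
    .git/objects instead of copying them, so only the checked-out working
    tree costs extra space per clone.
    """
    return ["git", "clone", "--local", str(mirror), str(dest)]


def make_arch_clones(mirror, workdir, arches, run=subprocess.run):
    """Create one local clone per test arch; `run` can be swapped for testing."""
    dests = []
    for arch in arches:
        dest = Path(workdir) / arch
        run(local_clone_command(mirror, dest), check=True)
        dests.append(dest)
    return dests


if __name__ == "__main__":
    # 300 MB per clone, 40 arches, as in the message above.
    print(full_clone_storage_mb(300, 40), "MB per pull request with full clones")
```

Each clone still needs its own working tree, so the saving applies to the object store only; how much that is for PETSc would have to be measured.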
> > 
> > 4.) Pull requests from external repositories in Bitbucket are currently tested
> > by Jenkins, but the results are not visible on the pull requests page. This
> > might be a Bitbucket issue rather than a Jenkins issue; and yet, it impedes
> > our work flow.
> I suspect this is because we don't have write access to the
> forks. Previously all forks gave write access to the petsc
> group. I don't know whether that is set up somewhere and can be
> modified [so that forks give write access to Jenkins].
> Satish
> > 5.) Adding additional workers requires significant configuration effort on the
> > Jenkins master and is far from hassle-free. For example, it is currently
> > impractical to add my office machine to the pool of workers, even though this
> > machine is 99% idle.
> > 
> > With some effort we can certainly address 1.) and to some extent 3.), probably
> > 4.) as well, but I don't know how to solve 2.) and 5.) with Jenkins. Given
> > that a significant effort is required for 1.), 3.) and 4.) anyway, I'm
> > starting to get more and more comfortable with the idea of rolling our own CI
> > infrastructure (which has been suggested in some of Barry's snarky remarks
> > already ;-) ). Small Python scripts for executing the tests and pushing
> > results to Bitbucket as well as a central result storage can replicate our
> > existing setup with a few lines of code, while being much more flexible.
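A minimal sketch of what such a script might look like, under stated assumptions: the function names, the `pr-<id>/<arch>.json` layout, and the status strings are all hypothetical, and posting results to Bitbucket is left out entirely. The timeout is one way to keep a hung test from blocking a worker, and the summary gives one view of all arches for a pull request.

```python
import json
import subprocess
import time
from pathlib import Path


def run_test(arch, command, timeout_s=3600):
    """Run one test command and return a small result record."""
    start = time.time()
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
        status = "passed" if proc.returncode == 0 else "failed"
        output = proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run before this is raised.
        status = "timeout"
        output = ""
    return {
        "arch": arch,
        "status": status,
        "seconds": round(time.time() - start, 2),
        "output": output,
    }


def store_result(result, pr_id, results_dir):
    """Write the record to <results_dir>/pr-<pr_id>/<arch>.json."""
    pr_dir = Path(results_dir) / f"pr-{pr_id}"
    pr_dir.mkdir(parents=True, exist_ok=True)
    path = pr_dir / f"{result['arch']}.json"
    path.write_text(json.dumps(result, indent=2))
    return path


def summarize(pr_id, results_dir):
    """Collect the status of every stored arch for one pull request."""
    pr_dir = Path(results_dir) / f"pr-{pr_id}"
    return {
        p.stem: json.loads(p.read_text())["status"]
        for p in sorted(pr_dir.glob("*.json"))
    }
```

A worker would call `run_test` with whatever command builds and tests a given arch, then `store_result`; restarting all tests for a pull request amounts to deleting its directory and rerunning.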
> > 
> > What do other PETSc developers think about CI infrastructure? Maybe
> > suggestions other than Jenkins?
> We would also have to think in terms of multiple levels of CI/testing.
> For example, we currently have some setup with Travis CI on GitHub, and Pipelines on Bitbucket.
> However, we are not using them in the pull request workflow.
> And then the ECP CI resources - if we are to utilize them - would also not
> be usable in the PR workflow, but might be useful for performance regression
> testing and large-scale testing [if we can set up the test suite for it]. And
> this appears to be via GitLab CI.
> Satish
