[petsc-dev] speeding up testing by not always downloading external packages?
Satish Balay
balay at mcs.anl.gov
Sun Oct 11 13:38:36 CDT 2020
Well I don't think the download time is significant [for all the
builds at ANL] - as compared to the build times.
For ex: most of the time - petsc-pkg-hash gets reused [and this saves
on both downloads and builds] - such builds take about 2h. But when
packages have to be rebuilt - it can take 2:45 to 3h [so download part
must be pretty small]
But yeah - it's wasted bandwidth - and it's not tolerant to network
disruptions.
And the other issue: a cache might help with CI at low-bandwidth locations
[say, running a CI instance at my house on a spare laptop].
But yes - this requires infrastructure. The way I look at it is - we
need a "local mirror" or "cache" infrastructure.
i.e. keep the cache part separate from the build part [and not intertwine them].
Spack does stuff in this direction [it also has a remote cache, since any
one of the 100 remote sites from which the packages are downloaded can be
down - but it's not tolerant to certain changes - so I have to
periodically clean it - to have confidence in my build].
Note: If there is a git repo locally cached (and mirrored) - we don't
have to deal with shallow clones.
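As a rough illustration of keeping the cache part separate from the build part, here is a minimal Python sketch. It is not existing PETSc infrastructure: the function names, the cache directory layout, and the example.org URL are all hypothetical. It only assembles the git commands, so the cache step (the only step that touches the network) and the build step (which reads only from the local mirror) stay independent.

```python
import os
import subprocess


def mirror_path(cache_root, url):
    """Map a remote URL to a bare mirror directory under the cache (hypothetical layout)."""
    name = url.rstrip("/").split("/")[-1]
    if not name.endswith(".git"):
        name += ".git"
    return os.path.join(cache_root, name)


def cache_commands(cache_root, url):
    """Cache step: create the mirror, or refresh it if it already exists."""
    mirror = mirror_path(cache_root, url)
    if os.path.isdir(mirror):
        # Usually fetches nothing, so it costs almost no time or bandwidth.
        return [["git", "-C", mirror, "remote", "update", "--prune"]]
    return [["git", "clone", "--mirror", url, mirror]]


def build_commands(cache_root, url, dest, commit):
    """Build step: populate a clean build tree from the local mirror only."""
    mirror = mirror_path(cache_root, url)
    return [
        # A full clone from a local path is cheap, so no shallow clone is needed.
        ["git", "clone", mirror, dest],
        ["git", "-C", dest, "checkout", commit],
    ]


def run(commands):
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

With this split, a network disruption only affects run(cache_commands(...)); the build tree can still be wiped and recreated for every job from whatever the mirror last fetched.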
Might have a bigger impact if we can improve the petsc-pkg-hash
infrastructure to avoid rebuilds in more cases. [i.e. make it more
tolerant to configure changes - but it's not clear to me which
changes won't require rebuilds]
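To make the idea concrete, here is a minimal Python sketch of a hash that depends only on options assumed to affect package builds. It is not how petsc-pkg-hash is actually computed. The prefix list below is a hypothetical guess, and deciding what belongs in it is exactly the open question above. The sketch also assumes that an option such as PETSC_ARCH, which only names the build, can be ignored.

```python
import hashlib

# Hypothetical: option prefixes assumed to influence external-package builds.
BUILD_RELEVANT_PREFIXES = (
    "--with-cc", "--with-cxx", "--with-fc",
    "--with-debugging", "--download-",
    "COPTFLAGS", "CXXOPTFLAGS", "FOPTFLAGS",
)


def package_hash(configure_args, relevant_prefixes=BUILD_RELEVANT_PREFIXES):
    """Hash only the build-relevant options, independent of their order."""
    relevant = sorted(a for a in configure_args if a.startswith(relevant_prefixes))
    digest = hashlib.sha256("\n".join(relevant).encode("utf-8"))
    return digest.hexdigest()[:10]
```

Two configure lines that differ only in option order, or only in options outside the prefix list, get the same hash and could reuse the same package builds; a change to the compiler or to any --download- option gives a new hash and forces a rebuild.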
Satish
On Sun, 11 Oct 2020, Barry Smith wrote:
>
> Satish,
>
> Do you think the time to download all the external packages for each job is significant?
>
> Would using super shallow clones on the external packages help much in time? Maybe we should do them anyway to stop wasting bandwidth?
> Currently we do full clones? But we don't need the huge histories.
>
> A much more elaborate way to save more time
>
> On each test machine have repositories of all the external packages
>
> For each job,
>
> do a pull from the remote in all of these repositories that the job depends on (usually this will get nothing, so it takes no time)
>
> For each package either
>
> - build in a unique build directory of the repository directory directly (for CMAKE and packages that support out of base directory builds)
>
> - make a local shallow clone of the local copy of the repository to externalpackages for the rest and do those builds there
>
> The average cost of this will just be some shallow local clones instead of copying over from remote machines.
> The PETSc test directories can still be completely cleaned out for each job so Satish need not worry about testing with dirty directories.
>
> This requires a bit of infrastructure, if it saves a minute it is not worth it, but if it cuts the pipeline time from 180 minutes to 150 maybe?
> Probably not worth it. Could also be done just for a couple of the most external package intense jobs.
>
> Barry