[petsc-dev] [mpich-discuss] MPICH migration to git

Jed Brown jedbrown at mcs.anl.gov
Mon Jan 21 11:42:12 CST 2013


On Mon, Jan 21, 2013 at 11:18 AM, Sean Farley <sean.michael.farley at gmail.com> wrote:

> On Mon, Jan 21, 2013 at 11:03 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> >
> > On Mon, Jan 21, 2013 at 10:53 AM, Sean Farley
> > <sean.michael.farley at gmail.com> wrote:
> >>
> >> Well … did you try this with the equivalent mercurial feature:
> >> largefiles?
> >
> >
> > Nope, feel free. Most of the speedup is independent of the large files
> > (which only change the git repo size from 78MB to 50MB).
>
> Righto.


Here's the clone without any "git-fat" business (18 seconds, versus 12
seconds with git-fat):

$ time git clone git at bitbucket.org:jedbrown/petsc-git
Cloning into 'petsc-git'...
remote: Counting objects: 300368, done.
remote: Compressing objects: 100% (66014/66014), done.
remote: Total 300368 (delta 233578), reused 300368 (delta 233578)
Receiving objects: 100% (300368/300368), 68.18 MiB | 10.22 MiB/s, done.
Resolving deltas: 100% (233578/233578), done.
18.067 real   16.042 user   2.080 sys   100.30 cpu
$ du -hs petsc-git/.git
77M     petsc-git/.git
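
To repeat this kind of measurement on another repository, a minimal Python
sketch is below. The helper names timed_run and dir_size_bytes are made up for
illustration. The size is the sum of apparent file sizes, so it will not match
"du -hs" exactly, since du counts disk blocks.

```python
import os
import subprocess
import time


def dir_size_bytes(path):
    """Sum apparent file sizes under path (hypothetical helper, not du's block count)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a link to a large file is not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def timed_run(argv):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start
```

Usage would look like timed_run(["git", "clone", url, dest]) followed by
dir_size_bytes(os.path.join(dest, ".git")), where url and dest are whatever
repository and target directory you are testing.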


Does hg largefiles have a way to version the patterns that it should match?
It looks like the mailing list advice is to have a script that you ask
users to run that will modify their .hg/hgrc. Is there a way to make
lfconvert use a list of files rather than a single threshold size?


> >> Also, what files did you deem "fat"?
> >
> > A smattering of PowerPoint slides, PDFs, random binaries, and a few very
> > large log files. Note that this was a performance experiment, and I don't
> > care which files. In practice, I'd suggest managing fewer (or even none).
>
> Huh? Do you mean not even worrying about separating binary files?


PETSc does not actually have many large files. There were a few binaries
accidentally checked in, but it just doesn't make that big of a difference.
If we switched to git, we might just delete the very worst offenders. The
second-to-last column has the file sizes in bytes for everything over 1MB:

petsc-git$ git fat find 1000000
src/dm/mesh/examples/tutorials/ex_coarsen_3                  filter=fat -text #    9415934 3
src/dm/mesh/examples/tutorials/ex_interpolate_1              filter=fat -text #    9038252 1
zope/var/data.fs.hg                                          filter=fat -text #    8615555 13
src/snes/examples/tutorials/phi                              filter=fat -text #    8388616 1
zope/var/hgdata.fs                                           filter=fat -text #    6842744 1
src/dm/impls/mesh/examples/tutorials/illinois/il.ps          filter=fat -text #    6578178 1
src/dm/mesh/examples/tutorials/illinois/il.ps                filter=fat -text #    6578178 1
zope/var/Data.fs                                             filter=fat -text #    5501166 11
src/contrib/blopex/driver_fiedler/L-matrix.petsc             filter=fat -text #    4182032 1
src/docs/website/tops/images/topsanimation.gif               filter=fat -text #    3137151 1
src/tops/images/topsanimation.gif                            filter=fat -text #    3137151 1
docs/website/docs/tutorials/nersc01/nersc01.pdf              filter=fat -text #    1251497 1
docs/webpage/docs/tutorials/nersc01/nersc01.pdf              filter=fat -text #    1251497 1
src/docs/website/documentation/tutorials/nersc01/nersc01.pdf filter=fat -text #    1251497 1
docs/website/documentation/tutorials/nersc01/nersc01.pdf     filter=fat -text #    1251497 1
src/ts/examples/tutorials/output/ex12_1.out                  filter=fat -text #    1219820 1
src/contrib/blopex/driver_fiedler/DL-matrix.petsc            filter=fat -text #    1048592 1
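
Without git-fat installed, a similar list of large blobs across history can be
approximated with plain git plumbing ("git rev-list --objects --all" piped into
"git cat-file --batch-check"). The sketch below is an assumption about how one
might do it, not how "git fat find" is implemented. The function names are
hypothetical, and it produces no per-path count column.

```python
import subprocess


def parse_large_blobs(batch_check_lines, threshold):
    """Return (size, path) pairs for blobs over threshold bytes, largest first.

    Each input line is expected as "<type> <sha> <size> <path>".
    """
    found = []
    for line in batch_check_lines:
        parts = line.split(" ", 3)
        # Commits and tags carry no path; trees are not file contents.
        if len(parts) < 4 or parts[0] != "blob":
            continue
        size = int(parts[2])
        if size > threshold:
            found.append((size, parts[3]))
    return sorted(found, reverse=True)


def list_large_blobs(repo, threshold):
    """Scan every object reachable from any ref in repo for large blobs."""
    # rev-list prints each object once, with the first path it was seen at.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    checked = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_large_blobs(checked.splitlines(), threshold)
```

Calling list_large_blobs("petsc-git", 1000000) would then give sizes and paths
for blobs over that many bytes. Because each blob is reported once, identical
content stored under several paths shows up under only one of them.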

FWIW, I ran hg lfconvert --size 1 (treat all files >1MB as "large") and
.hg/store is still 150MB (though .hg/largefiles ballooned to 220MB).