git is very very slow when tracking large binary files

情话喂你 2020-12-07 08:50

My project is six months old and git is very, very slow. We track around 30 files which range in size from 5 MB to 50 MB. Those are binary files and we keep them in git. I believe t

10 Answers
  •  感动是毒
    2020-12-07 09:39

    There is nothing specific about binary files in the way git handles them. When you add a file to a git repository, a header is added, the file is compressed with zlib, and the result is stored under the name of its SHA-1 hash. This is exactly the same regardless of file type. There is nothing in zlib compression that makes it problematic for binary files.
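
    For illustration, here is a minimal demo of that storage model using git's own plumbing; the file name demo.bin is just an example:

        printf 'some content, text or binary' > demo.bin
        git hash-object demo.bin        # prints the SHA-1 of "blob <size>\0<content>"
        git hash-object -w demo.bin     # also writes the zlib-compressed object under .git/objects/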

    But at certain points (pushing, gc) git starts looking at the possibility of delta-compressing content. If git finds files that look similar (by filename, etc.) it loads them into RAM and starts compressing them together. If you have 100 files and each of them is, say, 50 MB, it will try to hold 5 GB in memory at the same time. On top of that it needs some extra memory to do the work. Your computer may not have that much RAM, so it starts to swap, and the process takes time.
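
    To see where that memory goes, the same delta search can be triggered by hand with a repack; the flags are real repack options, but the values here are only illustrative:

        # Recompute all deltas; the window/depth/memory limits cap how much data
        # git keeps in RAM while searching for delta candidates.
        git repack -a -d -f --window=5 --depth=10 --window-memory=100m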

    You can limit the depth of the delta compression so that the process doesn't use as much memory, at the cost of less efficient compression (core.bigFileThreshold, the delta attribute, pack.window, pack.depth, pack.windowMemory, etc.).
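
    A sketch of how those settings might be applied; the file patterns and values below are assumptions, adjust them to your own files and RAM:

        # .gitattributes: skip delta compression for the large binary formats you track
        *.bin -delta
        *.dat -delta

        # Per-repository limits on the delta search (example values, not recommendations)
        git config core.bigFileThreshold 10m
        git config pack.window 5
        git config pack.depth 10
        git config pack.windowMemory 100m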

    So there are lots of things you can tune to make git work well with large files.
