Is it possible to store only a checksum of a large file in git?

Submitted by 北战南征 on 2019-12-06 14:03:02

I wrote a script that does this sort of thing. You list file patterns in the .gitattributes file for the large media files that you don't want going into your git repo, and the script can store them on S3 instead. It's just a starting point, but I think it's usable if you're interested.

http://github.com/schacon/git-media

Maybe that will help you, or at least show you how it could be done and you can customize it for your specific needs.
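
To make the general idea concrete, here is a minimal sketch of a clean/smudge filter pair in Python. It is not git-media's actual code: the filter name "media", the script name media_filter.py, the local store directory (standing in for S3), and the choice of SHA-1 as the stub are all assumptions for illustration.

```python
import hashlib
import os
import sys

HEX = set("0123456789abcdef")


def clean(data: bytes, store_dir: str) -> bytes:
    """Save `data` under its SHA-1 in `store_dir`; return a one-line stub for git."""
    digest = hashlib.sha1(data).hexdigest()
    os.makedirs(store_dir, exist_ok=True)
    target = os.path.join(store_dir, digest)
    if not os.path.exists(target):
        with open(target, "wb") as fh:
            fh.write(data)
    return (digest + "\n").encode("ascii")


def smudge(stub: bytes, store_dir: str) -> bytes:
    """Return the stored content for a stub; pass the input through if unavailable."""
    digest = stub.decode("ascii", errors="replace").strip()
    if len(digest) == 40 and set(digest) <= HEX:
        target = os.path.join(store_dir, digest)
        if os.path.isfile(target):
            with open(target, "rb") as fh:
                return fh.read()
    return stub


# Hypothetical command line: media_filter.py <clean|smudge> <store_dir>
if __name__ == "__main__" and len(sys.argv) == 3 and sys.argv[1] in ("clean", "smudge"):
    payload = sys.stdin.buffer.read()
    func = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.buffer.write(func(payload, sys.argv[2]))
```

A script like this would be wired up with a line such as "*.mov filter=media" in .gitattributes, plus the filter.media.clean and filter.media.smudge settings in git config pointing at it. Git then stores only the 41-byte stub, and the real bytes live wherever the store is.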

Jakub Narębski

In the upcoming release of git there will be a 'refs/replace/' mechanism, which I think could be adapted for this purpose (assuming that the number of such large-media files and the number of their versions isn't very large).

In the slim fork of your project you would have (as Seth wrote) 'stub' files in place of your large media files, whose contents would be the SHA-1 of the blob for the large file (from "git hash-object -t blob <filename>").

Then in the full fork of your project you would use the "refs/replace/" mechanism to replace those 'stub' files with the true contents (using git replace). Some hooks would be required to keep the SHA-1 in the 'stub' files in sync with the actual large-media files.

Then if you want a full clone, you also fetch from the "refs/replace/" namespace; if you want a slim clone, you don't fetch "refs/replace/".

Note: I haven't actually tested such a setup; also, this isn't yet available in git unless you run 'master'.
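
For reference, the SHA-1 that "git hash-object -t blob" prints is not the plain SHA-1 of the file: git hashes a small header ("blob", a space, the byte length, a NUL byte) followed by the content. A short Python sketch of that computation (the function name is just for illustration):

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content."""
    header = b"blob %d\0" % len(data)  # e.g. b"blob 11\0" for 11 bytes
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known ID in every SHA-1 git repository.
print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

This is the value a hook would write into the stub file. For very large files you would feed the content to the hash in chunks rather than reading it all into memory.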
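
As a rough sketch of the two git invocations this would involve, written as Python helpers that build the command lines: "git replace <object> <replacement>" and the "refs/replace/*:refs/replace/*" refspec are ordinary git usage, but the helper names and the "origin" remote are assumptions, and, like the setup itself, this has not been tried end to end.

```python
import subprocess
from typing import List


def replace_cmd(stub_blob_sha: str, real_blob_sha: str) -> List[str]:
    """Command that makes git substitute the real blob wherever the stub blob appears.

    stub_blob_sha is the object ID of the stub file as committed;
    real_blob_sha is the object ID of the large file's blob.
    Both objects should already exist in the repository.
    """
    return ["git", "replace", stub_blob_sha, real_blob_sha]


def fetch_replacements_cmd(remote: str = "origin") -> List[str]:
    """Command that pulls the replacement refs, turning a slim clone into a full one."""
    return ["git", "fetch", remote, "refs/replace/*:refs/replace/*"]


def run(cmd: List[str], repo_dir: str) -> None:
    """Run one of the commands above inside a repository; raises on failure."""
    subprocess.run(cmd, cwd=repo_dir, check=True)
```

A slim clone simply never runs the fetch command, so it only ever sees the stub blobs.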

How about storing the hashes in a text file, then giving the text file to git? Then you could write a hook that compared hashes, so every time you checked in or checked out, you could be notified of what was missing / different.

Not exactly what you want, and you would still have to maintain the text file manually.
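
A minimal sketch of the comparison such a hook could run, in Python. The manifest format (one "<sha1> <relative path>" pair per line, with "#" comments) and the function names are assumptions made for this example.

```python
import hashlib
import os
from typing import Dict


def file_sha1(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-1 of a file's content, read in chunks so large files fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_manifest(manifest_path: str, root: str = ".") -> Dict[str, str]:
    """Return {relative path: "missing" or "different"} for files that don't match."""
    problems: Dict[str, str] = {}
    with open(manifest_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            expected, rel_path = line.split(maxsplit=1)
            full = os.path.join(root, rel_path)
            if not os.path.isfile(full):
                problems[rel_path] = "missing"
            elif file_sha1(full) != expected:
                problems[rel_path] = "different"
    return problems
```

Called from a hook script (for example pre-commit or post-checkout), this could print the returned dictionary and exit non-zero if it is not empty. Regenerating the manifest when a large file legitimately changes would still be a separate, manual step.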
