How does SHA generate unique codes for big files in git

Happy的楠姐 2020-12-22 15:26

Using Git, I don't understand how SHA can generate just a 40-digit hexadecimal code that can then be mapped to any file, which could be hundreds of lines long.

3 Answers

小蘑菇 (original poster)

2020-12-22 16:19
A SHA-1 hash is 160 bits long. That gives you 2¹⁶⁰, or exactly
```
1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976
```
possible hashes.

Assuming hash values are more or less unpredictable, the odds of two files accidentally having the same hash are infinitesimal to the point that it's just not worth worrying about it.

Quoting from Scott Chacon's book "Pro Git":

> However, you should be aware of how ridiculously unlikely this scenario is. The SHA–1 digest is 20 bytes or 160 bits. The number of randomly hashed objects needed to ensure a 50% probability of a single collision is about 2⁸⁰.
>
> ...
>
> Here’s an example to give you an idea of what it would take to get a SHA–1 collision. If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (1 million Git objects) and pushing it into one enormous Git repository, it would take 5 years until that repository contained enough objects to have a 50% probability of a single SHA–1 object collision. A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night.
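
The arithmetic behind that example can be checked in a few lines. This is only a rough sketch: it takes 2⁸⁰ as the number of objects needed (the approximate birthday bound from the quote) and uses the quote's own figures for the number of programmers and objects per second. The helper name `years_to_reach` is invented for the illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_reach(objects_needed: float, objects_per_second: float) -> float:
    """Years needed to produce `objects_needed` objects at a constant rate."""
    return objects_needed / objects_per_second / SECONDS_PER_YEAR


programmers = 6.5e9             # figure from the quote
objects_each_per_second = 1e6   # "1 million Git objects" per second, from the quote
birthday_bound = 2 ** 80        # roughly the square root of 2**160

years = years_to_reach(birthday_bound, programmers * objects_each_per_second)
print(f"{years:.1f} years")
```

The result lands in the same ballpark as the "5 years" in the book, which is all an estimate like this is meant to show.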

It's true that there must be two 21-byte files that have the same SHA-1 hash (since there are 2¹⁶⁸ such files and only 2¹⁶⁰ possible SHA-1 hashes). ~~No such files have ever been discovered.~~
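
The counting argument is easier to see at a smaller scale. The sketch below uses a toy hash, the first byte of a SHA-1 digest, which has only 256 possible values; that truncation is an assumption made purely for illustration and is not something Git does. With 257 distinct inputs, two of them must share a value.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Toy hash: the first byte of the SHA-1 digest (256 possible values)."""
    return hashlib.sha1(data).digest()[0]


def find_collision() -> tuple[bytes, bytes]:
    """Return two distinct inputs with the same tiny_hash value."""
    seen = {}
    for i in range(257):  # 257 distinct inputs, only 256 possible outputs
        data = i.to_bytes(2, "big")
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data
        seen[h] = data
    raise AssertionError("unreachable: the pigeonhole principle forces a collision")


a, b = find_collision()
print(a.hex(), b.hex(), tiny_hash(a))
```

In practice the loop stops long before the 257th input, which is the birthday effect at work. The same reasoning applies to full SHA-1 with 2¹⁶⁸ inputs and 2¹⁶⁰ outputs, except that the search is far too large to run this way.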

UPDATE: As of February 2017, two distinct PDF files with identical SHA-1 checksums have been generated, using a technique that's more than 100,000 times as fast as a brute-force attack. Details here: https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html

Linus Torvalds (the author of Git) has posted a (preliminary) response here: http://marc.info/?l=git&m=148787047422954