Question
I just learned from this HN post that Git is moving to a new hashing algorithm (from SHA-1 to SHA-256).
I wanted to know what makes SHA-256 the best fit for Git's use case.
Are there strong technical reasons, or is it possible that SHA-256's popularity is a strong factor? (I am guessing here.)
Looking at https://en.wikipedia.org/wiki/Comparison_of_cryptographic_hash_functions, I see there are many modern and older alternatives. Some of them are as performant as SHA-256, if not more so, and stronger (for example, https://crypto.stackexchange.com/q/26336).
Answer 1:
I presented that move in "Why doesn't Git use more modern SHA?" in Aug. 2018.
The reasons were discussed here by Brian M. Carlson:
I've implemented and tested the following algorithms, all of which are 256-bit (in alphabetical order):
- BLAKE2b (libb2)
- BLAKE2bp (libb2)
- KangarooTwelve (imported from the Keccak Code Package)
- SHA-256 (OpenSSL)
- SHA-512/256 (OpenSSL)
- SHA3-256 (OpenSSL)
- SHAKE128 (OpenSSL)
I also rejected some other candidates.
I couldn't find any reference or implementation of SHA256×16, so I didn't implement it.
I didn't consider SHAKE256 because it is nearly identical to SHA3-256 in almost all characteristics (including performance).
SHA-256 and SHA-512/256
These are the 32-bit and 64-bit SHA-2 algorithms that are 256 bits in size.
I noted the following benefits:
- Both algorithms are well known and heavily analyzed.
- Both algorithms provide 256-bit preimage resistance.
Summary
The algorithms with the greatest implementation availability are SHA-256, SHA3-256, BLAKE2b, and SHAKE128.
In terms of command-line availability, BLAKE2b, SHA-256, SHA-512/256, and SHA3-256 should be available in the near future on a reasonably small Debian, Ubuntu, or Fedora install.
As far as security, the most conservative choices appear to be SHA-256, SHA-512/256, and SHA3-256.
The performance winners are BLAKE2b unaccelerated and SHA-256 accelerated.
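To get a rough feel for the performance comparison summarized above, here is a minimal Python sketch that times a few of the candidates through the standard `hashlib` module. It is not the benchmark used on the Git mailing list: the function names (`throughput_mib_per_s`, `run_benchmark`), the payload size, and the round count are assumptions made for illustration, and results depend heavily on the CPU and on how the local Python was built (for example, whether OpenSSL uses SHA extensions).

```python
import hashlib
import time

# Candidates available in Python's standard hashlib.
# BLAKE2b is configured for a 32-byte digest so it is comparable to the
# other 256-bit hashes. SHA-1 is included only as a reference point.
CANDIDATES = {
    "sha1 (160-bit, reference)": lambda: hashlib.sha1(),
    "sha256": lambda: hashlib.sha256(),
    "sha3_256": lambda: hashlib.sha3_256(),
    "blake2b-256": lambda: hashlib.blake2b(digest_size=32),
}


def throughput_mib_per_s(make_hash, payload, rounds=20):
    """Hash `payload` `rounds` times and return throughput in MiB/s."""
    h = make_hash()
    start = time.perf_counter()
    for _ in range(rounds):
        h.update(payload)
    h.digest()
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(payload) * rounds) / (1024 * 1024) / elapsed


def run_benchmark(payload_size=1 << 20, rounds=20):
    """Return a dict mapping candidate name to measured MiB/s."""
    payload = b"\x00" * payload_size
    return {
        name: throughput_mib_per_s(factory, payload, rounds)
        for name, factory in CANDIDATES.items()
    }


if __name__ == "__main__":
    for name, rate in run_benchmark().items():
        print(f"{name:28s} {rate:10.1f} MiB/s")
```

The absolute numbers are not meaningful on their own. The point is only that the ranking can change between machines with and without hardware acceleration, which is the distinction the summary draws.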
The suggested conclusion was based on:
Popularity
Other things being equal we should be biased towards whatever's in the widest use & recommended for new projects.
Hardware acceleration
The only widely deployed HW acceleration is for the SHA-1 and SHA-256 from the SHA-2 family, but notably nothing from the newer SHA-3 family (released in 2015).
Age
Similar to "popularity" it seems better to bias things towards a hash that's been out there for a while, i.e. it would be too early to pick SHA-3.
The hash transitioning plan, once implemented, also makes it easier to switch to something else in the future, so we shouldn't be in a rush to pick some newer hash because we'll need to keep it forever, we can always do another transition in another 10-15 years.
Result: commit 0ed8d8d, Git v2.19.0-rc0, Aug 4, 2018.
SHA-256 has a number of advantages:
It has been around for a while, is widely used, and is supported by just about every single crypto library (OpenSSL, mbedTLS, CryptoNG, SecureTransport, etc).
When you compare against SHA1DC, most vectorized SHA-256 implementations are indeed faster, even without acceleration.
If we're doing signatures with OpenPGP (or even, I suppose, CMS), we're going to be using SHA-2, so it doesn't make sense to have our security depend on two separate algorithms when either one of them alone could break the security when we could just depend on one.
So SHA-256 it is.
The idea remains: any notion of SHA1 is being removed from the Git codebase and replaced by a generic "hash" variable.
Tomorrow, that hash will be SHA2, but the code will support other hashes in the future.
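The idea of a generic "hash" can be illustrated with a small Python sketch that computes a blob object ID with a pluggable algorithm. This is an illustration only, not Git's C implementation: `git_blob_id` and its `algo` parameter are hypothetical names. It relies on the fact that Git hashes a blob as the header `blob <size>\0` followed by the content.

```python
import hashlib


def git_blob_id(content: bytes, algo: str = "sha1") -> str:
    """Return the hex object ID of a blob under the given hash algorithm.

    `algo` is any name accepted by hashlib.new(), e.g. "sha1" or "sha256".
    """
    # Git prefixes the content with the object type, its size, and a NUL byte.
    header = b"blob %d\0" % len(content)
    h = hashlib.new(algo)
    h.update(header + content)
    return h.hexdigest()


if __name__ == "__main__":
    # The empty blob under SHA-1: e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    print(git_blob_id(b"", "sha1"))
    # The same object under SHA-256 is a 64-hex-digit ID.
    print(git_blob_id(b"", "sha256"))
```

Because only the `algo` argument changes, the calling code does not need to know which hash is in use, which is the property that would make a later transition easier.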
As Linus Torvalds delicately puts it (emphasis mine):
Honestly, the number of particles in the observable universe is on the order of 2**256. It's a really really big number.
Don't make the code base more complex than it needs to be.
Make a informed technical decision, and say "256 bits is a lot".
The difference between engineering and theory is that engineering makes trade-offs.
Good software is well engineered, not theorized.
Also, I would suggest that git default to "abbrev-commit=40", so that nobody actually sees the new bits by default.
So the perl scripts etc that use "[0-9a-f]{40}" as a hash pattern would just silently continue to work.
Because backwards compatibility is important (*)
(*) And 2**160 is still a big big number, and hasn't really been a practical problem, and SHA1DC is likely a good hash for the next decade or longer.
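The backwards-compatibility point about the "[0-9a-f]{40}" pattern can be checked with a short Python sketch. The `abbreviate` helper is hypothetical and simply truncates a hex ID; it stands in for the abbreviation default suggested in the quote and is not a Git API.

```python
import hashlib
import re

# The pattern that existing scripts are assumed to use for SHA-1 object IDs.
LEGACY_PATTERN = re.compile(r"[0-9a-f]{40}")


def abbreviate(hex_id: str, length: int = 40) -> str:
    """Hypothetical helper: keep only the first `length` hex digits."""
    return hex_id[:length]


# A 64-hex-digit SHA-256 value used as a stand-in for a new-style object ID.
sha256_id = hashlib.sha256(b"example").hexdigest()

# A strict, whole-string match rejects the full 64-digit ID...
full_ok = LEGACY_PATTERN.fullmatch(sha256_id) is not None
# ...but accepts the same ID abbreviated to 40 digits.
abbrev_ok = LEGACY_PATTERN.fullmatch(abbreviate(sha256_id)) is not None

print(len(sha256_id), full_ok, abbrev_ok)
```

Scripts that use an unanchored search would match the first 40 digits of a longer ID anyway, so the abbreviation matters mostly for code that validates the whole string or assumes a fixed field width.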
(SHA1DC, for "Detecting(?) Collision", was discussed in early 2017, after the shattered.io collision attack: see commit 28dc98e, Git v2.13.0-rc0, March 2017, from Jeff King, and "Hash collision in git".)
See more in Documentation/technical/hash-function-transition.txt
The transition to SHA-256 can be done one local repository at a time.
a. Requiring no action by any other party.
b. A SHA-256 repository can communicate with SHA-1 Git servers (push/fetch).
c. Users can use SHA-1 and SHA-256 identifiers for objects interchangeably (see "Object names on the command line", below).
d. New signed objects make use of a stronger hash function than SHA-1 for their security guarantees.
Source: https://stackoverflow.com/questions/60087759/git-is-moving-to-new-hashing-algorithm-sha-256-but-why-git-community-settled-on