Git is moving to a new hashing algorithm, SHA-256, but why did the Git community settle on SHA-256?

Posted by 醉酒当歌 on 2020-02-20 11:17:25

Question


I just learned from this HN post that Git is moving to a new hashing algorithm (from SHA-1 to SHA-256).

I wanted to know what makes SHA-256 the best fit for Git's use case. Are there strong technical reasons, or is it possible that SHA-256's popularity is a major factor? (I am guessing.) Looking at the https://en.wikipedia.org/wiki/Comparison_of_cryptographic_hash_functions page, I see there are many modern and older alternatives. Some of them are at least as performant and as strong as SHA-256, if not more so (for example, https://crypto.stackexchange.com/q/26336).


Answer 1:


I presented that move in "Why doesn't Git use more modern SHA?" in Aug. 2018.

The reasons were discussed here by Brian M. Carlson:

I've implemented and tested the following algorithms, all of which are 256-bit (in alphabetical order):

  • BLAKE2b (libb2)
  • BLAKE2bp (libb2)
  • KangarooTwelve (imported from the Keccak Code Package)
  • SHA-256 (OpenSSL)
  • SHA-512/256 (OpenSSL)
  • SHA3-256 (OpenSSL)
  • SHAKE128 (OpenSSL)

I also rejected some other candidates.
I couldn't find any reference or implementation of SHA256×16, so I didn't implement it.
I didn't consider SHAKE256 because it is nearly identical to SHA3-256 in almost all characteristics (including performance).

SHA-256 and SHA-512/256

These are the 32-bit and 64-bit SHA-2 algorithms that are 256 bits in size.

I noted the following benefits:

  • Both algorithms are well known and heavily analyzed.
  • Both algorithms provide 256-bit preimage resistance.

Summary

The algorithms with the greatest implementation availability are SHA-256, SHA3-256, BLAKE2b, and SHAKE128.

In terms of command-line availability, BLAKE2b, SHA-256, SHA-512/256, and SHA3-256 should be available in the near future on a reasonably small Debian, Ubuntu, or Fedora install.

As far as security, the most conservative choices appear to be SHA-256, SHA-512/256, and SHA3-256.

The performance winners are BLAKE2b unaccelerated and SHA-256 accelerated.
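The performance comparison summarized above can be roughly reproduced on your own machine. The following is only a minimal sketch using Python's standard `hashlib`, not the benchmark used on the Git mailing list: the helper names (`time_hash`, `benchmark`), the payload size, and the round count are assumptions chosen for illustration, and only the candidates that `hashlib` always provides are included. Results depend heavily on the CPU and on whether the underlying library uses hardware acceleration.

```python
import hashlib
import time

# Digests always available in Python's standard hashlib.
# SHA-1 is included only as the baseline Git is moving away from.
CANDIDATES = {
    "sha1": lambda: hashlib.sha1(),
    "sha256": lambda: hashlib.sha256(),
    "sha3_256": lambda: hashlib.sha3_256(),
    "blake2b-256": lambda: hashlib.blake2b(digest_size=32),
}


def time_hash(factory, payload, rounds):
    """Hash `payload` `rounds` times; return (seconds, digest length in bytes)."""
    digest = b""
    start = time.perf_counter()
    for _ in range(rounds):
        h = factory()
        h.update(payload)
        digest = h.digest()
    return time.perf_counter() - start, len(digest)


def benchmark(payload_size=1 << 20, rounds=20):
    """Time every candidate on the same zero-filled payload (hypothetical workload)."""
    payload = b"\x00" * payload_size
    return {name: time_hash(f, payload, rounds) for name, f in CANDIDATES.items()}


if __name__ == "__main__":
    for name, (seconds, size) in benchmark().items():
        print(f"{name:12s} {size * 8:3d}-bit digest  {seconds:.4f}s")
```

Treat the timings as indicative only: an interpreter-level loop measures the library Python was built against, which may differ from the implementations Git links to.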

The suggested conclusion was based on:

Popularity

Other things being equal we should be biased towards whatever's in the widest use & recommended for new projects.

Hardware acceleration

The only widely deployed HW acceleration is for the SHA-1 and SHA-256 from the SHA-2 family, but notably nothing from the newer SHA-3 family (released in 2015).

Age

Similar to "popularity" it seems better to bias things towards a hash that's been out there for a while, i.e. it would be too early to pick SHA-3.

The hash transitioning plan, once implemented, also makes it easier to switch to something else in the future, so we shouldn't be in a rush to pick some newer hash because we'll need to keep it forever, we can always do another transition in another 10-15 years.

Result: commit 0ed8d8d, Git v2.19.0-rc0, Aug 4, 2018.

SHA-256 has a number of advantages:

  • It has been around for a while, is widely used, and is supported by just about every single crypto library (OpenSSL, mbedTLS, CryptoNG, SecureTransport, etc).

  • When you compare against SHA1DC, most vectorized SHA-256 implementations are indeed faster, even without acceleration.

  • If we're doing signatures with OpenPGP (or even, I suppose, CMS), we're going to be using SHA-2, so it doesn't make sense to have our security depend on two separate algorithms when either one of them alone could break the security when we could just depend on one.

So SHA-256 it is.

The idea remains: any notion of SHA-1 is being removed from the Git codebase and replaced by a generic "hash" variable.
Tomorrow, that hash will be SHA-2, but the code will support other hashes in the future.
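To illustrate what a generic "hash" parameter means in practice, here is a minimal Python sketch, not Git's actual C code. It relies on the fact that Git names an object by hashing a short header (`<type> <size>` followed by a NUL byte) and then the content; the function name `git_object_id` and its parameters are hypothetical and exist only for this example.

```python
import hashlib


def git_object_id(data: bytes, obj_type: str = "blob", algo: str = "sha1") -> str:
    """Return the hex object name for `data` under the chosen hash algorithm.

    The algorithm is a parameter rather than being hard-coded, which is the
    point of replacing SHA-1-specific code with a generic hash.
    """
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    h = hashlib.new(algo)  # any algorithm name known to hashlib
    h.update(header + data)
    return h.hexdigest()


if __name__ == "__main__":
    content = b"hello\n"
    print("sha1  :", git_object_id(content, algo="sha1"))
    print("sha256:", git_object_id(content, algo="sha256"))
```

The same content gets a 40-digit name under SHA-1 and a 64-digit name under SHA-256; nothing else in the function changes.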

As Linus Torvalds delicately puts it (emphasis mine):

Honestly, the number of particles in the observable universe is on the order of 2**256. It's a really really big number.

Don't make the code base more complex than it needs to be.
Make a informed technical decision, and say "256 bits is a lot".

The difference between engineering and theory is that engineering makes trade-offs.
Good software is well engineered, not theorized.

Also, I would suggest that git default to "abbrev-commit=40", so that nobody actually sees the new bits by default.
So the perl scripts etc that use "[0-9a-f]{40}" as a hash pattern would just silently continue to work.

Because backwards compatibility is important (*)

(*) And 2**160 is still a big big number, and hasn't really been a practical problem, and SHA1DC is likely a good hash for the next decade or longer.
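The compatibility concern about scripts that use "[0-9a-f]{40}" can be made concrete. The sketch below is an illustration only: the 64-digit value is just a SHA-256 digest of an arbitrary string standing in for a SHA-256 object name, and the helper names are hypothetical.

```python
import hashlib
import re

# The legacy pattern quoted above: exactly 40 lowercase hex digits.
LEGACY_HASH = re.compile(r"[0-9a-f]{40}")


def matches_whole(token: str) -> bool:
    """True if the whole token is a 40-hex-digit object name."""
    return LEGACY_HASH.fullmatch(token) is not None


def first_match(line: str):
    """What an unanchored search extracts from a line, or None."""
    m = LEGACY_HASH.search(line)
    return m.group(0) if m else None


# Stand-in for a SHA-256 object name (64 hex digits); not a real Git object.
full_name = hashlib.sha256(b"example").hexdigest()
shown_name = full_name[:40]  # what a 40-character abbreviation would display

if __name__ == "__main__":
    print(matches_whole(shown_name))  # abbreviated to 40 digits
    print(matches_whole(full_name))   # full 64 digits, anchored match
    print(first_match("commit " + full_name) == shown_name)  # unanchored search
```

With names abbreviated to 40 digits, an old script keeps matching. With full 64-digit names, an anchored match fails outright, and an unanchored search silently returns only the first 40 digits, which is arguably the worse failure mode.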

(SHA1DC, a SHA-1 implementation with collision detection, was discussed in early 2017, after the shattered.io collision attack: see commit 28dc98e, Git v2.13.0-rc0, March 2017, from Jeff King, and "Hash collision in git".)
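To put the 2**160 and 2**256 figures quoted above in context: for collisions, the relevant generic bound is the birthday bound, roughly 2**(n/2) hash evaluations for an n-bit digest, not 2**n. The short sketch below only does that arithmetic; the function names are hypothetical, and it says nothing about attacks such as the one behind shattered.io, which exploit structure in SHA-1 to do better than the generic bound.

```python
import math


def generic_collision_bits(digest_bits: int) -> int:
    """Birthday bound: a generic collision search needs about 2**(n/2) hashes."""
    return digest_bits // 2


def hashes_for_half_chance(digest_bits: int) -> float:
    """Approximate number of random hashes for a 50% chance of a collision."""
    return math.sqrt(2 * math.log(2)) * 2.0 ** (digest_bits / 2)


if __name__ == "__main__":
    for name, bits in (("SHA-1", 160), ("SHA-256", 256)):
        print(
            f"{name}: {bits}-bit digest, about 2**{generic_collision_bits(bits)} "
            f"hashes for a generic collision ({hashes_for_half_chance(bits):.3e})"
        )
```

So moving from 160 to 256 bits raises the generic collision cost from about 2**80 to about 2**128 evaluations, which is the sense in which "256 bits is a lot".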


See more in Documentation/technical/hash-function-transition.txt

The transition to SHA-256 can be done one local repository at a time.

a. Requiring no action by any other party.
b. A SHA-256 repository can communicate with SHA-1 Git servers (push/fetch).
c. Users can use SHA-1 and SHA-256 identifiers for objects interchangeably (see "Object names on the command line", below).
d. New signed objects make use of a stronger hash function than SHA-1 for their security guarantees.



Source: https://stackoverflow.com/questions/60087759/git-is-moving-to-new-hashing-algorithm-sha-256-but-why-git-community-settled-on
