Hash Code and Checksum - what's the difference?

My understanding is that a hash code and checksum are similar things - a numeric value, computed for a block of data, that is relatively unique.

i.e. The pr

13 answers

轻奢々

2020-12-04 07:49

Although hashing and checksums are similar in that they both create a value based on the contents of a file, hashing is not the same as creating a checksum. A checksum is intended to verify (check) the integrity of data and identify data-transmission errors, while a hash is designed to create a unique digital fingerprint of the data.

Source: CompTIA ® Security+ Guide to Network Security Fundamentals - Fifth Edition - Mark Ciampa -Page 191

别那么骄傲

2020-12-04 07:49

I tend to use the word checksum when referring to the code (numeric or otherwise) created for a file or piece of data that can be used to check that the file or data has not been corrupted. The most common usage I come across is to check that files sent across the network have not been altered (deliberately or otherwise).
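
As a rough sketch of that usage, the snippet below recomputes a CRC-32 over a received file and compares it with the value the sender published. The helper name `file_crc32`, the chunk size, and the temporary file standing in for a download are all illustrative assumptions. CRC-32 is only expected to catch accidental corruption here, not a deliberate change.

```python
import os
import tempfile
import zlib


def file_crc32(path: str, chunk_size: int = 65536) -> int:
    """Return the CRC-32 of a file, reading it in chunks (hypothetical helper)."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # zlib.crc32 accepts a running value, so the file never has to fit in memory.
            crc = zlib.crc32(chunk, crc)
    return crc


# Stand-in for a file that arrived over the network.
payload = b"example payload" * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    received_path = tmp.name

expected = zlib.crc32(payload)  # the value the sender would publish
intact = file_crc32(received_path) == expected

# Simulate corruption of a single byte and check again.
with open(received_path, "r+b") as f:
    f.seek(10)
    f.write(b"\x00")
corruption_detected = file_crc32(received_path) != expected

os.remove(received_path)
print(intact, corruption_detected)
```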

太阳男子

2020-12-04 07:52
There are indeed some differences:
- Checksums just need to be different when the input is different (as often as possible), but it's almost as important that they're fast to compute.
- Hash codes (for use in hashtables) have the same requirements, and additionally they should be evenly distributed across the code space, especially for inputs that are similar.
- Cryptographic hashes have the much more stringent requirement that, given a hash, it must be computationally infeasible to construct an input that produces it. Computation time comes second, and depending on the application it may even be desirable for the hash to be very slow to compute (in order to combat brute-force attacks).
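
To make the three categories concrete, here is a minimal Python sketch using standard-library functions. The choice of CRC-32, the built-in `hash()`, SHA-256, and PBKDF2 is just one plausible example of each kind, and the input, password, salt, and iteration count are placeholders.

```python
import hashlib
import zlib

data = b"The quick brown fox"

# Checksum: cheap to compute, meant to catch accidental changes.
checksum = zlib.crc32(data)

# Hash code for hash tables: Python's built-in hash(), which is
# randomized per process for bytes and str, so it must not be stored.
table_hash = hash(data)

# Cryptographic hash: fixed-size digest that is hard to invert.
digest = hashlib.sha256(data).hexdigest()

# Deliberately slow hash for passwords: many iterations make
# brute-force guessing expensive (iteration count is illustrative).
slow = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 100_000)

print(checksum, table_hash, digest, slow.hex())
```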

予麋鹿

2020-12-04 07:52

A checksum protects against accidental changes.

A cryptographic hash protects against a very motivated attacker.

When you send bits on the wire, it may accidentally happen that some bits are either flipped, or deleted, or inserted. To allow the receiver to detect (or sometimes correct) accidents like this, the sender uses a checksum.

But if you assume there is someone actively and intelligently modifying the message on the wire and you want to protect against this sort of attacker, then use a cryptographic hash (I am ignoring cryptographically signing the hash, or using a secondary channel or such, since the question does not seem to allude to this).
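
The contrast can be sketched with a toy additive checksum (an assumption for illustration, not a real protocol). It catches an accidental single-bit flip, but an attacker who merely swaps two bytes leaves the sum unchanged, whereas the SHA-256 digest changes. As in the answer, an attacker who can also replace the digest itself is out of scope.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """Toy checksum: sum of all bytes modulo 256."""
    return sum(data) % 256


original = b"PAY 19 TO BOB"
bit_flip = b"PAY 18 TO BOB"  # accidental: '9' (0x39) became '8' (0x38), one bit flipped
tampered = b"PAY 91 TO BOB"  # deliberate: digits swapped, byte sum unchanged

accident_caught = additive_checksum(bit_flip) != additive_checksum(original)
tamper_missed = additive_checksum(tampered) == additive_checksum(original)
tamper_caught_by_sha = (
    hashlib.sha256(tampered).digest() != hashlib.sha256(original).digest()
)

print(accident_caught, tamper_missed, tamper_caught_by_sha)
```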

谎友^

2020-12-04 07:54
There is a different purpose behind each of them:
- Hash code - designed to be random across its domain (to minimize collisions in hash tables and such). Cryptographic hash codes are also designed to be computationally infeasible to reverse.
- Checksum - designed to detect the most common errors in the data and often to be fast to compute (for effective checksumming of fast data streams).
In practice, the same functions are often good for both purposes. In particular, a cryptographically strong hash code is a good checksum (it is almost impossible for a random error to go undetected by a strong hash function), if you can afford the computational cost.
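
A small experiment illustrates the "random across its domain" point. It spreads 1000 similar keys over 64 buckets, once with a byte-sum checksum and once with a value derived from SHA-256. The key format, bucket count, and use of SHA-256 as a stand-in for a well-mixing hash are assumptions; real hash tables normally use much cheaper non-cryptographic functions.

```python
import hashlib

NUM_BUCKETS = 64
# Similar keys: "item0000" ... "item0999".
keys = [f"item{i:04d}".encode() for i in range(1000)]


def sum_bucket(key: bytes) -> int:
    """Bucket index from a plain byte-sum checksum."""
    return sum(key) % NUM_BUCKETS


def sha_bucket(key: bytes) -> int:
    """Bucket index from the first 8 bytes of a SHA-256 digest."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


sum_used = {sum_bucket(k) for k in keys}
sha_used = {sha_bucket(k) for k in keys}

# The byte sums differ only by the digit sum (0..27), so the checksum
# can reach at most 28 of the 64 buckets for these keys.
print(len(sum_used), len(sha_used))
```

The byte sum is still usable as an error check, but as a hash code it crowds similar keys into a few buckets.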

不知归路

2020-12-04 07:54
The difference between hash-code and checksum functions is that they are designed for different purposes.
- A checksum is used to find out if something in the input has changed.
- A hash-code is used to find out if something in the input has changed and to have as much "distance" between individual hash-code values as possible.
  
  Also, there might be further requirements for a hash function that work against this rule, such as the ability to form trees/clusters/buckets of hash-code values early.
  
  And if you add some shared initial randomization, you arrive at the concept behind modern encryption and key exchange.
About Probability:

For example, let's assume that the input data actually always changes (100% of the time). And let's assume you have a "perfect" hash/checksum function that generates a 1-bit hash/checksum value. You will therefore get different hash/checksum values 50% of the time for random input data.
- If exactly 1 bit in your random input data has changed, you will be able to detect that 100% of the time, no matter how large the input data is.
- If 2 bits in your random input data have changed, your probability of detecting "a change" is divided by 2, because both changes could neutralize each other, and no hash/checksum function would detect that 2 bits are actually different in the input data.
  
  ...
This means that if the number of bits in your input data is many times larger than the number of bits in your hash/checksum value, the probability of actually getting different hash/checksum values for different input values is reduced and is not constant.
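
One way to check the 1-bit case is to assume the function is a simple parity bit (the answer does not name a specific function, so this is an assumption). Under that assumption every single-bit flip is detected. For parity specifically, any two flips cancel each other, so no double flip is detected.

```python
def parity(data: bytes) -> int:
    """1-bit checksum: XOR of every bit in the data."""
    p = 0
    for byte in data:
        p ^= bin(byte).count("1") & 1
    return p


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


original = b"hello world"
n_bits = len(original) * 8
reference = parity(original)

# Every possible single-bit flip changes the parity.
all_single_flips_detected = all(
    parity(flip_bit(original, i)) != reference for i in range(n_bits)
)

# No pair of flips changes the parity: the two changes cancel.
any_double_flip_detected = any(
    parity(flip_bit(flip_bit(original, i), j)) != reference
    for i in range(n_bits)
    for j in range(i + 1, n_bits)
)

print(all_single_flips_detected, any_double_flip_detected)
```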
