Robust and fast checksum algorithm?

失恋的感觉 2020-12-23 20:30

Which checksum algorithm can you recommend in the following use case?

I want to generate checksums of small JPEG files (~8 kB each) to check if the content changed.

10 answers
  •  轻奢々 (original poster)
    2020-12-23 21:03

    CRC32 is probably good enough, although there is a small chance of a collision (roughly 1 in 2^32 for an arbitrary change, since the checksum is only 32 bits), in which case a modified file would look unchanged because both versions produce the same checksum. To avoid this possibility I'd suggest MD5 instead: it will easily be fast enough, and its 128-bit digest makes the chance of an accidental collision almost infinitesimal.
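As a rough illustration of the two options, here is a minimal Python sketch that computes both a CRC32 and an MD5 for one file using only the standard library (`zlib.crc32` and `hashlib.md5`). The function names `file_checksums` and `has_changed` are hypothetical, chosen for this example, and the sketch assumes each file is small enough (around 8 kB, as in the question) to read into memory in one go.

```python
import hashlib
import zlib
from pathlib import Path


def file_checksums(path):
    """Return (crc32, md5_hex) for the file at `path`, reading it once."""
    # Assumption: files are small (~8 kB), so reading them whole is fine.
    data = Path(path).read_bytes()
    # Mask so the CRC is always reported as an unsigned 32-bit value.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    md5_hex = hashlib.md5(data).hexdigest()
    return crc, md5_hex


def has_changed(path, previous_md5_hex):
    """Return True if the file's current MD5 differs from a stored one."""
    _, current_md5_hex = file_checksums(path)
    return current_md5_hex != previous_md5_hex
```

In use, you would store the MD5 hex string alongside each file name the first time you see it, then call `has_changed(path, stored_md5)` later. Both checksums are computed from a single read of the file, so the choice between them adds little cost next to the disk access itself.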

    As others have said, with lots of small files your real performance bottleneck is going to be I/O, so that is the issue to deal with. If you post a few more details, somebody will probably suggest a way of sorting that out as well.
