I am trying to compute md5 hash of a file with the function hashlib.md5() from hashlib module.
So that I writed this piece of code:
To be able to handle arbitrarily large files you need to read them in blocks. The size of such blocks should preferably be a power of 2, and in the case of md5 the minimum possible block consists of 64 bytes (512 bits) as 512-bit blocks are the units on which the algorithm operates.
But if we go beyond that and try to establish an exact criterion whether, say 2048-byte block is better than 4096-byte block... we will likely fail. This needs to be very carefully tested and measured, and almost always the value is being chosen at will, judging from experience.