How to calculate the entropy of a file?

野趣味 2020-11-28 20:16

How to calculate the entropy of a file? (Or let's just say a bunch of bytes)
I have an idea, but I'm not sure that it's mathematically correct.

My id

11 answers
  •  生来不讨喜
    2020-11-28 20:34

    • At the end: Calculate the "average" value for the array.
    • Initialize a counter with zero, and for each of the array's entries: add the difference between the entry and the "average" to the counter.

    With some modifications you can get Shannon's entropy:

    rename "average" to "entropy"

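    (float) entropy = 0
    for i in 0..255 do                            // Counts[256] holds the number of occurrences of each byte value
      (float) p = (float) Counts[i] / filesize    // relative frequency of byte value i
      if (p > 0) entropy = entropy - p * lg(p)    // lg is the logarithm with base 2
    end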
    

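    Edit: As Wesley mentioned, we must divide the entropy by 8 to bring it into the range 0..1 (or, alternatively, we can use a logarithm with base 256). With base 2 the result is in bits per byte, and its maximum is 8, reached when all 256 byte values are equally frequent.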
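
For anyone who wants to try this, here is a minimal Python sketch of the pseudocode above. It is one possible rendering, not the only way to do it. The function names `shannon_entropy` and `normalized_entropy` are my own, and returning 0.0 for empty input is a convention assumed here, not something the formula dictates.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0  # assumed convention: empty input has zero entropy
    counts = Counter(data)  # number of occurrences of each byte value seen
    total = len(data)       # plays the role of "filesize" in the pseudocode
    entropy = 0.0
    for count in counts.values():
        p = count / total            # relative frequency, always > 0 here
        entropy -= p * math.log2(p)  # base-2 logarithm gives bits
    return entropy


def normalized_entropy(data: bytes) -> float:
    """Scale the entropy into the range 0..1 by dividing by 8 bits."""
    return shannon_entropy(data) / 8.0
```

To apply it to a file, read the file in binary mode and pass the bytes in, for example `shannon_entropy(open(path, "rb").read())`, where `path` is whatever file you want to measure. For very large files you would probably want to update the counts chunk by chunk instead of reading everything at once.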
