Python, get base64-encoded MD5 hash of an image object

难免孤独 2021-01-04 11:38

I need to get a base64-encoded MD5 hash of an object, where the object is an image stored as a file, fname.

I've tried this:

def get_md5(fname):
            


        
2 Answers
  • 2021-01-04 11:58

    First, base64 encoding makes strings longer. (Example using IPython with Python 3):

    In [1]: s = '123456789012345678901234'
    
    In [2]: len(s)
    Out[2]: 24
    
    In [3]: import base64
    
    In [4]: e = base64.b64encode(s.encode('utf8'))
    
    In [5]: len(e)
    Out[5]: 32
    
    In [6]: e
    Out[6]: b'MTIzNDU2Nzg5MDEyMzQ1Njc4OTAxMjM0'
    

    With base64 encoding you get 8 bits of output for every 6 bits of input, so the encoded string is 4/3 the length of the original (before padding).

    In [7]: 32/24
    Out[7]: 1.3333333333333333
    
    In [8]: 8/6
    Out[8]: 1.3333333333333333
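
    The ratio above can be turned into a length rule: padded base64 emits 4 characters for every group of up to 3 input bytes. The following is a minimal sketch; the helper name `b64_length` is made up for illustration and is not part of the `base64` module.

```python
import base64
import math

def b64_length(n_bytes):
    """Length of the padded base64 encoding of n_bytes bytes (hypothetical helper)."""
    return 4 * math.ceil(n_bytes / 3)

# Compare the formula with what base64.b64encode actually produces.
for n in (16, 24, 32):
    encoded = base64.b64encode(bytes(n))  # bytes(n) is n zero bytes
    print(n, len(encoded), b64_length(n))
```

    For 16 input bytes (the size of an MD5 digest) this gives 24 characters, and for 32 bytes (the size of an MD5 hex digest) it gives 44.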
    

    The base64 alphabet uses 64 (or 2**6) different symbols. These generally include the lowercase and uppercase letters and the digits 0-9, which leaves two more required symbols plus a padding character (usually =). Often + and / are used for those two symbols, but there are variations, especially since / is not allowed in UNIX or MS-Windows filenames.
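
    As a small illustration of one such variation, Python's `base64` module offers both the standard alphabet and a URL- and filename-safe one. The three input bytes here are chosen only because they happen to encode to the two "extra" symbols.

```python
import base64

# These three bytes split into the 6-bit values 62, 63, 63, 62,
# which are exactly the two non-alphanumeric symbols of the alphabet.
raw = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(raw)          # uses '+' and '/'
url_safe = base64.urlsafe_b64encode(raw)  # uses '-' and '_' instead

print(standard)  # b'+//+'
print(url_safe)  # b'-__-'
```

    Which variant a service expects depends on that service, so check its documentation before picking one.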

    Second, using a hexadecimal representation doubles the length of a byte string, because each byte is written as two hex digits ranging from 00 to FF. Example (again using IPython and Python 3):

    In [1]: import hashlib
    
    In [2]: s = b'this is a simple test'
    
    In [3]: len(hashlib.md5(s).digest())
    Out[3]: 16
    
    In [4]: len(hashlib.md5(s).hexdigest())
    Out[4]: 32
    

    If you are going to use base64 encoding anyway, it makes no sense to use hexdigest().
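
    To make the difference concrete, here is a short sketch that base64-encodes both forms of the same hash. The variable names are arbitrary.

```python
import base64
import hashlib

h = hashlib.md5(b'this is a simple test')

# Encoding the 16 raw digest bytes gives the compact form.
from_digest = base64.b64encode(h.digest())

# Encoding the 32-character hex string encodes text, not the hash bytes,
# and produces a longer, different value.
from_hex = base64.b64encode(h.hexdigest().encode('ascii'))

print(len(from_digest), len(from_hex))  # 24 44
```

    Only the first form is the base64 encoding of the MD5 hash itself.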

  • 2021-01-04 12:12

    I was able to make it work by using digest() instead of hexdigest(). Then the last line becomes:

    return hash.digest().encode('base64').strip()
    

    The result was then 24 characters long, and it was accepted by Google Cloud Storage transfer, which required a base64-encoded MD5 hash.
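
    Note that `.encode('base64')` on a byte string only works in Python 2. A rough Python 3 equivalent might look like the sketch below. The function name `get_md5_base64` and the `chunk_size` parameter are assumptions made for illustration, not the code from the question, whose body is not shown above.

```python
import base64
import hashlib

def get_md5_base64(fname, chunk_size=65536):
    """Return the base64-encoded MD5 digest of a file as a str (hypothetical helper)."""
    md5 = hashlib.md5()
    with open(fname, 'rb') as f:
        # Read in chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    # b64encode returns bytes without a trailing newline, so no strip() is needed.
    return base64.b64encode(md5.digest()).decode('ascii')
```

    Calling `get_md5_base64(fname)` should return a 24-character string for any file, since an MD5 digest is always 16 bytes.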
