How to check MD5 when downloading object from GCS

Submitted by 一笑奈何 on 2019-12-24 23:59:18

Question


I want to do an MD5 check when I download a file from GCS. However, it seems that I don't get the correct MD5 on my side. One example that I got:

local 1B2M2Y8AsgTpgAmY7PhCfg==, cloud JWSLJAR+M3krp1RiOAJzOw==

But I'm pretty sure the file isn't corrupted...

The following code is C# 7.0, using System.Security.Cryptography:

            using (var memStream = new MemoryStream())
            {
                _StorageClient.DownloadObject(bucketName, gcsObj.Name, memStream);
                try
                {
                    using (var md5 = MD5.Create())
                    {
                        var hash = md5.ComputeHash(memStream);
                        localMd5 = Convert.ToBase64String(hash);
                    }
                    Console.WriteLine($"local {localMd5}, cloud {gcsObj.Md5Hash}");
                }
                catch
                {
                    Console.WriteLine("Error getting md5 checksum");
                }
            }

Another question: the C# library that I tried for computing the CRC32C of a file only returns a uint, but the GCS object's Crc32c value is a string. How do I compare them?


Answer 1:


From your sample, I'm assuming your sample hash comes from the x-goog-hash header?

If that is the case, can you check the value of x-goog-stored-content-encoding for the same file? If it is gzip, you uploaded a compressed copy to GCS and it is stored in gzip format. In that case, x-goog-hash is the MD5 of the gzipped copy stored on GCS.

To verify it you'd have to download the compressed version (not sure if that's possible with the C# library you're using), and check the MD5 hash of that.
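
Whichever copy ends up being hashed, the stream has to be read from its start. The local value in the question, 1B2M2Y8AsgTpgAmY7PhCfg==, is the base64 MD5 of zero bytes, which suggests (the question does not confirm it) that the MemoryStream was hashed from its end position right after the download wrote into it. Below is a minimal sketch that rewinds before hashing. The class and method names are made up for illustration and are not part of any GCS library.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class Md5Check
{
    // Hypothetical helper: returns the base64-encoded MD5 of the whole stream.
    public static string ComputeBase64Md5(Stream stream)
    {
        // After a download has written into the stream, Position sits at the
        // end. Hashing from there would cover zero bytes, so rewind first.
        if (stream.CanSeek)
        {
            stream.Position = 0;
        }

        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(stream);
            return Convert.ToBase64String(hash);
        }
    }
}
```

In the question's code this would be called as localMd5 = Md5Check.ComputeBase64Md5(memStream) once DownloadObject has returned.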


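For the CRC32C, GCS base64-encodes the four checksum bytes in big-endian order, while BitConverter.GetBytes returns them in the machine's byte order. On a little-endian machine (the usual case), reverse them first (Reverse needs using System.Linq):

Convert.ToBase64String(BitConverter.GetBytes(crc32c).Reverse().ToArray())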

But the same thing applies: if it is gzipped, this is the CRC32C of the gzipped version.
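
A minimal sketch of the comparison in both directions, assuming the uint comes from whatever CRC32C library is in use. The class and method names are hypothetical. GCS documents its crc32c value as base64 of the big-endian bytes, so the helper swaps the byte order only when BitConverter.IsLittleEndian is true.

```csharp
using System;

static class Crc32cEncoding
{
    // Hypothetical helper: encodes a CRC32C value the way GCS reports it
    // (base64 over the four bytes in big-endian order).
    public static string ToGcsBase64(uint crc32c)
    {
        byte[] bytes = BitConverter.GetBytes(crc32c);
        if (BitConverter.IsLittleEndian)
        {
            Array.Reverse(bytes); // machine order -> big-endian
        }
        return Convert.ToBase64String(bytes);
    }

    // Hypothetical helper: decodes the GCS string back into a uint.
    public static uint FromGcsBase64(string encoded)
    {
        byte[] bytes = Convert.FromBase64String(encoded);
        if (bytes.Length != 4)
        {
            throw new FormatException("Expected exactly 4 checksum bytes.");
        }
        if (BitConverter.IsLittleEndian)
        {
            Array.Reverse(bytes); // big-endian -> machine order
        }
        return BitConverter.ToUInt32(bytes, 0);
    }
}
```

Either direction works for the comparison: encode the local uint and compare strings, or decode the GCS string and compare integers.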


To check object metadata you can use:

gsutil stat gs://some-bucket/some-object

Sample output:

Creation time:          Sat, 20 Jan 2018 11:09:11 GMT
Update time:            Sat, 20 Jan 2018 11:09:11 GMT
Storage class:          MULTI_REGIONAL
Content-Encoding:       gzip
Content-Length:         5804
Content-Type:           application/msword
Hash (crc32c):          kxvpkw==
Hash (md5):             bfH75gryTXKgNosp1Smxvw==
ETag:                   CO7sotCz5tgCEAE=
Generation:             1516446551684718
Metageneration:         1

This object is stored in gzip format. Neither the MD5 nor the CRC32C will match those of the decompressed copy.
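
To illustrate why the hashes differ, here is a small sketch that hashes the same bytes before and after gzip compression. It only demonstrates the mismatch. It is not a way to reproduce the hash GCS stores, because different gzip implementations and settings can produce different compressed bytes. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

static class GzipHashDemo
{
    // Base64 MD5 of a byte array.
    public static string Base64Md5(byte[] data)
    {
        using (var md5 = MD5.Create())
        {
            return Convert.ToBase64String(md5.ComputeHash(data));
        }
    }

    // Compresses the data with gzip and returns the compressed bytes.
    public static byte[] Gzip(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the gzip trailer

            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }
}
```

Comparing GzipHashDemo.Base64Md5(data) with GzipHashDemo.Base64Md5(GzipHashDemo.Gzip(data)) gives two different strings for the same content.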




Answer 2:


You should not use the Convert.ToBase64String method.

Try this instead:

static string Md5HashToString(byte[] hash)
{
    // Create a new StringBuilder to collect the bytes
    // and create a string.
    StringBuilder sBuilder = new StringBuilder();

    // Loop through each byte of the hashed data 
    // and format each one as a hexadecimal string.
    for (int i = 0; i < hash.Length; i++)
    {
        sBuilder.Append(hash[i].ToString("x2"));
    }

    // Return the hexadecimal string.
    return sBuilder.ToString();
}
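
Note that the cloud value printed in the question (gcsObj.Md5Hash) is base64, so a hex string produced this way cannot be compared with it directly. A minimal sketch, with a hypothetical helper name, that converts the base64 value to lowercase hex so both sides use the same encoding:

```csharp
using System;

static class HashFormat
{
    // Hypothetical helper: turns a base64-encoded hash into lowercase hex,
    // the same format that Md5HashToString produces.
    public static string Base64ToHex(string base64Hash)
    {
        byte[] bytes = Convert.FromBase64String(base64Hash);
        // BitConverter.ToString yields "D4-1D-8C-...", so strip the dashes.
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
}
```

With this, Md5HashToString(hash) can be compared against HashFormat.Base64ToHex(gcsObj.Md5Hash).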


Source: https://stackoverflow.com/questions/48268822/how-to-check-md5-when-downloading-object-from-gcs
