AWS service to verify data integrity of file in S3 via checksum?

Submitted by 无人久伴 on 2020-01-30 08:05:07

Question


One method of ensuring a file in S3 is what it claims to be is to download it, get its checksum, and match the result against the checksum you were expecting.

Does AWS provide any service that allows this to happen without the user needing to first download the file? (i.e. ideally a simple request/url that provides the checksum of an S3 file, so that it can be verified before the file is downloaded)

What I've tried so far

I can think of a DIY solution along the lines of

  • Create an API endpoint that accepts a POST request with the S3 file url
  • Have the API run a lambda that generates the checksum of the file
  • Respond with the checksum value

This may work, but it is already somewhat complicated and raises further considerations: for example, generating a checksum for a large file may take a long time (e.g. > 60 seconds).
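As a rough illustration of the DIY approach, the checksum step could be sketched as below. This is only a sketch under assumptions: the `handler` name, the event shape (`bucket` and `key` fields), and the choice of SHA-256 are hypothetical, and `boto3` is assumed to be available in the Lambda runtime. The object is hashed in chunks so it never has to fit in memory, though the whole object is still read, so large files will still take time.

```python
import hashlib
import io


def stream_checksum(body, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file-like object chunk by chunk so it never sits fully in memory."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = body.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        digest.update(chunk)
    return digest.hexdigest()


def handler(event, context):
    """Hypothetical Lambda entry point; the event fields are assumptions."""
    import boto3  # imported lazily so stream_checksum works without boto3 installed

    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket=event["bucket"], Key=event["key"])
    # obj["Body"] is a streaming body that supports read(amount)
    return {"algorithm": "sha256", "checksum": stream_checksum(obj["Body"])}


if __name__ == "__main__":
    # Local usage with an in-memory stream instead of S3
    print(stream_checksum(io.BytesIO(b"example data")))
```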

I'm hoping AWS have some simple way of validating S3 files?


Answer 1:


S3 stores an ETag with each object, which is often an MD5 digest of the object contents.

However, there are some exceptions.

From Common Response Headers - Amazon Simple Storage Service:

ETag: The entity tag is a hash of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted as described below:

  • Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-S3 or plaintext, have ETags that are an MD5 digest of their object data.

  • Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-C or SSE-KMS, have ETags that are not an MD5 digest of their object data.

  • If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.
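Given the rules quoted above, a cautious check might compare the ETag against an expected MD5 only when the ETag has the shape of a plain MD5, and report "inconclusive" otherwise. The sketch below assumes `boto3` is available and that the ETag is read with a HEAD request (`head_object`), which returns headers without the object body. The function names are hypothetical. Note that for objects encrypted with SSE-C or SSE-KMS the ETag can look like an MD5 without being one, so a mismatch there does not by itself indicate corruption.

```python
import hashlib
import re

_PLAIN_MD5 = re.compile(r"^[0-9a-f]{32}$")


def normalize_etag(etag):
    """Strip the double quotes that surround ETag header values."""
    return etag.strip().strip('"').lower()


def etag_matches_md5(etag, expected_md5_hex):
    """Return True or False when the ETag has the shape of a plain MD5.

    Return None when it does not (for example a multipart 'hash-N' ETag),
    so the caller can tell that the check was inconclusive.
    """
    value = normalize_etag(etag)
    if not _PLAIN_MD5.match(value):
        return None
    return value == expected_md5_hex.lower()


def fetch_etag(bucket, key):
    """Read the ETag via a HEAD request; the object body is not downloaded."""
    import boto3  # imported lazily so the helpers above work without boto3

    return boto3.client("s3").head_object(Bucket=bucket, Key=key)["ETag"]


if __name__ == "__main__":
    expected = hashlib.md5(b"example data").hexdigest()
    print(etag_matches_md5('"' + expected + '"', expected))
```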

Also, calculating the ETag for a multipart upload can be complex. See: s3cmd - What is the algorithm to compute the Amazon-S3 Etag for a file larger than 5GB? - Stack Overflow
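The scheme commonly reported for multipart ETags (not something the AWS documentation quoted here specifies) is: take the MD5 of each part, concatenate the raw digests, take the MD5 of that, and append `-` plus the number of parts. A minimal sketch, assuming the part size used for the upload is known (the 8 MiB default below is an assumption about the uploading client) and the object is not encrypted with SSE-C or SSE-KMS:

```python
import hashlib
import io


def multipart_etag(stream, part_size=8 * 1024 * 1024):
    """Compute the commonly reported multipart-style ETag for a binary stream.

    part_size must match the part size used for the original upload;
    otherwise the result will differ from the ETag S3 reports.
    """
    part_digests = []
    while True:
        part = stream.read(part_size)
        if not part:  # end of stream
            break
        part_digests.append(hashlib.md5(part).digest())
    # MD5 over the concatenated raw part digests, then "-<number of parts>"
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


if __name__ == "__main__":
    # Tiny in-memory example with an artificially small part size
    print(multipart_etag(io.BytesIO(b"a" * 10), part_size=4))
```

For a real file, the stream would be something like `open(path, "rb")`, and the result would be compared with the unquoted ETag value.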



Source: https://stackoverflow.com/questions/59817303/aws-service-to-verify-data-integrity-of-file-in-s3-via-checksum
