S3: What exactly is a prefix, and which rate limits apply?

[愿得一人] 2020-11-30 22:48

I was wondering whether anyone knows what exactly an S3 prefix is and how it interacts with Amazon's published S3 rate limits:

Amazon S3 automatically sca

6 answers
  •  死守一世寂寞
    2020-11-30 23:11

    S3 prefixes used to be determined by the first 6-8 characters of the object key.

    This changed in mid-2018; see the announcement: https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/

    But that is only half the truth. Prefixes (in the old definition) still matter.

    S3 is not traditional “storage”: each directory/filename is a separate object in a key/value object store, and the data has to be partitioned (sharded) to scale to an enormous number of objects. So yes, the new sharding is kind of “automatic”, but not really if you have just created a new process that writes with heavy parallelism to different subdirectories. Until S3 learns the new access pattern and reshards/repartitions the data accordingly, you may run into S3 throttling.

    Learning new access patterns takes time. Repartitioning of the data takes time.
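
While S3 is still adapting to a new access pattern, throttled requests typically come back as HTTP 503 "Slow Down" responses, and the usual client-side answer is to retry with exponential backoff and jitter. The following is only a minimal sketch of that idea: `ThrottledError` and `call_with_backoff` are hypothetical names invented for this illustration, not part of any AWS SDK, and the delay values are arbitrary assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an S3 '503 Slow Down' response."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on ThrottledError with capped exponential backoff.

    The sleep function is injectable so the retry logic can be tested
    without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # "Full jitter": wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
```

In a real client you would raise (or map to) `ThrottledError` wherever your S3 library reports throttling. Many SDKs already ship their own retry logic, so check that before layering another one on top.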

    Things did improve in mid-2018 (roughly 10x throughput-wise for a new bucket with no statistics), but it is still not what it could be if the data is partitioned properly. To be fair, this may not apply to you if you don't have a ton of data, or if your access pattern is not hugely parallel (e.g. running a Hadoop/Spark cluster on many TBs of data in S3 with hundreds or more tasks accessing the same bucket in parallel).
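
To get a feel for when this starts to matter, here is a rough back-of-the-envelope sketch. The per-prefix figures are the "at least 3,500 write / 5,500 read requests per second" numbers from the 2018 announcement linked earlier in this answer. The function name and the example workload (400 tasks at 50 requests per second each) are made up for illustration, and the estimate assumes S3 has already partitioned the bucket and that load spreads evenly across prefixes, which is exactly what is not guaranteed at first.

```python
import math

# Per-prefix request rates from the mid-2018 AWS announcement.
WRITE_RPS_PER_PREFIX = 3500  # PUT/COPY/POST/DELETE
READ_RPS_PER_PREFIX = 5500   # GET/HEAD


def prefixes_needed(tasks, requests_per_task_per_sec, per_prefix_limit):
    """Estimate how many evenly loaded prefixes the aggregate request rate needs."""
    total_rps = tasks * requests_per_task_per_sec
    return max(1, math.ceil(total_rps / per_prefix_limit))


if __name__ == "__main__":
    # Hypothetical workload: 400 parallel tasks, 50 requests/s each = 20,000 requests/s.
    print(prefixes_needed(400, 50, READ_RPS_PER_PREFIX))   # 4
    print(prefixes_needed(400, 50, WRITE_RPS_PER_PREFIX))  # 6
```

A handful of tasks stays well within a single prefix, while a few hundred tasks hitting one prefix can exceed it several times over.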

    TL;DR:

    "Old prefixes" still matter. Write data to the root of your bucket; the first-level directory there determines the "prefix" (make it random, for example).

    "New prefixes" do work, but not initially. It takes time for S3 to adapt to the load.
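
One way to make the first-level directory "random" while still being able to find an object again is to derive it from a hash of the key instead of a truly random value. This is just a sketch of that idea, not an official AWS recipe: the `prefixed_key` helper, the 4-character prefix length, and the sample key names are all assumptions made for illustration.

```python
import hashlib


def prefixed_key(original_key: str, prefix_len: int = 4) -> str:
    """Return the key with a short, deterministic hash as its first path segment.

    The same original key always maps to the same prefix, so the object
    can be located later without storing the prefix anywhere.
    """
    digest = hashlib.sha256(original_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{original_key}"


if __name__ == "__main__":
    # Hypothetical keys; neighbouring names get unrelated first-level prefixes.
    for key in ("logs/2020-11-30/part-0000", "logs/2020-11-30/part-0001"):
        print(prefixed_key(key))
```

The trade-off is that listing objects in their natural order (for example, everything under one date) now requires knowing or enumerating the hash prefixes.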

    P.S. Another approach: you can reach out to your AWS TAM (if you have one) and ask them to pre-partition a new S3 bucket if you expect a ton of data to flood it soon.
