What is the partition key in AWS Kinesis all about?


夕颜 2021-02-03 18:26

I was reading about AWS Kinesis. In the following program, I write data into the stream named TestStream. I ran this piece of code 10 times, inserting

2 answers

不要未来只要你来 (original poster)

2021-02-03 19:24

The accepted answer explains what partition keys are and what they're used for in Kinesis (deciding which shard a record is sent to). Unfortunately, it does not explain why partition keys are needed in the first place.

In theory, AWS could generate a random partition key for each record, which would result in a near-perfect spread across shards.
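To make the "near-perfect spread" concrete, here is a minimal sketch of how a partition key could be mapped to a shard. Kinesis documents that it hashes the partition key with MD5 into a 128-bit integer and matches it against each shard's hash key range; everything else here is an assumption for illustration: the shard count of 4, the evenly split ranges, and the helper names `shard_for_key` and `spread_of_random_keys` are hypothetical.

```python
import hashlib
import uuid
from collections import Counter

NUM_SHARDS = 4          # hypothetical stream size, chosen for illustration
HASH_SPACE = 2 ** 128   # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Return a shard index for the key, assuming evenly split hash key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    # Scale the 128-bit hash into the range [0, num_shards).
    return hash_value * num_shards // HASH_SPACE


def spread_of_random_keys(num_records: int = 10_000) -> Counter:
    """Count how many records land on each shard when every key is random."""
    return Counter(shard_for_key(str(uuid.uuid4())) for _ in range(num_records))


if __name__ == "__main__":
    for shard, n in sorted(spread_of_random_keys().items()):
        print(f"shard {shard}: {n} records")
```

With random keys the counts per shard should come out roughly equal, while a fixed key always maps to the same shard.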

The real reason partition keys are used is ordering ("streaming"). Kinesis maintains ordering (via sequence numbers) within each shard.

In other words, if X and then Y are written to the same shard Z, X is guaranteed to be pulled from the stream before Y (when pulling records from all shards). If instead X is written to shard Z1 and Y is afterwards written to shard Z2, there is no ordering guarantee (when pulling records from all shards): Y may well be pulled before X.
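The ordering argument can be illustrated with a toy in-memory model. This is only a sketch, not how Kinesis is implemented: `ToyStream` and its methods are hypothetical, the shards are assumed to split the hash space evenly, and a single global counter stands in for real sequence numbers.

```python
import hashlib
from collections import defaultdict
from itertools import count


class ToyStream:
    """Toy model: records with the same partition key land in the same shard, in order."""

    def __init__(self, num_shards: int = 2):
        self.num_shards = num_shards
        self.shards = defaultdict(list)   # shard index -> list of records
        self._seq = count(1)              # simplified stand-in for sequence numbers

    def _shard_for_key(self, partition_key: str) -> int:
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") * self.num_shards // 2 ** 128

    def put_record(self, partition_key: str, data: str):
        shard_id = self._shard_for_key(partition_key)
        seq = next(self._seq)
        self.shards[shard_id].append((seq, partition_key, data))
        return shard_id, seq

    def read_shard(self, shard_id: int):
        # Within one shard, records come back in the order they were written.
        return list(self.shards[shard_id])


stream = ToyStream(num_shards=2)
for i in range(3):
    stream.put_record("alice|movie-42", f"chunk-{i}")
    stream.put_record("bob|movie-7", f"chunk-{i}")

alice_shard = stream._shard_for_key("alice|movie-42")
alice_chunks = [d for _, k, d in stream.read_shard(alice_shard) if k == "alice|movie-42"]
print(alice_chunks)
```

Whichever shard a key hashes to, all of its records sit in that one shard in write order. A consumer reading several shards is free to interleave them, which is why nothing is guaranteed across keys that land on different shards.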

This per-shard "streaming" capability is useful in many cases.

(E.g. a video service streaming a movie to a user, using the username and the movie name as the partition key.)

(E.g. processing a stream of common events and applying aggregation.)

In cases where neither ordering (streaming) nor grouping (e.g. aggregation) is required, generating a random partition key will suffice.
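As a rough sketch of the random-key approach, the helper below builds the arguments for a `PutRecord` call with a fresh UUID as the partition key. `StreamName`, `Data`, and `PartitionKey` are the parameter names of the Kinesis `PutRecord` API; the helper name `build_put_record_args` and the JSON payload are assumptions made for illustration, and the stream name is the `TestStream` from the question.

```python
import json
import uuid


def build_put_record_args(stream_name: str, payload: dict) -> dict:
    """Build PutRecord arguments with a random partition key (no ordering needed)."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(payload).encode("utf-8"),
        # A new UUID per record spreads records across shards.
        "PartitionKey": str(uuid.uuid4()),
    }


args = build_put_record_args("TestStream", {"event": "click"})
print(args["PartitionKey"])

# With boto3 installed and AWS credentials configured, the record could be sent as:
#   import boto3
#   boto3.client("kinesis").put_record(**args)
```

If ordering per user or per entity matters, the same helper would instead take a stable key (such as a user ID) in place of the UUID.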
