amazon-kinesis

How to decide the total number of partition keys in an AWS Kinesis stream?

偶尔善良 submitted on 2019-11-28 10:56:16
In a producer-consumer web application, what should be the thought process for creating a partition key for a Kinesis stream shard? Suppose I have a Kinesis stream with 16 shards; how many partition keys should I create? Is it really dependent on the number of shards?
Partition (or hash) key: the hash key space runs from 0 up to 340282366920938463463374607431768211455, roughly 34028 * 10^34 (I will omit the 10^34 factor for ease). If you have 30 shards, uniformly divided, each should cover about 1134 * 10^34 hash keys. The coverage should look like this: Shard-00: 0 - 1134, Shard-01: 1135 - 2268, Shard-02: 2269 - 3402, Shard-03: ...
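To make the hash-key arithmetic above concrete, here is a minimal Python/boto3 sketch (the stream name and partition keys are hypothetical, not from the question). It reproduces how Kinesis maps a partition key onto that 0 to 2^128-1 hash space by taking the MD5 of the key, and checks which shard's HashKeyRange covers it. The practical point: you normally want many more distinct partition keys than shards (e.g. a user or session id per record), rather than a fixed count of keys per shard.

```python
# Sketch only: shows how a partition key lands on a shard. Kinesis hashes the
# partition key with MD5 and routes the record to the shard whose 128-bit
# HashKeyRange contains the result. STREAM_NAME and the keys are placeholders.
import hashlib
import boto3

STREAM_NAME = "my-stream"  # hypothetical stream name

def hash_key_for(partition_key: str) -> int:
    """The 128-bit integer Kinesis derives from a partition key (MD5 digest)."""
    return int.from_bytes(hashlib.md5(partition_key.encode("utf-8")).digest(), "big")

def shard_for(partition_key: str, shards) -> str:
    """Return the ShardId whose HashKeyRange covers this partition key."""
    hk = hash_key_for(partition_key)
    for shard in shards:
        rng = shard["HashKeyRange"]
        if int(rng["StartingHashKey"]) <= hk <= int(rng["EndingHashKey"]):
            return shard["ShardId"]
    raise ValueError("no shard covers this hash key")

kinesis = boto3.client("kinesis")
shards = kinesis.describe_stream(StreamName=STREAM_NAME)["StreamDescription"]["Shards"]

# High-cardinality keys (user ids, session ids, UUIDs) spread records evenly;
# the number of distinct partition keys does not need to equal the shard count,
# it just needs to be much larger than it.
print(shard_for("user-12345", shards))
kinesis.put_record(StreamName=STREAM_NAME, Data=b'{"event": "click"}', PartitionKey="user-12345")
```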

Why should I use Amazon Kinesis and not SNS-SQS?

折月煮酒 submitted on 2019-11-28 02:51:59
I have a use case where there will be a stream of data coming in and I cannot consume it at the same pace, so I need a buffer. This can be solved using an SNS-SQS queue. I came to know that Kinesis serves the same purpose, so what is the difference? Why should I prefer (or not prefer) Kinesis?
E.J. Brennan: On the surface they are vaguely similar, but your use case will determine which tool is appropriate. IMO, if you can get by with SQS then you should: if it will do what you want, it will be simpler and cheaper. But here is a better explanation from the AWS FAQ which gives examples of ...
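If buffering is really all that is needed, the plain SQS route from the answer can be as small as the sketch below (boto3; the queue URL and message body are placeholders, not from the question). Kinesis starts to pay off when you need ordered, replayable records or multiple independent consumers reading the same stream.

```python
# Minimal "SQS as a buffer" sketch: the producer enqueues as fast as data
# arrives, the consumer drains at its own pace. Queue URL is a placeholder.
import boto3

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-buffer-queue"  # hypothetical
sqs = boto3.client("sqs")

def process(body: str) -> None:
    print("processing:", body)  # stand-in for the real consumer logic

# Producer side: push each incoming event onto the queue.
sqs.send_message(QueueUrl=QUEUE_URL, MessageBody='{"event": "click"}')

# Consumer side: long-poll, handle, then delete each message; anything not
# yet processed simply waits in the queue.
while True:
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        process(msg["Body"])
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```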

Application report for application_ (state: ACCEPTED) never ends for Spark Submit (with Spark 1.2.0 on YARN)

岁酱吖の submitted on 2019-11-27 18:42:05
I am running the Kinesis plus Spark application from https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html . I am running the command below on an EC2 instance:
./spark/bin/spark-submit --class org.apache.spark.examples.streaming.myclassname --master yarn-cluster --num-executors 2 --driver-memory 1g --executor-memory 1g --executor-cores 1 /home/hadoop/test.jar
I have installed Spark on EMR. EMR details: Master instance group - 1 Running MASTER m1.medium; Core instance group - 2 Running CORE m1.medium. I am getting the INFO line below and it never ends:
15/06/14 11:33:23 INFO yarn.Client: Requesting a ...
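An application that sits in ACCEPTED forever usually means YARN cannot allocate the ApplicationMaster plus the requested executors; two m1.medium core nodes leave very little memory for a 1g driver and two 1g executors. One way to check, assuming the default ResourceManager web port 8088 (the hostname below is a placeholder), is to query the RM REST API for free capacity before submitting:

```python
# Sketch: ask the YARN ResourceManager how much memory / how many vcores are
# actually free. The hostname is a placeholder; 8088 is the default web UI port.
import json
import urllib.request

RM_METRICS_URL = "http://RESOURCE-MANAGER-HOST:8088/ws/v1/cluster/metrics"  # replace host

with urllib.request.urlopen(RM_METRICS_URL) as resp:
    metrics = json.load(resp)["clusterMetrics"]

print("available MB:     ", metrics["availableMB"])
print("available vcores: ", metrics["availableVirtualCores"])
print("apps pending:     ", metrics["appsPending"])

# If availableMB is below driver-memory + num-executors * (executor-memory +
# overhead), the app stays ACCEPTED until resources free up; shrink the request
# (e.g. --executor-memory 512m, --num-executors 1) or use larger instances.
```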
