amazon-kinesis

Spark not able to fetch events from Amazon Kinesis

喜你入骨 submitted on 2019-12-08 02:47:42
Question: I have recently been trying to get Spark to read events from Kinesis, but I am having problems receiving them. While Spark is able to connect to Kinesis and fetch metadata (e.g. the number of shards in the stream), it is not able to get events from it: it always fetches zero elements back, with no errors, just empty results. I have used these [1 & 2] guides to get it working, but have not had much luck yet. I have also tried…
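One frequent cause of empty batches is starting the receiver at InitialPositionInStream.LATEST, which only sees records written after the receiver comes up. Below is a minimal sketch of a Java consumer using the spark-streaming-kinesis-asl connector, reading from the trim horizon instead; the application name, stream name, region, and endpoint are placeholder assumptions:

import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream;
import java.nio.charset.StandardCharsets;
import org.apache.spark.SparkConf;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kinesis.KinesisUtils;

public class KinesisSmokeTest {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kinesis-smoke-test").setMaster("local[*]");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(2000));

        // TRIM_HORIZON replays the records already in the stream; LATEST only
        // delivers records written after startup, which can look like "zero events".
        JavaReceiverInputDStream<byte[]> stream = KinesisUtils.createStream(
                jssc,
                "kinesis-smoke-test",                      // KCL application name (placeholder)
                "my-stream",                               // stream name (placeholder)
                "https://kinesis.us-east-1.amazonaws.com", // endpoint (placeholder)
                "us-east-1",                               // region (placeholder)
                InitialPositionInStream.TRIM_HORIZON,
                new Duration(2000),                        // checkpoint interval
                StorageLevel.MEMORY_AND_DISK_2());

        stream.map(bytes -> new String(bytes, StandardCharsets.UTF_8)).print();
        jssc.start();
        jssc.awaitTermination();
    }
}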

Using Kinesis Analytics to construct real time sessions

和自甴很熟 submitted on 2019-12-08 00:39:18
Question: Is there an example somewhere, or can someone explain, how to use Kinesis Analytics to construct real-time sessions (i.e. sessionization)? It is mentioned that this is possible here: https://aws.amazon.com/blogs/aws/amazon-kinesis-analytics-process-streaming-data-in-real-time-with-sql/ in the discussion of custom windows, but no example is given. Typically this is done in SQL using the LAG function, so you can compute the time difference between consecutive rows. This post: https://blog…
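The question asks for Kinesis Analytics SQL, which is not shown here; as a point of comparison, the same sessionization effect can be sketched with Flink's session windows (a different technique from the LAG-based SQL approach), assuming event-time timestamps and watermarks are already assigned and a hypothetical Event POJO:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class Sessionizer {
    // Hypothetical event POJO; the fields are assumptions for illustration.
    public static class Event {
        public String userId;
        public long clicks;
    }

    // Records from the same user arriving within 15 minutes of each other
    // land in one session window; a longer gap closes the session.
    public static DataStream<Event> sessionize(DataStream<Event> events) {
        return events
                .keyBy(e -> e.userId)
                .window(EventTimeSessionWindows.withGap(Time.minutes(15)))
                .reduce((a, b) -> { a.clicks += b.clicks; return a; });
    }
}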

Can I invoke Lambda functions in parallel using a single Kinesis shard if record order doesn't matter?

假装没事ソ submitted on 2019-12-07 19:26:52
Question: I have an application for which I only need the bandwidth of one Kinesis shard, but I need many Lambda function invocations in parallel to keep up with the record processing. My record size is on the high end (some records approach the 1,000 KB limit), but the incoming rate is only 1 MB/s, as I'm using a single EC2 instance to populate the stream. Since each record contains an internal timestamp, I don't care about processing them in order. Basically I have several months' worth of data…
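One common pattern for decoupling shard count from processing concurrency (a sketch, not something confirmed in this thread) is a single dispatcher that reads the shard and fires a worker Lambda per record asynchronously. With the AWS SDK for Java, InvocationType.Event returns immediately, so many workers run in parallel; the function name below is a placeholder:

import com.amazonaws.services.lambda.AWSLambda;
import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
import com.amazonaws.services.lambda.model.InvocationType;
import com.amazonaws.services.lambda.model.InvokeRequest;
import java.nio.ByteBuffer;
import java.util.List;

public class Dispatcher {
    private final AWSLambda lambda = AWSLambdaClientBuilder.defaultClient();

    // Fire-and-forget invocation: the call returns as soon as Lambda queues
    // the event, so records from a single shard fan out to parallel workers.
    public void dispatch(List<ByteBuffer> records) {
        for (ByteBuffer payload : records) {
            lambda.invoke(new InvokeRequest()
                    .withFunctionName("worker-fn") // placeholder function name
                    .withInvocationType(InvocationType.Event)
                    .withPayload(payload));
        }
    }
}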

Kinesis partition key always falls in the same shard

浪子不回头ぞ submitted on 2019-12-07 10:47:21
Question: I have a Kinesis stream with 2 shards that looks like this:

{
    "StreamDescription": {
        "StreamStatus": "ACTIVE",
        "StreamName": "my-stream",
        "Shards": [
            {
                "ShardId": "shardId-000000000001",
                "HashKeyRange": {
                    "EndingHashKey": "17014118346046923173168730371587",
                    "StartingHashKey": "0"
                }
            },
            {
                "ShardId": "shardId-000000000002",
                "HashKeyRange": {
                    "EndingHashKey": "340282366920938463463374607431768211455",
                    "StartingHashKey": "17014118346046923173168730371588"
                }
            }
        ]
    }
}

The sender side sets a partition…
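Kinesis routes a record by taking the MD5 hash of its partition key, reading it as an unsigned 128-bit integer, and picking the shard whose HashKeyRange contains that value. A small sketch for checking where a given key lands, using the shard boundary from the stream description above (the partition key is a placeholder):

import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ShardForKey {
    public static void main(String[] args) throws Exception {
        String partitionKey = "user-42"; // placeholder key
        byte[] md5 = MessageDigest.getInstance("MD5")
                .digest(partitionKey.getBytes(StandardCharsets.UTF_8));

        // Interpret the 16 MD5 bytes as an unsigned 128-bit integer, the same
        // value Kinesis compares against each shard's HashKeyRange.
        BigInteger hash = new BigInteger(1, md5);
        BigInteger shard1End = new BigInteger("17014118346046923173168730371587");

        System.out.println(hash.compareTo(shard1End) <= 0
                ? "shardId-000000000001" : "shardId-000000000002");
    }
}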

Kinesis Firehose putting JSON objects in S3 without separator comma

安稳与你 submitted on 2019-12-07 08:12:58
Question: Before sending the data I apply JSON.stringify to it, and it looks like this:

{"data": [{"key1": value1, "key2": value2}, {"key1": value1, "key2": value2}]}

But once it passes through AWS API Gateway and Kinesis Firehose puts it into S3, it looks like this:

{ "key1": value1, "key2": value2 }{ "key1": value1, "key2": value2 }

The separator comma between the JSON objects is gone, but I need it to process the data properly. Template in the API Gateway: #set($root = $input.path('$')) {…
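Firehose concatenates records without a delimiter by design, so one option (an assumption about the downstream consumer, not stated in the question) is to parse the S3 object as a stream of back-to-back root-level JSON values, which Jackson supports directly:

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.util.Map;

public class ConcatenatedJsonReader {
    public static void main(String[] args) throws IOException {
        // Shape of the data as it comes back out of S3: no separator commas.
        String s3Blob = "{\"key1\": 1, \"key2\": 2}{\"key1\": 3, \"key2\": 4}";

        // readValues() iterates over consecutive root-level JSON objects,
        // so no separator between them is required.
        ObjectMapper mapper = new ObjectMapper();
        MappingIterator<Map<String, Object>> records =
                mapper.readerFor(Map.class).readValues(s3Blob);
        while (records.hasNext()) {
            System.out.println(records.next());
        }
    }
}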

Shard [shardId-000000000000] is not closed. This can happen if we constructed the list of shards while a reshard operation was in progress

余生颓废 submitted on 2019-12-07 07:00:51
Question: I am getting this error while fetching data from an Amazon Kinesis stream. I am doing the following steps: creating an Amazon Kinesis stream, putting data using the putRecord API of AmazonKinesisClient, and then using a Worker from the KCL library to get the data from the stream.

Answer 1: There are a few possibilities. After you ordered the stream to be created, did you wait long enough for completion? Sometimes it may take 10 minutes for a shard to be created. Since you managed to use the putRecord method, the stream and shard should be…
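A sketch of the waiting step the answer suggests: poll DescribeStream until the stream reports ACTIVE before starting the KCL worker (AWS SDK for Java v1; the stream name is a placeholder):

import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.DescribeStreamRequest;

public class WaitForStream {
    public static void main(String[] args) throws InterruptedException {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();
        String streamName = "my-stream"; // placeholder

        // Starting a KCL worker before the stream is ACTIVE can yield an
        // inconsistent shard list, which is one trigger for this error.
        String status;
        do {
            Thread.sleep(5000);
            status = kinesis.describeStream(
                            new DescribeStreamRequest().withStreamName(streamName))
                    .getStreamDescription()
                    .getStreamStatus();
        } while (!"ACTIVE".equals(status));

        System.out.println(streamName + " is ACTIVE");
    }
}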

Kinesis: What is the best/safe way to shutdown a worker?

纵饮孤独 submitted on 2019-12-07 05:54:01
Question: I am using the AWS Kinesis Client Library. I need a way to shut down the Kinesis Worker thread during deployments so that I stop at a checkpoint and not in the middle of processRecords(). I see a shutdown boolean present in Worker.java, but it is private. The reason I need this is that checkpointing and idempotency are critical to me, and I don't want to kill the process in the middle of a batch. [EDIT] Thanks to @CaptainMurphy, I noticed that Worker.java exposes a shutdown() method which safely…
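A minimal sketch of wiring that shutdown() call into a JVM shutdown hook, so a deployment's SIGTERM drains the worker at a batch boundary rather than killing it mid-processRecords() (assumes KCL 1.x, where Worker.shutdown() is public):

import com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker;

public class GracefulWorker {
    public static void run(Worker worker) {
        // On SIGTERM, ask the worker to stop; it finishes the in-flight
        // processRecords() batch and shuts record processors down cleanly.
        Runtime.getRuntime().addShutdownHook(new Thread(worker::shutdown));
        worker.run(); // blocks until shutdown completes
    }
}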

AWS API signed POST request with Javascript

守給你的承諾、 submitted on 2019-12-07 00:52:27
What I'm trying to do: Ultimately, I want to populate an AWS Kinesis stream from a browser extension (Safari, Chrome). I need to send the request to AWS using the v4 signing process; this involves setting headers and signing them (on a remote server that holds the AWS secret key), then attaching the signature to the request. Amazon requires the "Host" header to be explicitly defined… however, JavaScript strictly disallows setting it (and a handful of other headers, for good reasons). I must be missing something: how can I do this? Sources: http://docs.aws.amazon.com/general/latest/gr/sigv4-signed-request-examples…
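The browser supplies the Host header itself, so the usual workaround is to include host in the signed canonical headers without attempting to set it in JavaScript; the signature itself is computed on the server holding the secret. A sketch of the SigV4 signing-key derivation from the AWS documentation (all inputs are placeholders):

import java.nio.charset.StandardCharsets;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class SigV4Key {
    static byte[] hmac(byte[] key, String data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
    }

    // The documented HMAC chain: secret -> date -> region -> service -> signing key.
    // This runs on the server that holds the secret key, never in the browser.
    static byte[] signingKey(String secret, String date /* yyyyMMdd */,
                             String region, String service) throws Exception {
        byte[] kDate = hmac(("AWS4" + secret).getBytes(StandardCharsets.UTF_8), date);
        byte[] kRegion = hmac(kDate, region);
        byte[] kService = hmac(kRegion, service);
        return hmac(kService, "aws4_request");
    }
}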

How to set JVM arguments in IntelliJ IDEA?

落花浮王杯 submitted on 2019-12-06 18:49:07
Question: I am confused about this instruction when using Kinesis Video Streams: "Run DemoAppMain.java in ./src/main/demo with JVM arguments set to -Daws.accessKeyId={YourAwsAccessKey} -Daws.secretKey={YourAwsSecretKey} -Djava.library.path={NativeLibraryPath}" for non-temporary AWS credentials. How do I set these arguments in IntelliJ IDEA? I followed the documentation and found "Run/Debug Configurations" but don't know what to do next. Any help? Thanks!

Answer 1: You're correct about the Run/Debug…

Apache Flink - how to send and consume POJOs using AWS Kinesis

十年热恋 submitted on 2019-12-06 07:34:30
I want to consume POJOs arriving from Kinesis with Flink. Is there any standard for how to correctly send and deserialize the messages? Thanks.

I resolved it with:

DataStream<SamplePojo> kinesis = see.addSource(new FlinkKinesisConsumer<>(
        "my-stream",
        new POJODeserializationSchema(),
        kinesisConsumerConfig));

and

public class POJODeserializationSchema extends AbstractDeserializationSchema<SamplePojo> {
    private ObjectMapper mapper;

    @Override
    public SamplePojo deserialize(byte[] message) throws IOException {
        if (mapper == null) {
            mapper = new ObjectMapper();
        }
        // Jackson maps the raw JSON bytes back onto the POJO's fields.
        SamplePojo retVal = mapper.readValue(message, SamplePojo.class);
        return retVal;
    }
}
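The sending side is not shown in the answer; a symmetric sketch would pair the consumer with a FlinkKinesisProducer and a Jackson-based SerializationSchema (producerConfig, the stream name, and SamplePojo's availability are assumptions):

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.serialization.SerializationSchema;

public class POJOSerializationSchema implements SerializationSchema<SamplePojo> {
    private transient ObjectMapper mapper;

    @Override
    public byte[] serialize(SamplePojo element) {
        if (mapper == null) {
            mapper = new ObjectMapper();
        }
        try {
            // Mirror of the deserializer: POJO fields -> JSON bytes.
            return mapper.writeValueAsBytes(element);
        } catch (JsonProcessingException e) {
            throw new RuntimeException(e);
        }
    }
}

used as:

FlinkKinesisProducer<SamplePojo> producer =
        new FlinkKinesisProducer<>(new POJOSerializationSchema(), producerConfig);
producer.setDefaultStream("my-stream");
pojoStream.addSink(producer);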