amazon-kinesis

What is the streaming log data latency between AWS & Google cloud services?

Submitted by 谁都会走 on 2019-12-02 01:31:52

Question: Has anyone had experience with: (1) sending streamed/micro-batched log data from Amazon to BigQuery for processing, who can shed light on any latency issues? (2) sending (micro-batched) logs from Google Dataflow to Amazon (Kinesis / S3 / DynamoDB)? Can someone provide info on latency? Thanks.

Answer: For question 1, I believe you're interested in BigQuery ingestion latency. Per "Streaming Data into BigQuery", streamed data is available for real-time analysis within a few seconds of the first streaming insertion into a table. This latency is low, but it will probably dominate whatever latency you have due to raw network …
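A minimal sketch of the micro-batched streaming-insert path the answer refers to, using the `google-cloud-bigquery` client. The 500-row batch size follows BigQuery's recommended maximum per streaming request; the project/dataset/table names are made-up placeholders.

```python
# Micro-batching log rows for BigQuery streaming inserts (sketch).

def micro_batches(rows, batch_size=500):
    """Split an iterable of row dicts into streaming-insert sized batches."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

def stream_to_bigquery(rows):
    # Requires google-cloud-bigquery; imported lazily so the pure batching
    # helper above works without it installed.
    from google.cloud import bigquery
    client = bigquery.Client()
    table = client.get_table("my_project.my_dataset.logs")  # hypothetical table
    for batch in micro_batches(rows):
        errors = client.insert_rows_json(table, batch)  # streaming insert
        if errors:
            raise RuntimeError(f"insert failed: {errors}")
```

Per the documentation quoted above, rows inserted this way become queryable within a few seconds of the first insert, so the batching itself is usually the larger latency contributor.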

Kafka-like offset on Kinesis Stream?

Submitted by 别来无恙 on 2019-11-30 23:07:31

Question: I have worked a bit with Kafka in the past, and lately there is a requirement to port part of the data pipeline to AWS Kinesis Streams. Now, I have read that Kinesis is effectively a fork of Kafka and shares many similarities. However, I have failed to see how we can have multiple consumers reading from the same stream, each with their own offset. There is a sequence number given to each data record, but I couldn't find anything specific to a consumer (like Kafka's group id?). Is it really possible to have different consumers with different ingestion rates over the same AWS Kinesis stream?

Answer: Yes. You can …
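A toy in-memory sketch of the point the answer is making: Kinesis itself stores no consumer offsets; each application tracks its own position per shard (the KCL does this in a DynamoDB table keyed by application name, the rough analog of a Kafka group id). The class names are illustrative, not a real API.

```python
class Shard:
    """Stand-in for one Kinesis shard: an append-only, replayable log."""
    def __init__(self):
        self.records = []

    def put(self, data):
        self.records.append(data)

    def get_records(self, after_seq, limit):
        """Return up to `limit` records after sequence number `after_seq`,
        plus the sequence number of the last record returned."""
        start = after_seq + 1
        records = self.records[start:start + limit]
        return records, start + len(records) - 1

class Consumer:
    """Each consumer keeps its own checkpoint, so two consumers can read
    the same shard at completely independent rates."""
    def __init__(self, shard):
        self.shard = shard
        self.checkpoint = -1  # before the first record (TRIM_HORIZON-like)

    def poll(self, limit):
        records, last = self.shard.get_records(self.checkpoint, limit)
        self.checkpoint = last  # unchanged when no records were returned
        return records
```

Two `Consumer` objects over the same `Shard` each see every record, in order, regardless of how fast the other polls, which is exactly the multi-consumer behavior the question asks about.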

Boto3 kinesis video stream: Error when calling the GetMedia operation

Submitted by 余生长醉 on 2019-11-30 19:42:17

Question: Under the boto3 docs: https://boto3.readthedocs.io/en/latest/reference/services/kinesis-video-media.html#KinesisVideoMedia.Client.get_media it says that I need to call the GetDataEndpoint API first to get an endpoint before I call GetMedia, but it doesn't say how to feed that endpoint in. So I have tried to run:

import boto3
kinesis_media = boto3.client('kinesis-video-media', region_name='region')
stream = kinesis_media.get_media(StreamARN='my-arn', StartSelector={'StartSelectorType': 'EARLIEST'})
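A sketch of the hand-off the docs leave implicit: you ask the `kinesisvideo` control-plane client for the stream's data endpoint, then pass it to the `kinesis-video-media` client as `endpoint_url`. The ARN and region values are placeholders.

```python
def media_client_kwargs(endpoint_response, region):
    """Turn a GetDataEndpoint response into kwargs for boto3.client
    (kept pure so it can be checked without AWS credentials)."""
    return {
        "service_name": "kinesis-video-media",
        "region_name": region,
        "endpoint_url": endpoint_response["DataEndpoint"],
    }

def read_stream(stream_arn, region):
    import boto3  # imported lazily; needs AWS credentials at call time
    # 1. GetDataEndpoint lives on the 'kinesisvideo' control-plane client.
    kvs = boto3.client("kinesisvideo", region_name=region)
    endpoint = kvs.get_data_endpoint(StreamARN=stream_arn, APIName="GET_MEDIA")
    # 2. Feed the returned endpoint to the data-plane client as endpoint_url.
    media = boto3.client(**media_client_kwargs(endpoint, region))
    # 3. GetMedia is now signed against the stream's own endpoint.
    resp = media.get_media(
        StreamARN=stream_arn,
        StartSelector={"StartSelectorType": "EARLIEST"},
    )
    return resp["Payload"]  # streaming body of MKV fragments
```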

Difference between Kinesis Stream and DynamoDB streams

Submitted by ≡放荡痞女 on 2019-11-30 17:23:55

Question: They seem to be doing the same thing to me. Can anyone explain the difference?

Answer: High-level difference between the two: Kinesis Streams allows you to produce and consume large volumes of data (logs, web data, etc.), whereas DynamoDB Streams is a feature local to DynamoDB that allows you to see the granular changes to your DynamoDB table items.

More details on Amazon Kinesis Streams: it is part of the big-data suite of services at AWS. From the developer documentation: you can use Streams for rapid and continuous data intake and aggregation. The type of data used includes IT …
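The "granular changes" point is easiest to see from the record shapes. A DynamoDB Streams record describes one table-item mutation (event name, keys, old/new image), while a Kinesis record is just an opaque data blob you define. The record layouts below follow the standard AWS event formats; the item values themselves are made up.

```python
def describe_change(ddb_record):
    """Summarize a DynamoDB Streams record as (event, keys, new_image)."""
    keys = ddb_record["dynamodb"]["Keys"]
    new_image = ddb_record["dynamodb"].get("NewImage")  # absent on REMOVE
    return (ddb_record["eventName"], keys, new_image)

# A DynamoDB Streams record: a structured description of an item change.
ddb_record = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"id": {"S": "user-1"}},
        "OldImage": {"id": {"S": "user-1"}, "visits": {"N": "3"}},
        "NewImage": {"id": {"S": "user-1"}, "visits": {"N": "4"}},
    },
}

# A Kinesis record: arbitrary bytes plus a partition key you choose.
kinesis_record = {"Data": b"any bytes you like", "PartitionKey": "user-1"}
```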

How to fanout an AWS kinesis stream?

Submitted by 不羁的心 on 2019-11-30 16:53:02

Question: I'd like to fan out/chain/replicate an input AWS Kinesis stream to N new Kinesis streams, so that each record written to the input stream will appear in each of the N streams. Is there an AWS service or an open-source solution? I'd prefer not to write code for this if there's a ready-made solution. AWS Kinesis Firehose is not a solution because it can't output to Kinesis. Perhaps an AWS Lambda solution, if that won't be too expensive to run?

Answer 1: There are two ways you could accomplish fan-out …
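A sketch of the Lambda approach mentioned in the question: a function triggered by the input stream re-publishes every record to N output streams. The event shape is the standard Kinesis-to-Lambda format (records arrive base64-encoded); `OUTPUT_STREAMS` is made-up configuration.

```python
import base64

OUTPUT_STREAMS = ["copy-stream-1", "copy-stream-2"]  # the N target streams

def to_put_entries(event):
    """Convert a Kinesis Lambda event into PutRecords entries (pure helper)."""
    return [
        {
            "Data": base64.b64decode(r["kinesis"]["data"]),
            "PartitionKey": r["kinesis"]["partitionKey"],
        }
        for r in event["Records"]
    ]

def handler(event, context):
    import boto3  # lazy import keeps to_put_entries testable offline
    kinesis = boto3.client("kinesis")
    entries = to_put_entries(event)
    for stream in OUTPUT_STREAMS:  # one PutRecords call per target stream
        kinesis.put_records(StreamName=stream, Records=entries)
```

The cost concern in the question comes down to one Lambda invocation per input batch plus N PutRecords calls per batch, which for modest N is typically cheap relative to the per-shard stream cost.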

TRIM_HORIZON vs LATEST

Submitted by ♀尐吖头ヾ on 2019-11-30 14:57:06

Question: I can't find in the official AWS Kinesis documentation any explicit reference relating TRIM_HORIZON to the checkpoint, nor LATEST to the checkpoint. Can you confirm my theory?

TRIM_HORIZON - If the application name is new, I will read all the records available in the stream. Otherwise, if the application name was already used, I will read from my last checkpoint.
LATEST - If the application name is new, I will read all the records in the stream which …
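An executable restatement of the theory above, matching KCL behavior: the configured initial position only matters when the application name has no prior checkpoint for the shard; otherwise the consumer resumes from its checkpoint. The function and checkpoint shapes are illustrative, not a real KCL API.

```python
def starting_position(initial_position, checkpoint):
    """
    initial_position: "TRIM_HORIZON" (oldest retained record) or
                      "LATEST" (only records arriving after startup).
    checkpoint: last checkpointed sequence number for this application
                name on this shard, or None if the app name is new.
    """
    if checkpoint is not None:
        # An existing checkpoint always wins, for either setting.
        return ("AFTER_SEQUENCE_NUMBER", checkpoint)
    # New application name: fall back to the configured initial position.
    return (initial_position, None)
```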

AWS Lambda can't connect to RDS instance, but I can locally?

Submitted by 北城以北 on 2019-11-30 01:48:10

Question: I am trying to connect to my RDS instance from a Lambda. I wrote the Lambda locally and tested locally, and everything worked peachy. I deployed to Lambda, and suddenly it doesn't work. Below is the code I'm running; if it helps, I'm invoking the Lambda via a Kinesis stream.

'use strict';
exports.handler = (event, context, handlerCallback) => {
  console.log('Received request for kinesis events!');
  console.log(event);
  console.log(context);
  const connectionDetails = { host: RDS_HOST, port: …

Calling the Amazon Kinesis REST API by setting up API Gateway

Submitted by 眉间皱痕 on 2019-11-29 11:35:31

Question: I am trying to send an HTTP POST request to put a record into an Amazon Kinesis stream. There are several ways to do this (a Kinesis client, the KPL, setting up API Gateway as a Kinesis proxy). I saw this document about the Kinesis PutRecord API: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecord.html

POST / HTTP/1.1
Host: kinesis.<region>.<domain>
Content-Length: <PayloadSizeBytes>
User-Agent: <UserAgentString>
Content-Type: application/x-amz-json-1.1
Authorization: <AuthParams>
Connection: Keep-Alive
X-Amz-Date: <Date>
X-Amz-Target: Kinesis_20131202.PutRecord

{ "StreamName": "exampleStreamName", "Data …
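A sketch of building the request body shown above: PutRecord's `Data` field must be base64-encoded, and the `X-Amz-Target` header selects the operation. Signing the request (the `<AuthParams>` part) is SigV4 and is easiest to leave to an SDK, or to API Gateway's AWS-service integration, which is the proxy setup the question mentions.

```python
import base64
import json

def put_record_body(stream_name, data: bytes, partition_key):
    """Build the application/x-amz-json-1.1 body for Kinesis PutRecord."""
    return json.dumps({
        "StreamName": stream_name,
        "Data": base64.b64encode(data).decode("ascii"),  # Data is base64
        "PartitionKey": partition_key,
    })

# Headers that identify the operation, matching the request shown above.
HEADERS = {
    "Content-Type": "application/x-amz-json-1.1",
    "X-Amz-Target": "Kinesis_20131202.PutRecord",
}
```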

Apache Spark Kinesis Sample not working

Submitted by 别说谁变了你拦得住时间么 on 2019-11-29 08:59:32

Question: I am trying to run the JavaKinesisWordCountASL example. The example seems to connect to my Kinesis stream and gets data from the stream (as shown in the log below). However, Spark does not invoke the call function passed to the unionStreams.flatMap method in the example and does not print any word counts. I have tried running using both Java 8 and Java 7. I am running it on an Ubuntu instance; the same example works on my MacBook.

14/11/15 01:59:42 INFO scheduler.ReceiverTracker: Stream 1 received 0 blocks
14/11/15 01:59:42 INFO storage.MemoryStore: ensureFreeSpace(264) called with curMem=3512, …