Sending Data From Elasticsearch to AWS Databases in Real Time

Posted 2021-01-07 01:03:32

Question


I know this is a very different use case for Elasticsearch and I need your help.

Main structure (can't be changed):

  • There are several physical machines with sensors attached. Data from these sensors goes to AWS Greengrass.
  • From there, a Lambda function sends the data to Elasticsearch over MQTT. Elasticsearch runs in Docker.

That is the structure, and everything up to this point is ready and running ✅

Now, on top of Elasticsearch, I need some software that can forward this data over MQTT to a cloud database, for example DynamoDB.

But this is not a one-time migration; the data should be sent continuously. Basically, I need a channel between Elasticsearch and AWS DynamoDB.

Also, the sensors produce a lot of data, and while we want to store all of it in Elasticsearch, we don't want to store all of it in the cloud. Some filtering is needed on the Elasticsearch side before sending data to the cloud, e.g. "save every 10th reading to the cloud", so that only 1 record out of 10 is kept.
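The "keep 1 record out of 10" decimation described above can be sketched as a simple counter over the stream of readings (a minimal illustration, independent of any particular tool):

```python
def keep_every_nth(stream, n=10):
    """Yield only every nth item from a stream of sensor readings."""
    for i, reading in enumerate(stream):
        if i % n == 0:  # keep the 1st, 11th, 21st, ... reading
            yield reading

# Example: 100 readings in, 10 readings kept.
sampled = list(keep_every_nth(range(100), n=10))
```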

Do you have any idea how this can be done? I have no experience in this field, and it looks like a challenging task. I would love to get suggestions from people experienced in these areas.

Thanks a lot! 🙌😊


Answer 1:


I haven't worked on a similar use case, but you could look into Logstash for this.

It's an open-source service, part of the ELK stack, and it provides the option of filtering the output. The pipeline would look something like this:

data ----> Elasticsearch ----> Logstash ----> DynamoDB (or any other destination)

It supports various plugins that fit this use case, such as:

  • DynamoDB output plugin - https://github.com/tellapart/logstash-output-dynamodb
  • Logstash MQTT Output Plugin - https://github.com/kompa3/logstash-output-mqtt
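A pipeline along those lines could be sketched roughly as follows. This is only an illustration, not a tested configuration: the index name, poll schedule, table name, and region are placeholders, and the `dynamodb` output options in particular depend on the third-party plugin linked above, so check its README for the actual option names. The `ruby` filter implements the "keep 1 in 10" sampling from the question.

```
input {
  elasticsearch {
    hosts    => ["localhost:9200"]
    index    => "sensor-data"     # placeholder index name
    schedule => "* * * * *"       # re-run the query every minute (cron syntax)
  }
}

filter {
  # Keep only every 10th event; @counter persists across events.
  ruby {
    init => "@counter = 0"
    code => "
      @counter += 1
      event.cancel unless @counter % 10 == 0
    "
  }
}

output {
  dynamodb {
    # Illustrative options; see the plugin's documentation for real ones.
    table_name => "sensor-data"
    region     => "us-east-1"
  }
}
```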


Source: https://stackoverflow.com/questions/65106106/sending-data-from-elasticsearch-to-aws-databases-in-real-time
