Question
I know this is an unusual use case for Elasticsearch, and I need your help.
Main structure (can't be changed):
- There are some physical machines with sensors on them. Data from these sensors goes to AWS Greengrass.
- Then a Lambda function sends the data to Elasticsearch over MQTT. Elasticsearch is running in Docker.
This is the structure, and everything up to this point is ready and running ✅
Now, on top of Elasticsearch, I need some software that can send this data over MQTT to a cloud database, for example DynamoDB.
But this is not a one-time migration: it should send the data continuously. Basically, I need a channel between Elasticsearch and AWS DynamoDB.
Also, the sensors produce a lot of data. We want to store all of it in Elasticsearch, but not in the cloud, so some filtering is needed on the Elasticsearch side before sending data to the cloud — for example "save every 10th record to the cloud", so only 1 record out of 10 is stored there.
Do you have any idea how this could be done? I have no experience in this field, and it looks like a challenging task. I would love to get suggestions from people experienced in these areas.
Thanks a lot! 🙌😊
Answer 1:
I haven't worked on a similar use case, but you can try looking into Logstash for this.
It's an open-source service, part of the ELK stack, and it provides the option of filtering the output. The pipeline will look something like this:
data ----> ES ----> Logstash ----> DynamoDB (or any other destination)
It supports various plugins relevant to your use case, for example:
- DynamoDB output plugin - https://github.com/tellapart/logstash-output-dynamodb
- Logstash MQTT Output Plugin - https://github.com/kompa3/logstash-output-mqtt
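As a rough illustration, a Logstash pipeline for this could look like the sketch below. It uses the standard `elasticsearch` input (its `schedule` option re-runs the query periodically, which approximates a continuous feed) and a `ruby` filter to drop all but every 10th event. The host, index name, and especially the `dynamodb` output options are assumptions — check the README of the plugin linked above for its actual settings.

```
input {
  elasticsearch {
    hosts    => ["localhost:9200"]   # assumed: your Dockerized ES endpoint
    index    => "sensor-data"        # assumed index name
    schedule => "* * * * *"          # cron-style: re-run the query every minute
  }
}

filter {
  # Keep only every 10th event; cancel (drop) the rest.
  ruby {
    init => "@count = 0"
    code => "
      @count += 1
      event.cancel if @count % 10 != 0
    "
  }
}

output {
  # Hypothetical block: option names depend on the community DynamoDB
  # plugin above — adjust to whatever its documentation specifies.
  dynamodb {
    table_name => "sensor-data"
    region     => "us-east-1"
  }
}
```

Note that this sampling is a simple in-process counter, so the "1 out of 10" guarantee only holds per pipeline worker; for exact sampling you would need to key it on something in the data itself.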
Source: https://stackoverflow.com/questions/65106106/sending-data-from-elasticsearch-to-aws-databases-in-real-time