Clear All Existing Entries In DynamoDB Table In AWS Data Pipeline

Posted by 两盒软妹~` on 2021-02-11 13:38:17

Question


My goal is to take daily snapshots of an RDS table and put them in a DynamoDB table. The table should only contain data from a single day.

For this I have a Data Pipeline set up to query an RDS table and publish the results to S3 in CSV format.

Then a HiveActivity imports this CSV into a DynamoDB table by creating external tables for the file and an existing DynamoDB table.

This works great, but older entries from the previous day still exist in the DynamoDB table. I want to do this within Data Pipeline if at all possible. I need to:

1) Find a way to clear the DynamoDB table, or at least drop/recreate it, or
2) Include an extra column of the snapshot date and find a way to clear out all older entries.

Any ideas on how I can do this?


Answer 1:


You can use DynamoDB Time to Live (TTL), which lets you set an expiration time on each item, after which the item is automatically deleted from the table. TTL is useful when data loses its relevance after a specific period; in your case the expiration time can be the start of the next day.
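
A minimal sketch of how this could look, assuming Python with boto3. The table name and the attribute name `expires_at` are hypothetical and not part of the original setup. DynamoDB TTL expects the attribute to be a Number holding a Unix epoch timestamp in seconds, so each imported row would need that value added (for example as an extra column in the CSV or in the Hive import query).

```python
from datetime import date, datetime, timedelta, timezone


def next_day_start_epoch(snapshot_date: date) -> int:
    """Return the Unix epoch (seconds) for 00:00 UTC on the day after snapshot_date."""
    next_day = snapshot_date + timedelta(days=1)
    midnight = datetime(next_day.year, next_day.month, next_day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


def enable_ttl(table_name: str, attribute_name: str = "expires_at") -> None:
    """One-time setup: enable TTL on the table. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("dynamodb")
    client.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification={"Enabled": True, "AttributeName": attribute_name},
    )


if __name__ == "__main__":
    # Value to store in the (hypothetical) expires_at attribute of today's rows.
    print(next_day_start_epoch(date.today()))
```

Note that TTL deletion runs as a background process and is not instantaneous, so expired items can remain visible for some time after their expiration. If readers must see strictly one day's data, they would also need to filter on the expiration attribute (or on a snapshot date column) when querying.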



Source: https://stackoverflow.com/questions/49963481/clear-all-existing-entries-in-dynamodb-table-in-aws-data-pipeline
