Backup DynamoDB Table with dynamic columns to S3

Submitted by 江枫思渺然 on 2019-12-12 05:29:56

Question


I have read several other posts about this, in particular this question with an answer by greg about how to do it in Hive. What I would like to know is how to account for DynamoDB tables with a variable number of columns.

That is, the original DynamoDB table has rows that were added dynamically with different columns. I have looked at the exportDynamoDBToS3 script that Amazon uses in its Data Pipeline service, but it contains code like the following, which does not seem to map the columns:

-- Map DynamoDB Table
CREATE EXTERNAL TABLE dynamodb_table (item map<string,string>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "MyTable");

(As an aside, I have also tried using Data Pipeline itself but found it rather frustrating, as I could not figure out from the documentation how to perform simple tasks such as running a shell script without everything failing.)


Answer 1:


It turns out that the Hive script I posted in the original question works just fine, but only if you are using the correct version of Hive. It seems that even with the install-hive command set to install the latest version, the version actually used depends on the AMI version.

After doing a fair bit of searching I managed to find the following in Amazon's docs (emphasis mine):

Create a Hive table that references data stored in Amazon DynamoDB. This is similar to the preceding example, except that you are not specifying a column mapping. The table must have exactly one column of type map. If you then create an EXTERNAL table in Amazon S3 you can call the INSERT OVERWRITE command to write the data from Amazon DynamoDB to Amazon S3. You can use this to create an archive of your Amazon DynamoDB data in Amazon S3. Because there is no column mapping, you cannot query tables that are exported this way. Exporting data without specifying a column mapping is available in Hive 0.8.1.5 or later, which is supported on Amazon EMR AMI 2.2.3 and later.

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMR_Hive_Commands.html
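
The export that the quoted passage describes can be sketched in Hive roughly as follows. This is a minimal sketch under stated assumptions, not the exact Data Pipeline script: the table name s3_backup and the location s3://my-backup-bucket/MyTable/ are hypothetical placeholders, and the tab and newline delimiters follow the pattern of Amazon's example on the linked page. It assumes a cluster running a Hive version that meets the requirement quoted above.

```sql
-- Map the DynamoDB table without a column mapping:
-- exactly one column of type map, as in the question.
CREATE EXTERNAL TABLE dynamodb_table (item map<string,string>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "MyTable");

-- External table backed by S3 with the same single map column.
-- The table name and bucket path are placeholders (assumptions).
CREATE EXTERNAL TABLE s3_backup (item map<string,string>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
LOCATION 's3://my-backup-bucket/MyTable/';

-- Copy every item, whatever attributes it has, from DynamoDB to S3.
INSERT OVERWRITE TABLE s3_backup
SELECT * FROM dynamodb_table;
```

Because each item travels as a single map, rows with different attribute sets need no per-column declaration. The trade-off, as the documentation notes, is that the exported data cannot be queried by column.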



Source: https://stackoverflow.com/questions/15845286/backup-dynamodb-table-with-dynamic-columns-to-s3
