DynamoDB InputFormat for Hadoop

Posted by 不问归期 on 2019-12-06 07:48:31

Question


I have to process some data stored in Amazon DynamoDB using Hadoop MapReduce.

I searched the internet for a Hadoop InputFormat for DynamoDB and couldn't find one. I'm not familiar with DynamoDB, so I'm guessing there is some trick to using it with Hadoop. If an implementation of such an InputFormat exists anywhere, could you please share it?


Answer 1:


After a lot of searching, I found DynamoDBInputFormat and DynamoDBOutputFormat in one of Amazon's libraries.

Amazon Elastic MapReduce ships a library called hive-bigbird-handler, which contains input and output formats for DynamoDB. The fully qualified class names are org.apache.hadoop.hive.dynamodb.write.DynamoDBOutputFormat and org.apache.hadoop.hive.dynamodb.read.DynamoDBInputFormat.

I hope these classes will be useful to the community.
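Because these classes come from a library on the Elastic MapReduce cluster rather than from stock Hadoop, it can help to confirm they are actually on the classpath before wiring them into a job. The following is a minimal sketch using only the Java standard library; the class name `DynamoDbFormatCheck` and the helper `isOnClasspath` are hypothetical names chosen for illustration, and only the two fully qualified class names come from the answer above.

```java
public class DynamoDbFormatCheck {
    // Fully qualified class names as quoted in the answer above.
    static final String INPUT_FORMAT =
            "org.apache.hadoop.hive.dynamodb.read.DynamoDBInputFormat";
    static final String OUTPUT_FORMAT =
            "org.apache.hadoop.hive.dynamodb.write.DynamoDBOutputFormat";

    // Hypothetical helper: returns true if the named class can be located
    // by this class's class loader, without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false,
                    DynamoDbFormatCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Report availability of each format class on the current classpath.
        for (String name : new String[] {INPUT_FORMAT, OUTPUT_FORMAT}) {
            System.out.println(name + " available: " + isOnClasspath(name));
        }
    }
}
```

If both lines report `true` when this is run on the cluster, the classes can be referenced from a job configuration; if not, the jar containing hive-bigbird-handler presumably still needs to be added to the job's classpath.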




Answer 2:


I couldn't find an InputFormat that can be used directly in MapReduce. However, the article "AWS HowTo: Using Amazon Elastic MapReduce with DynamoDB (Guest Post)" describes how to run MapReduce jobs using Hive.



Source: https://stackoverflow.com/questions/13020104/dynamodb-inputformat-for-hadoop
