Getting data in and out of Elastic MapReduce HDFS
Question

I've written a Hadoop program that requires a certain layout within HDFS, and afterwards I need to get the files back out of HDFS. It works on my single-node Hadoop setup, and I'm eager to get it working on tens of nodes within Elastic MapReduce.

What I've been doing is something like this:

    ./elastic-mapreduce --create --alive
    JOBID="j-XXX" # output from creation
    ./elastic-mapreduce -j $JOBID --ssh "hadoop fs -cp s3://bucket-id/XXX /XXX"
    ./elastic-mapreduce -j $JOBID --jar s3://bucket-id