Hadoop: copying from HDFS to S3
I've successfully completed a Mahout vectorizing job on Amazon EMR (using Mahout on Elastic MapReduce as a reference). Now I want to copy the results from HDFS to S3 so I can use them for clustering later. For that I used hadoop distcp:

    den@aws:~$ elastic-mapreduce --jar s3://elasticmapreduce/samples/distcp/distcp.jar \
    > --arg hdfs://my.bucket/prj1/seqfiles \
    > --arg s3n://ACCESS_KEY:SECRET_KEY@my.bucket/prj1/seqfiles \
    > -j $JOBID

It failed. I then found this suggestion: use s3distcp. I tried that as well:

    elastic-mapreduce --jobflow $JOBID \
    > --jar --arg s3://eu-west-1.elasticmapreduce/libs/s3distcp/1.latest/s3distcp