How to copy data from one HDFS to another HDFS?

日久生厌 2021-01-30 11:30

I have two HDFS setups and want to copy (not migrate or move) some tables from HDFS1 to HDFS2. How do I copy data from one HDFS to another? Is it possible via Sqoop or other c

6 answers
  •  不要未来只要你来
    2021-01-30 12:12

    DistCp (distributed copy) is a tool used for copying data between clusters. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list.

    Usage: $ hadoop distcp <source URI> <destination URI>

    Example: $ hadoop distcp hdfs://nn1:8020/file1 hdfs://nn2:8020/file2

    Here file1 on nn1 is copied to nn2 under the name file2.
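
The same command also works on a directory, which is usually what copying a table amounts to. The sketch below is only an illustration: the hostnames, port, and warehouse path are placeholders rather than values from the question, and the helper `distcp_cmd` is a hypothetical name. It prints the command instead of running it, so the command can be checked before starting a MapReduce job.

```shell
#!/bin/sh
# Sketch: build a DistCp command that copies a directory between two clusters.
# nn1/nn2, port 8020, and the table path are placeholders (assumptions).

SRC="hdfs://nn1:8020/user/hive/warehouse/mytable"
DST="hdfs://nn2:8020/user/hive/warehouse/mytable"

# distcp_cmd SRC DST: print the hadoop distcp invocation (hypothetical helper).
# -update copies only files that are missing at the destination or differ from it,
# so re-running the copy after a failure does not start from scratch.
distcp_cmd() {
    printf 'hadoop distcp -update %s %s\n' "$1" "$2"
}

# Dry run: show the command without executing it.
distcp_cmd "$SRC" "$DST"
```

Once the printed command looks right, it can be pasted into a shell on a node that can reach both NameNodes. The source data is left in place, so this is a copy and not a move.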

    DistCp is the best tool for this as of now. Sqoop copies data between a relational database and HDFS, in either direction, but not from one HDFS to another.

    More info:

    • http://hadoop.apache.org/docs/r1.2.1/distcp.html
    • http://hadoop.apache.org/docs/r1.2.1/distcp2.html

    Two versions are available; distcp2 has better runtime performance than the original distcp.
