Merging multiple files into one within Hadoop

遇见更好的自我 2020-12-01 02:18

I get multiple small files in my input directory, and I want to merge them into a single file without using the local file system or writing MapReduce jobs. Is there a way I could do

8 answers
  •  离开以前
    2020-12-01 02:58

    Addressing this from an Apache Pig perspective:

    To merge two files with an identical schema in Pig, use the UNION operator:

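     -- Load both inputs with the same schema (schema1 is a placeholder).
     A = LOAD 'tmp/file1' USING PigStorage('\t') AS ....(schema1);
     B = LOAD 'tmp/file2' USING PigStorage('\t') AS ....(schema1);
     -- Concatenate the two relations; UNION does not preserve record order.
     C = UNION A, B;
     -- Write the combined relation to a new output directory.
     STORE C INTO 'tmp/fileoutput' USING PigStorage('\t');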
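
    Note that a plain UNION followed by a store typically writes one part file per map task, so the output directory may still hold several files. A minimal sketch of one way to end up with a single part file is to force a reduce phase with one reducer. The two-column schema `(id:int, name:chararray)` and the sort key `id` are hypothetical stand-ins for schema1, and ORDER BY will reorder the records:

    ```pig
    -- Sketch only: the schema and the sort key below are assumptions, not from the original answer.
    A = LOAD 'tmp/file1' USING PigStorage('\t') AS (id:int, name:chararray);
    B = LOAD 'tmp/file2' USING PigStorage('\t') AS (id:int, name:chararray);
    C = UNION A, B;
    -- ORDER BY triggers a reduce phase; PARALLEL 1 requests a single reducer,
    -- so the store below should produce one part file in the output directory.
    D = ORDER C BY id PARALLEL 1;
    STORE D INTO 'tmp/fileoutput' USING PigStorage('\t');
    ```

    A single reducer handles all of the data, so this only makes sense when the merged output is small enough for one task.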
    
