Using GroupBy while copying from HDFS to S3 to merge files within a folder
问题 I have the following folders in HDFS : hdfs://x.x.x.x:8020/Air/BOOK/AE/DOM/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/AE/INT/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/BH/INT/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/IN/DOM/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/IN/INT/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/KW/DOM/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/KW/INT/20171001/2017100101 hdfs://x.x.x.x:8020/Air/BOOK/ME/INT/20171001/2017100101 hdfs://x.x