MapReduce XML input format - building a custom input format
Question

If the input files are in XML format, I shouldn't use TextInputFormat, because TextInputFormat assumes that each record sits on its own line of the input file, and the Mapper class is called once per line to get a key-value pair for that record. So I think we need a custom input format to scan XML datasets. Being new to Hadoop MapReduce, is there any article, link, or video that shows the steps to build a custom input format?

Thanks,
nath

Answer 1:

Problem

Working on a single XML file in parallel in