How to output the first row as column qualifier names

眉间皱痕 submitted on 2019-12-08 12:28:51

Question


I am able to process two nodes from an XML file, and I am getting the output below:

bin/hadoop fs -text /user/root/t-output1/part-r-00000
    name:ST17925 currentgrade 1.02
    name:ST17926 currentgrade 3.0
    name:ST17927 currentgrade 3.0

But I need output like this:

studentid currentgrade
ST17925 1.02
ST17926 3.00
ST17927 3.00

How can I achieve this?

My complete source code: https://github.com/studhadoop/xml/blob/master/XmlParser11.java

EDIT: Solution

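// setup() runs once per task, before any records are processed,
// so the header is written first in that task's output file.
protected void setup(Context context) throws IOException, InterruptedException {
    context.write(new Text("studentid"), new Text("currentgrade"));
}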
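
The header row is only part of the requested change: the data rows also need the name: prefix and the currentgrade label dropped, and the grade printed with two decimals. Below is a minimal sketch of that reformatting in plain Java (no Hadoop dependencies), assuming each raw row consists of exactly three whitespace-separated tokens as in the listing above. The names GradeTable, formatRow and toTable are hypothetical and do not come from XmlParser11.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class GradeTable {
    // Hypothetical helper: "name:ST17925 currentgrade 1.02" -> "ST17925 1.02".
    // Assumes three whitespace-separated tokens (spaces or tabs).
    static String formatRow(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].startsWith("name:")) {
            throw new IllegalArgumentException("Unexpected row: " + line);
        }
        String studentId = parts[0].substring("name:".length());
        double grade = Double.parseDouble(parts[2]);
        // Two decimals, so 3.0 is printed as 3.00.
        return studentId + " " + String.format(Locale.ROOT, "%.2f", grade);
    }

    // Puts the header line first, followed by the reformatted rows.
    static List<String> toTable(List<String> rawLines) {
        List<String> out = new ArrayList<>();
        out.add("studentid currentgrade");
        for (String line : rawLines) {
            out.add(formatRow(line));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "name:ST17925 currentgrade 1.02",
                "name:ST17926 currentgrade 3.0",
                "name:ST17927 currentgrade 3.0");
        toTable(raw).forEach(System.out::println);
    }
}
```

In a real job the same string handling would presumably live where the key and value are built before context.write(), rather than in a post-processing step.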

Answer 1:


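I think this is difficult to do within your MapReduce code, for two reasons:

  1. The headers may not be of the same data types as the job's output key and value.
  2. If the types are the same, you can write the headers from the setup() method of the Reducer, but there is no guarantee that the headers will appear as the first row of the output. For example, setup() runs once per reduce task, so a job with several reducers writes a header into every part file.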

At best, you can create a separate HDFS or local file containing the headers from your map code, the first time the column qualifiers are encountered. You need to use the appropriate file operations API to create this file. Once the job is complete, you can use these headers in other programs or merge them with the job output into a single file.
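
The merge step could look roughly like the sketch below. It assumes the header file and the part files have already been copied to the local file system (for example with hadoop fs -get) and uses only java.nio, not the Hadoop FileSystem API. The class name HeaderMerge and the file names header.txt and grades.txt are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderMerge {
    // Writes the header lines first, then the lines of each part file
    // in the order given. Reads everything into memory, so this is only
    // suitable for small outputs.
    static void merge(Path headerFile, List<Path> partFiles, Path merged) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(headerFile, StandardCharsets.UTF_8));
        for (Path part : partFiles) {
            lines.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
        }
        Files.write(merged, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Demo data in a temporary directory (stands in for the copied job output).
        Path dir = Files.createTempDirectory("merge-demo");
        Path header = dir.resolve("header.txt");
        Path part = dir.resolve("part-r-00000");
        Path merged = dir.resolve("grades.txt");
        Files.write(header, Arrays.asList("studentid\tcurrentgrade"), StandardCharsets.UTF_8);
        Files.write(part, Arrays.asList("ST17925\t1.02", "ST17926\t3.00"), StandardCharsets.UTF_8);

        merge(header, Arrays.asList(part), merged);
        Files.readAllLines(merged, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

Keeping the header in its own file means it is written exactly once, regardless of how many reducers the job runs.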



Source: https://stackoverflow.com/questions/16330413/how-to-output-first-row-as-column-qualifier-names
