I'm working with Hive tables and have the following problem. I have more than 1 billion XML files in HDFS. What I want to do is this: each XML file has 4 differe
You have several options:
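1. Load each XML document as a string into a Hive table, one document per row, e.g. CREATE TABLE xmlfiles (id int, xmlfile string). Then use an XPath UDF to do work on the XML.
2. If you already know the XPath of what you want (e.g. //section1), follow the instructions in the second half of this tutorial to ingest directly into Hive via XPath.

It depends on your level of experience and comfort with these approaches.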
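For the xmlfiles table approach, a query could look roughly like the sketch below. It uses Hive's built-in xpath_string UDF. The assumptions are that each row of xmlfile holds one complete, well-formed XML document, and that the documents contain a section1 element (taken from the //section1 example above; the real element names are not known here).

```sql
-- Sketch only: assumes xmlfiles(id int, xmlfile string) is already populated
-- with one whole XML document per row.
-- '//section1' is a placeholder path; substitute the elements you actually need.
SELECT
  id,
  xpath_string(xmlfile, '//section1') AS section1_text  -- text of the first matching node
FROM xmlfiles;
```

xpath_string returns the text of the first node that matches. If an element can occur more than once per document, xpath (which returns an array of strings) is probably the better fit.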
Use this:
CREATE EXTERNAL TABLE test(name STRING) LOCATION '/user/sornalingam/zipped/output/Tagged/t1'
tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="1");
Then use the xpath function on the name column to pull out the values you need.
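As a rough illustration of that last step, here is a sketch against the test table defined above. It assumes that, after the header and footer lines are skipped, each remaining row in name is a well-formed XML fragment sitting on a single line (a plain text table treats every line as a separate row). The section1 path is a hypothetical placeholder, not something taken from your data.

```sql
-- Sketch only: '//section1/text()' is a placeholder XPath expression.
-- xpath() returns array<string>; LATERAL VIEW explode() turns each match into its own row.
SELECT v.section1_value
FROM test
LATERAL VIEW explode(xpath(name, '//section1/text()')) v AS section1_value;
```

If your XML documents span multiple lines, this table definition will split them across rows, and the xpath call will not see a complete document. In that case the files would need to be flattened to one document per line first.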