I have made a PHP script that parses an XML file. This is not easy to use and I wanted to implement it in Java.
Inside the first element there are various count of
You have a few options for parsing XML in Java.
The most common are DOM, SAX, and StAX.
Each one has pros and cons. With DOM and SAX you can validate your XML against an XSD schema while parsing. StAX works without XSD validation (the StAX API itself does not offer it), and it was much faster in the test below.
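For example, take this XML file:
    <staff>
        <employee>
            <name>Carl Cracker</name>
            <salary>75000</salary>
            <hiredate year="1987" month="12" day="15"/>
        </employee>
        <employee>
            <name>Harry Hacker</name>
            <salary>50000</salary>
            <hiredate year="1989" month="10" day="1"/>
        </employee>
        <employee>
            <name>Tony Tester</name>
            <salary>40000</salary>
            <hiredate year="1990" month="3" day="15"/>
        </employee>
    </staff>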
The longest to implement (in my opinion) is the DOM parser:
    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.validation.SchemaFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    class DomXmlParser {
        private Document document;
        List<Employee> empList = new ArrayList<>();
        public SchemaFactory schemaFactory;
        public final String JAXP_SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
        public final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

        public DomXmlParser() {
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Declares XSD as the schema language. As far as I know this only
                // takes effect if factory.setValidating(true) is called as well.
                factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
                DocumentBuilder builder = factory.newDocumentBuilder();
                // EMPLOYEE_XML is presumably a constant of my own Files enum (not shown here).
                document = builder.parse(new File(EMPLOYEE_XML.getFilename()));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public List<Employee> parseFromXmlToEmployee() {
            // Children of the root element: the <employee> elements plus whitespace text nodes.
            NodeList nodeList = document.getDocumentElement().getChildNodes();
            for (int i = 0; i < nodeList.getLength(); i++) {
                Node node = nodeList.item(i);
                if (node instanceof Element) {
                    Employee emp = new Employee();
                    NodeList childNodes = node.getChildNodes();
                    for (int j = 0; j < childNodes.getLength(); j++) {
                        Node cNode = childNodes.item(j);
                        // identify the child tag of employee
                        if (cNode instanceof Element) {
                            switch (cNode.getNodeName()) {
                                case "name":
                                    emp.setName(text(cNode));
                                    break;
                                case "salary":
                                    emp.setSalary(Double.parseDouble(text(cNode)));
                                    break;
                                case "hiredate":
                                    int yearAttr = Integer.parseInt(cNode.getAttributes().getNamedItem("year").getNodeValue());
                                    int monthAttr = Integer.parseInt(cNode.getAttributes().getNamedItem("month").getNodeValue());
                                    int dayAttr = Integer.parseInt(cNode.getAttributes().getNamedItem("day").getNodeValue());
                                    // month - 1: setHireDay expects a zero-based month
                                    emp.setHireDay(yearAttr, monthAttr - 1, dayAttr);
                                    break;
                            }
                        }
                    }
                    empList.add(emp);
                }
            }
            return empList;
        }

        private String text(Node cNode) {
            return cNode.getTextContent().trim();
        }
    }
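Regarding the XSD validation mentioned earlier: it can also be run as a separate step with the standard javax.xml.validation API, independent of which parser reads the data afterwards. Here is a minimal sketch of that idea. The class name XsdValidationSketch, the isValid helper, and the cut-down schema (only name and salary, no hiredate) are my own assumptions for illustration and are not part of the code above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XsdValidationSketch {

    // Hypothetical, cut-down schema: a <staff> root holding <employee> elements
    // that each contain a <name> and a numeric <salary>.
    public static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='staff'><xs:complexType><xs:sequence>"
      + "<xs:element name='employee' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='salary' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML text conforms to the schema text, false otherwise. */
    public static boolean isValid(String xml, String xsd) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With the default error handler, the first validation error throws SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Also reached when the schema itself cannot be compiled.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<staff><employee><name>Carl Cracker</name><salary>75000</salary></employee></staff>";
        String bad = "<staff><employee><name>Carl Cracker</name><salary>lots</salary></employee></staff>";
        System.out.println("good document valid: " + isValid(good, XSD));
        System.out.println("bad document valid:  " + isValid(bad, XSD));
    }
}
```

Validator.validate() also accepts a javax.xml.transform.stax.StAXSource, so a check like this can be combined with a StAX parse if validation is needed there.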
SAX parser:
class SaxHandler extends DefaultHandler {
private Stack elementStack = new Stack<>();
private Stack
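The SAX listing above is cut off. As a rough idea of how such a handler can work, here is a minimal sketch. It is not the original stack-based handler: it buffers text in a StringBuilder instead, and it collects plain strings rather than Employee objects so that it stays self-contained. The names SaxSketch, EmployeeHandler, and parse are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {

    /** Collects one "name:salary:year-month-day" string per <employee> element. */
    static class EmployeeHandler extends DefaultHandler {
        final List<String> employees = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private String name;
        private String salary;
        private String hireDate;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
            if ("hiredate".equals(qName)) {
                hireDate = atts.getValue("year") + "-" + atts.getValue("month") + "-" + atts.getValue("day");
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text node in several chunks, so append instead of assigning.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            switch (qName) {
                case "name":
                    name = text.toString().trim();
                    break;
                case "salary":
                    salary = text.toString().trim();
                    break;
                case "employee":
                    employees.add(name + ":" + salary + ":" + hireDate);
                    break;
                default:
                    break;
            }
        }
    }

    /** Parses the XML text and returns the collected employee strings. */
    public static List<String> parse(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            EmployeeHandler handler = new EmployeeHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.employees;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<staff><employee><name>Carl Cracker</name><salary>75000</salary>"
                   + "<hiredate year=\"1987\" month=\"12\" day=\"15\"/></employee></staff>";
        parse(xml).forEach(System.out::println);
    }
}
```

The point is the push model: the parser calls the handler for each start tag, text chunk, and end tag, and the handler has to keep its own state between those calls.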
StAX parser:
    import java.io.File;
    import java.io.FileInputStream;
    import java.util.ArrayList;
    import java.util.List;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    class StaxXmlParser {
        private List<Employee> employeeList;
        private Employee currentEmployee;
        private String tagContent;
        private String attrContent;
        private XMLStreamReader reader;

        public StaxXmlParser(String filename) {
            employeeList = null;
            currentEmployee = null;
            tagContent = null;
            try {
                XMLInputFactory factory = XMLInputFactory.newFactory();
                reader = factory.createXMLStreamReader(new FileInputStream(new File(filename)));
                parseEmployee();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public List<Employee> parseEmployee() throws XMLStreamException {
            // Pull model: we ask the reader for the next event ourselves.
            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        if ("employee".equals(reader.getLocalName())) {
                            currentEmployee = new Employee();
                        }
                        if ("staff".equals(reader.getLocalName())) {
                            employeeList = new ArrayList<>();
                        }
                        if ("hiredate".equals(reader.getLocalName())) {
                            // Attributes are read while positioned on START_ELEMENT.
                            int yearAttr = Integer.parseInt(reader.getAttributeValue(null, "year"));
                            int monthAttr = Integer.parseInt(reader.getAttributeValue(null, "month"));
                            int dayAttr = Integer.parseInt(reader.getAttributeValue(null, "day"));
                            // month - 1: setHireDay expects a zero-based month
                            currentEmployee.setHireDay(yearAttr, monthAttr - 1, dayAttr);
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        tagContent = reader.getText().trim();
                        break;
                    case XMLStreamConstants.ATTRIBUTE:
                        // As far as I know this event is not reported while reading a whole
                        // document, so this branch is effectively never reached here.
                        int count = reader.getAttributeCount();
                        for (int i = 0; i < count; i++) {
                            System.out.printf("count is: %d%n", count);
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        switch (reader.getLocalName()) {
                            case "employee":
                                employeeList.add(currentEmployee);
                                break;
                            case "name":
                                currentEmployee.setName(tagContent);
                                break;
                            case "salary":
                                currentEmployee.setSalary(Double.parseDouble(tagContent));
                                break;
                        }
                        break;
                }
            }
            return employeeList;
        }
    }
And a simple main() test:
    public static void main(String[] args) {
        long startTime, elapsedTime;
        Main main = new Main();

        startTime = System.currentTimeMillis();
        main.testSaxParser(); // test
        elapsedTime = System.currentTimeMillis() - startTime;
        // elapsedTime is already in milliseconds, so it is printed without dividing by 1000
        System.out.println(String.format("Parsing time is: %d ms%n", elapsedTime));

        startTime = System.currentTimeMillis();
        main.testStaxParser(); // test
        elapsedTime = System.currentTimeMillis() - startTime;
        System.out.println(String.format("Parsing time is: %d ms%n", elapsedTime));

        startTime = System.currentTimeMillis();
        main.testDomParser(); // test
        elapsedTime = System.currentTimeMillis() - startTime;
        System.out.println(String.format("Parsing time is: %d ms%n", elapsedTime));
    }
Output:
Using SAX Parser:
-----------------
Employee { name=Carl Cracker, salary=75000.0, hireDay=Tue Dec 15 00:00:00 EET 1987 }
Employee { name=Harry Hacker, salary=50000.0, hireDay=Sun Oct 01 00:00:00 EET 1989 }
Employee { name=Tony Tester, salary=40000.0, hireDay=Thu Mar 15 00:00:00 EET 1990 }
Parsing time is: 106 ms
Using StAX Parser:
------------------
Employee { name=Carl Cracker, salary=75000.0, hireDay=Tue Dec 15 00:00:00 EET 1987 }
Employee { name=Harry Hacker, salary=50000.0, hireDay=Sun Oct 01 00:00:00 EET 1989 }
Employee { name=Tony Tester, salary=40000.0, hireDay=Thu Mar 15 00:00:00 EET 1990 }
Parsing time is: 5 ms
Using DOM Parser:
-----------------
Employee { name=Carl Cracker, salary=75000.0, hireDay=Tue Dec 15 00:00:00 EET 1987 }
Employee { name=Harry Hacker, salary=50000.0, hireDay=Sun Oct 01 00:00:00 EET 1989 }
Employee { name=Tony Tester, salary=40000.0, hireDay=Thu Mar 15 00:00:00 EET 1990 }
Parsing time is: 13 ms
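This gives a quick glimpse of these variations. Keep in mind that each parser was timed only once, so the first one measured (SAX) likely also pays for JVM warm-up and class loading; treat the numbers as rough.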
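If you want somewhat steadier numbers than a single System.currentTimeMillis() measurement, one common approach is to run each parser a few times without measuring and then average several measured runs. Below is a minimal sketch of that idea. TimingSketch and averageMillis are hypothetical names, and the workload in main() is only a placeholder for calls such as testStaxParser() from the test above.

```java
public class TimingSketch {

    /**
     * Runs the task warmupRuns times without measuring, then measuredRuns times
     * with measuring, and returns the average time per measured run in milliseconds.
     */
    public static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // lets class loading and JIT compilation happen up front
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace it with the parser call you want to time.
        Runnable work = () -> Double.parseDouble("75000");
        System.out.printf("Average time: %.6f ms%n", averageMillis(work, 5, 20));
    }
}
```

This is still not a rigorous benchmark, but it takes most of the one-time startup cost out of the comparison.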
Java also has other options, such as JAXB. Here you take an XSD schema and generate classes from it (for example with the xjc tool). After that you can use unmarshal() to read the XML file:
    import java.io.File;
    import java.io.FileInputStream;
    // javax.xml.bind is no longer bundled with newer JDKs; newer JAXB versions use jakarta.xml.bind
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;

    public class JaxbDemo {
        public static void main(String[] args) {
            try {
                long startTime = System.currentTimeMillis();
                // create the JAXB context for the package of the generated classes
                JAXBContext context = JAXBContext.newInstance(Staff.class.getPackage().getName());
                FileInputStream in = new FileInputStream(new File(Files.EMPLOYEE_XML.getFilename()));
                System.out.println("Output from employee XML file");
                // the unmarshaller turns the XML into a Staff object graph
                Unmarshaller um = context.createUnmarshaller();
                Staff staff = (Staff) um.unmarshal(in);
                // print employee list
                for (Staff.Employee emp : staff.getEmployee()) {
                    System.out.println(emp);
                }
                long elapsedTime = System.currentTimeMillis() - startTime;
                System.out.println(String.format("Parsing time is: %d ms%n", elapsedTime));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
I ran this approach the same way as before; the result is:
Employee { name='Carl Cracker', salary=75000, hiredate=1987-12-15 } }
Employee { name='Harry Hacker', salary=50000, hiredate=1989-10-1 } }
Employee { name='Tony Tester', salary=40000, hiredate=1990-3-15 } }
Parsing time is: 320 ms
I wrote a different toString() for this version, so the hire date is printed in a different format.
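The two date formats in the outputs above, and the monthAttr - 1 in the parsers, both come down to how the date is stored. The sketch below assumes that the Employee class keeps the hire day in the legacy Calendar/Date API, which is not shown in this answer. HireDateSketch, legacyDate, and isoDate are hypothetical names used only for illustration.

```java
import java.time.LocalDate;
import java.util.Calendar;
import java.util.GregorianCalendar;

public class HireDateSketch {

    /** Legacy style: Calendar months are zero-based, so a month read from XML needs "- 1". */
    public static Calendar legacyDate(int year, int month, int day) {
        return new GregorianCalendar(year, month - 1, day);
    }

    /** java.time style: months are one-based and toString() gives an ISO-8601 date. */
    public static String isoDate(int year, int month, int day) {
        return LocalDate.of(year, month, day).toString();
    }

    public static void main(String[] args) {
        // Prints a long, time-zone dependent form, similar to the SAX/StAX/DOM output above.
        System.out.println(legacyDate(1987, 12, 15).getTime());
        // Prints 1989-10-01
        System.out.println(isoDate(1989, 10, 1));
    }
}
```

With java.time the month from the XML attribute can be used as it is, and the printed form does not depend on the time zone.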
Here are a few links that may be interesting for you: