Reading Huge XML File using StAX and XPath

轮回少年 2020-12-31 11:35

The input file contains thousands of transactions in XML format and is around 10 GB in size. The requirement is to pick each transaction XML based on the user input and sen

7 answers
  •  滥情空心
    2020-12-31 12:19

    A fun solution for processing huge XML files (larger than 10 GB):

    1. Use ANTLR to build an index of byte offsets for the parts of interest. This uses less memory than a DOM-based approach, which would load the whole document.
    2. Use JAXB to read each part starting from its byte position.

    For details, see the worked example on Wikipedia dumps (17 GB) in this SO answer: https://stackoverflow.com/a/43367629/1485527
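
The second step above (reading one part by byte position) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the linked answer: it uses `RandomAccessFile` to seek to the offset and then parses the fragment with the JDK's built-in DOM and XPath classes in place of JAXB, so that it runs without extra dependencies. The class name `OffsetSliceDemo`, the element names, and the sample data are hypothetical, and the offsets are found with `indexOf` on a tiny ASCII sample rather than by an ANTLR pass.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class OffsetSliceDemo {

    // Read exactly `length` bytes starting at byte `offset`,
    // without loading the rest of the file into memory.
    public static byte[] readSlice(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);
            raf.readFully(buf);
            return buf;
        }
    }

    // Parse one self-contained XML fragment and evaluate an XPath expression on it.
    // Only the fragment is turned into a DOM tree, never the whole file.
    public static String evalOnFragment(byte[] fragment, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(fragment));
        return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data standing in for the 10 GB file.
        String xml = "<batch><transaction id=\"1\"><amount>10</amount></transaction>"
                   + "<transaction id=\"2\"><amount>25</amount></transaction></batch>";
        Path file = Files.createTempFile("transactions", ".xml");
        Files.write(file, xml.getBytes(StandardCharsets.US_ASCII));

        // Assumption: in a real run these offsets come from a prebuilt index.
        // The sample is pure ASCII, so character index equals byte offset.
        String open = "<transaction id=\"2\">";
        String close = "</transaction>";
        int start = xml.indexOf(open);
        int end = xml.indexOf(close, start) + close.length();

        byte[] fragment = readSlice(file, start, end - start);
        System.out.println(evalOnFragment(fragment, "/transaction/amount")); // prints 25
        Files.delete(file);
    }
}
```

Because each lookup touches only the bytes of one transaction, memory use depends on the size of a single record rather than on the size of the file. The index itself still has to be built once with a streaming pass over the whole input.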
