Parallel XML Parsing in Java

前端 未结 3 1474

I\'m writing an application which processes a lot of xml files (>1000) with deep node structures. It takes about six seconds with with woodstox (Event API) to parse a file w

3条回答
  •  失恋的感觉
    2021-01-02 04:03

    I am agree with Jim. I think that if you want to improve performance of overall processing of 1000 files your plan is good except #3 that is irrelevant in this case. If however you want to improve performance of parsing of single file you have a problem. I do not know how it is possible to split XML file without it parsing. Each chunk will be illegal XML and your parser will fail.

    I believe that improving overall time is good enough for you. In this case read this tutorial: http://download.oracle.com/javase/tutorial/essential/concurrency/index.html then create thread pool of for example 100 threads and queue that contains XML sources. Each thread will parse only 10 files that will bring serious performance benefit in multi-CPU environment.

提交回复
热议问题