Parsing extremely large XML files in php

前端 未结 2 959
花落未央
花落未央 2020-12-03 15:33

I need to parse XML files of 40GB in size, and then normalize, and insert to a MySQL database. How much of the file I need to store in the database is not clear, neither do

2条回答
  •  长情又很酷
    2020-12-03 16:01

    It would be nice to know what you actually intend to do with the XML. The way you parse it depends very much on the processing you need to carry out, as well as the size.

    If this is a one-off task, then I've started in the past by discovering the XML structure before doing anything else. My DTDGenerator (see saxon.sf.net) was written for this purpose a long time ago and still does the job, there are other tools available now but I don't know whether they do streamed processing which is a prerequisite here.

    You can write an application that processes the data using either a pull or push streamed parser (SAX or StAX). How easy this is depends on how much processing you have to do and how much state you have to maintain, which you haven't told us. Alternatively you could try streamed XSLT processing, which is available in Saxon-EE.

提交回复
热议问题