xmlstarlet sel on large file

Submitted by 和自甴很熟 on 2019-12-18 04:14:35

Question


The command

$ xmlstarlet sel -t -c "/collection/record" file.xml

seems to load the whole file into memory before applying the given XPath expression. This is not usable for large XML files.

Does xmlstarlet provide a streaming mode to extract subelements from a large (100G+) XML file?


Answer 1:


Since I only needed a tiny subset of XPath for large XML files, I actually implemented a little tool myself: xmlcutty.

The example from my question could be written like this:

$ xmlcutty -path /collection/record file.xml



Answer 2:


xmlstarlet translates all (or most) of its operations into XSLT transformations, so the short answer is no.

You could try STX, a streaming transformation language similar to XSLT. On the other hand, if you don't care about XML that much, coding something together in Python using SAX or iterparse may be easier and faster (with respect to the time needed to write the code).
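As a minimal sketch of the iterparse approach, the following streams `<record>` elements out of a `/collection/record` document (the element names are taken from the question's example) while keeping memory bounded by clearing each element after it has been handled:

```python
# Streaming extraction with iterparse: the file is never fully loaded;
# each <record> subtree is serialized and then cleared to free memory.
import io
import xml.etree.ElementTree as ET

def iter_records(source):
    """Yield each <record> under <collection> as a string, streaming."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            yield ET.tostring(elem, encoding="unicode")
            elem.clear()  # drop the subtree's contents so memory stays small

# Example with an in-memory document standing in for a 100G+ file:
xml = b"<collection><record id='1'/><record id='2'/></collection>"
records = list(iter_records(io.BytesIO(xml)))
```

Note that `elem.clear()` empties each record but leaves the (now hollow) element attached to the root; for truly huge inputs you may also want to delete processed children from the root element.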



Source: https://stackoverflow.com/questions/33653844/xmlstarlet-sel-on-large-file
