xmllint failing to properly query with xpath

Submitted by 妖精的绣舞 on 2019-11-26 02:37:28

Question


I'm trying to query an XML file generated by Adium. xmlwf says that it's well formed. Using xmllint's debug option, I get the following:

$ xmllint --debug doc.xml
DOCUMENT
version=1.0
encoding=UTF-8
URL=doc.xml
standalone=true
  ELEMENT chat
    default namespace href=http://purl.org/net/ulf/ns/0.4-02
    ATTRIBUTE account
      TEXT
        content=foo@bar.com
    ATTRIBUTE service
      TEXT compact
        content=MSN
    TEXT compact
      content= 
    ELEMENT event
      ATTRIBUTE type

Everything seems to parse just fine. However, when I try to query even the simplest things, I don't get anything:

$ xmllint --xpath '/chat' doc.xml
XPath set is empty

What's happening? Running that exact same query using xpath returns the correct results (however with no newline between results). Am I doing something wrong or is xmllint just not working properly?

Here's a shorter, anonymized version of the XML that shows the same behavior:

<?xml version="1.0" encoding="UTF-8" ?>
<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" account="foo@bar.com" service="MSN">
<event type="windowOpened" sender="foo@bar.com" time="2011-11-22T00:34:43-03:00"></event>
<message sender="foo@bar.com" time="2011-11-22T00:34:43-03:00" alias="foo"><div><span style="color: #000000; font-family: Helvetica; font-size: 12pt;">hi</span></div></message>
</chat>

Answer 1:


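I don't use xmllint, but I think the reason your XPath isn't working is that your doc.xml file declares a default namespace (http://purl.org/net/ulf/ns/0.4-02). In XPath 1.0 an unprefixed name such as chat only matches elements that are in no namespace, so /chat selects nothing here.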

From what I can see, you have 2 options.

A. Use xmllint in shell mode and declare the namespace with a prefix. You can then use that prefix in your XPath.

    xmllint --shell doc.xml
    / > setns x=http://purl.org/net/ulf/ns/0.4-02
    / > xpath /x:chat

B. Use local-name() to match element names.

    xmllint --xpath "/*[local-name()='chat']" doc.xml

You may also want to test namespace-uri()='http://purl.org/net/ulf/ns/0.4-02' along with local-name(), so that you only match elements in the namespace you intend.
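Both options can also be run as a script. The following is a minimal sketch, assuming an xmllint build that supports --xpath and reads shell-mode commands from standard input; the temporary file and the variable names (doc, NS, query) are made up for illustration, and the document is a reduced copy of the sample from the question.

```shell
#!/bin/sh
# Recreate a reduced version of the sample document in a temporary directory.
tmpdir=$(mktemp -d)
doc="$tmpdir/doc.xml"
cat > "$doc" <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" account="foo@bar.com" service="MSN">
<event type="windowOpened" sender="foo@bar.com" time="2011-11-22T00:34:43-03:00"></event>
</chat>
EOF

NS='http://purl.org/net/ulf/ns/0.4-02'

# Option A without typing at the prompt: feed the shell-mode commands on stdin.
# The prefix "x" is arbitrary; it only has to be bound to the document's namespace URI.
xmllint --shell "$doc" <<EOF
setns x=$NS
xpath /x:chat/x:event
EOF

# Option B: match on local name, and pin the root element to the expected namespace.
query="/*[local-name()='chat' and namespace-uri()='$NS']/*[local-name()='event']"
xmllint --xpath "$query" "$doc"
```

Note that every step of the path needs the same treatment: the child elements inherit the default namespace, so a plain /x:chat/event would come back empty as well.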




Answer 2:


I realize this question is very old now, but in case it helps someone...

I had the same problem, and it was due to the XML having a namespace (sometimes declared again in several places in the XML). I found it easiest to just remove the namespace declaration before using xmllint:

sed -e 's/xmlns="[^"]*"//g' file.xml | xmllint --xpath "..." -

In my case the XML was UTF-16 so I had to convert to UTF-8 first (for sed):

iconv -f utf16 -t utf8 file.xml | sed -e 's/encoding="UTF-16"?>/encoding="UTF-8"?>/' | sed -e 's/xmlns="[^"]*"//g' | xmllint --xpath "..." -
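
One detail worth checking when stripping the declaration with sed is how far the pattern reaches. The sketch below compares a greedy `.*` with a bounded `[^"]*` on the root tag from the question; the variable names are only for illustration.

```shell
#!/bin/sh
# Root tag from the question: other attributes follow the xmlns declaration.
line='<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" account="foo@bar.com" service="MSN">'

# Greedy: ".*" extends to the LAST double quote on the line,
# so the account and service attributes are removed too.
greedy=$(printf '%s\n' "$line" | sed -e 's/xmlns=".*"//g')

# Bounded: [^"]* stops at the quote that closes the xmlns value.
bounded=$(printf '%s\n' "$line" | sed -e 's/xmlns="[^"]*"//g')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

With the bounded pattern only the declaration disappears, so attribute queries such as /chat/@account still have something to match.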


Source: https://stackoverflow.com/questions/8264134/xmllint-failing-to-properly-query-with-xpath
