How to ignore an attribute without quotes in XML

Submitted by 青春壹個敷衍的年華 on 2019-12-24 14:23:44

Question


I want to count how many times tag1 occurs, given this 123.xml file (streamed from the internet):

<startend>

 <tag1 name=myname>
<date>10-10-10</date>
</tag1 >

 <tag1 name=yourname>
   <date>11-10-10</date>
  </tag1 >

 </startend>

Using: xmlstarlet sel -t -v "count(//tag1)" 123.xml

Output:

AttValue: " or ' expected attributes construct error

How can I make it ignore the fact that the attribute value has no quotes?


Answer 1:


Your input XML/HTML structure has invalid tags/attributes and should be recovered beforehand:

xmlstarlet solution:

xmlstarlet fo -o -R -H -D 123.xml 2>/dev/null | xmlstarlet sel -t -v "count(//tag1)" -n

The output:

2

Details:

  • fo (or format) - Format XML document(s)
  • -o or --omit-decl - omit xml declaration
  • -R or --recover - try to recover what is parsable
  • -D or --dropdtd - remove the DOCTYPE of the input docs
  • -H or --html - input is HTML
  • 2>/dev/null - suppress errors/warnings
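
The same idea, treating the input as HTML so that unquoted attribute values are tolerated, can be sketched without xmlstarlet. This is only an illustration using Python's standard html.parser module, not part of the original answer. The names TagCounter, count_tag and SAMPLE are hypothetical. html.parser reports tag names in lowercase, so the sketch assumes that case does not matter for your tags.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count opening tags with a given name (hypothetical helper)."""

    def __init__(self, tag_name):
        super().__init__()
        # html.parser lowercases tag names before reporting them
        self.tag_name = tag_name.lower()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag, quoted attributes or not
        if tag == self.tag_name:
            self.count += 1


def count_tag(markup, tag_name):
    """Return how many times <tag_name ...> opens in markup."""
    parser = TagCounter(tag_name)
    parser.feed(markup)
    parser.close()
    return parser.count


# The sample document from the question, unquoted attributes included
SAMPLE = """<startend>
 <tag1 name=myname>
<date>10-10-10</date>
</tag1 >
 <tag1 name=yourname>
   <date>11-10-10</date>
  </tag1 >
 </startend>"""

if __name__ == "__main__":
    print(count_tag(SAMPLE, "tag1"))
```

For the sample from the question this prints 2, matching the xmlstarlet pipeline above. Unlike xmlstarlet, it only counts opening tags and does not evaluate XPath.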



Answer 2:


XML always requires quotes around attribute values. If you want to keep using XML tools, you must first produce valid XML from your input. You could use an SGML processor such as OpenSP (in particular, its osx program) to convert your input into well-formed XML. It's as simple as invoking osx <your input file>.

If you're on Ubuntu/Debian Linux, you can install osx by invoking sudo apt-get install opensp on the command line (and similarly on other Unix systems).
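
If installing OpenSP is not an option, the "produce valid XML first" step can be roughly approximated by adding the missing quotes yourself before parsing. The following is a minimal sketch under stated assumptions, not what osx does. It uses only Python's standard library. The names quote_attributes, count_elements and SAMPLE are hypothetical. It assumes attribute values contain no `>` or `&`, and that the input has no comments or CDATA sections containing tag-like text.

```python
import re
import xml.etree.ElementTree as ET

# A start tag: "<", a name character, then anything up to the next ">"
START_TAG = re.compile(r"<[A-Za-z_][^<>]*>")

# name=value, where value is double-quoted, single-quoted, or unquoted.
# Matching quoted values first keeps "=" inside them from being touched.
ATTRIBUTE = re.compile(
    r"""([\w:.-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s"'<>]+?(?=\s|/?>))"""
)


def _quote_value(match):
    name, value = match.group(1), match.group(2)
    if value[0] in "\"'":
        return match.group(0)  # already quoted: leave unchanged
    return f'{name}="{value}"'


def quote_attributes(markup):
    """Wrap unquoted attribute values inside start tags in double quotes."""
    return START_TAG.sub(
        lambda tag: ATTRIBUTE.sub(_quote_value, tag.group(0)), markup
    )


def count_elements(markup, tag_name):
    """Repair the markup, parse it as XML, and count tag_name elements."""
    root = ET.fromstring(quote_attributes(markup))
    return sum(1 for _ in root.iter(tag_name))


# The sample document from the question
SAMPLE = """<startend>
 <tag1 name=myname>
<date>10-10-10</date>
</tag1 >
 <tag1 name=yourname>
   <date>11-10-10</date>
  </tag1 >
 </startend>"""

if __name__ == "__main__":
    print(quote_attributes(SAMPLE))
    print(count_elements(SAMPLE, "tag1"))
```

A regular expression is not a real SGML parser, so this only suits simple, predictable input like the sample. For anything messier, a dedicated tool such as osx or xmlstarlet's recover mode is the safer choice.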



Source: https://stackoverflow.com/questions/47284823/how-to-ignore-attribute-without-quotes-in-xml
