Question
I want to count how many times tag1 occurs, given this 123.xml file (streamed from the internet):
<startend>
<tag1 name=myname>
<date>10-10-10</date>
</tag1 >
<tag1 name=yourname>
<date>11-10-10</date>
</tag1 >
</startend>
Using: xmlstarlet sel -t -v "count(//tag1)" 123.xml
Output:
AttValue: " or ' expected attributes construct error
How can I make it ignore the fact that the attribute values are not quoted?
Answer 1:
Your input XML/HTML structure has invalid tags/attributes and should be recovered beforehand.
xmlstarlet solution:
xmlstarlet fo -o -R -H -D 123.xml 2>/dev/null | xmlstarlet sel -t -v "count(//tag1)" -n
The output:
2
Details:
fo (or format) - format XML document(s)
-o or --omit-decl - omit the XML declaration
-R or --recover - try to recover what is parsable
-D or --dropdtd - remove the DOCTYPE of the input docs
-H or --html - input is HTML
2>/dev/null - suppress errors/warnings
Answer 2:
XML always requires quotes around attribute values. If you want to keep using XML, you must first produce valid XML from your input. You could use an SGML processor such as OpenSP (in particular, the osx program) to convert your input into well-formed XML. It's as simple as invoking osx <your input file> on it.
If you're on Ubuntu/Debian Linux, you can install osx by invoking sudo apt-get install opensp on the command line (and similarly on other Unix systems).
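For input as simple as the sample in the question, the "produce valid XML first" step could also be approximated with a sed filter instead of a full SGML processor. This is only a rough sketch, not part of either answer: quote_attrs is a hypothetical helper name, and it assumes attribute values contain no spaces, quotes, or '>' and that no text content or already-quoted value contains a name=value pattern.

```shell
#!/bin/sh
# Hypothetical helper (not from the answers above): wrap unquoted
# attribute values in double quotes so the stream becomes parsable XML.
# Assumptions: values contain no spaces, quotes, or '>', and no text
# content or quoted value itself contains something like name=value.
quote_attrs() {
  # Group 1: attribute name. Group 2: an unquoted value, i.e. a run of
  # characters that are not a quote, a space, or '>'.
  sed -E "s/([A-Za-z_][A-Za-z0-9_.:-]*)=([^\"' >]+)/\1=\"\2\"/g"
}
```

The filter reads standard input, so it can sit in front of the original query, for example: quote_attrs < 123.xml | xmlstarlet sel -t -v "count(//tag1)" -n. Values that are already quoted are left untouched, because the pattern only matches when the character after = is not a quote.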
Source: https://stackoverflow.com/questions/47284823/how-to-ignore-attribute-without-quotes-in-xml