Question
I'm looking for a simple regular expression (I think) that would return all HTML tags that do not have a "name" attribute, but my weak regexp skills aren't getting me far.
Finding an HTML tag is not a problem, but the "which does not contain" part is. I had a few ideas, but none of them work.
Any clue?
Answer 1:
First of all, you should not use regex for this task. An HTML parser almost certainly exists for whatever language you are using and is far better suited to it.
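To make the parser route concrete, here is a minimal sketch in Python, which is an assumption since the question does not say which language is in use. It relies only on the standard library's html.parser module; the class name NamelessTagCollector and the helper tags_without_name_parsed are made up for illustration.

```python
from html.parser import HTMLParser


class NamelessTagCollector(HTMLParser):
    """Collects the names of start tags that carry no "name" attribute."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute_name, value) pairs; the parser
        # lowercases both tag and attribute names before calling this.
        if not any(attr == "name" for attr, _ in attrs):
            self.tags.append(tag)


def tags_without_name_parsed(html):
    """Hypothetical helper: return start-tag names lacking a name attribute."""
    collector = NamelessTagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


sample = '<form name="f"><input name="q"><a href="name.html" data-name="x">link</a><br></form>'
print(tags_without_name_parsed(sample))  # ['a', 'br']
```

Because the parser hands over real attribute names, an href value or a data-name attribute that merely contains the text "name" is not mistaken for a name attribute.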
Now, if you need to use regex for whatever reason, you could use a negative lookahead, provided your regex implementation supports it. The expression
<\w+(?![^>]*\bname\b)
identifies an opening HTML tag with <\w+ and matches it only if the string "name" (enclosed by word boundaries) does not appear before the next closing bracket.
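As a quick illustration, the expression can be tried in Python, whose re module supports negative lookahead. Python is an assumption here, and the names TAG_WITHOUT_NAME and tags_without_name_regex are invented for this sketch.

```python
import re

# The expression from the answer: an opening tag (<\w+) that is not
# followed by the word "name" anywhere before the next ">".
TAG_WITHOUT_NAME = re.compile(r"<\w+(?![^>]*\bname\b)")


def tags_without_name_regex(html):
    """Hypothetical helper: return tag names matched by the expression."""
    # Each match is "<" plus the tag name, so drop the leading "<".
    return [m.group(0)[1:] for m in TAG_WITHOUT_NAME.finditer(html)]


sample = '<form name="f"><input name="q"><input type="text"><br></form>'
print(tags_without_name_regex(sample))  # ['input', 'br']
```

Closing tags such as </form> are never matched, because <\w+ requires a word character directly after the opening bracket.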
See it in action with RegExr.
This works only on well-behaved HTML, and expanding it to respect quoted strings, JavaScript, or comments will be either impossible or very ugly. Did I mention HTML parsers? =)
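A few inputs where the expression gives the wrong answer, sketched in Python (an assumed language; the variable names are illustrative only):

```python
import re

pattern = re.compile(r"<\w+(?![^>]*\bname\b)")

# "name" inside an attribute value hides a tag that has no name attribute.
value_hit = pattern.findall('<a href="name.html">')

# A tag inside a comment is reported as if it were real markup.
comment_hit = pattern.findall('<!-- <b> -->')

# A ">" inside a quoted value ends the lookahead too early, so the
# name attribute that follows is never seen.
quoted_bracket = pattern.findall('<a title="x > y" name="n">')

print(value_hit)       # []
print(comment_hit)     # ['<b']
print(quoted_bracket)  # ['<a']
```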
Source: https://stackoverflow.com/questions/7765406/regexp-does-not-contain-attribute-in-html