Question
I want to parse some HTML in order to find the values of some attributes/tags etc.
What HTML parsers do you recommend? Any pros and cons?
Answer 1:
NekoHTML, TagSoup, and JTidy will allow you to parse HTML and then process with XML tools, like XPath.
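To illustrate the "process with XML tools" half of that workflow, here is a minimal sketch using only the JDK's built-in DOM and XPath APIs. It assumes the HTML has already been cleaned into well-formed markup by one of the libraries above; that cleaning step is not shown, and the class name `XPathOverCleanedHtml`, the method `imageSources`, and the sample markup are made up for the example. Element-name case and namespace handling in real output depend on the library and its configuration, so the XPath expression may need adjusting.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: runs an XPath query over markup that is ALREADY well-formed.
class XPathOverCleanedHtml {

    // Returns the value of every src attribute on <img> elements, in document order.
    static List<String> imageSources(String wellFormedMarkup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new InputSource(new StringReader(wellFormedMarkup)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//img/@src", doc, XPathConstants.NODESET);

            List<String> sources = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                sources.add(nodes.item(i).getNodeValue());
            }
            return sources;
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException("Could not query markup", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input, standing in for the output of an HTML cleaner.
        String cleaned = "<html><body><img src=\"a.png\" alt=\"A\"/>"
                + "<p><img src=\"b.png\"/></p></body></html>";
        System.out.println(imageSources(cleaned));
    }
}
```

Raw real-world HTML would usually fail in `builder.parse` because it is rarely well-formed, which is the gap the libraries above are meant to fill.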
Answer 2:
I have tried HTML Parser, which is dead simple to use.
Answer 3:
Do you need to do a full parse of the HTML? If you're just looking for specific values within the contents (a specific tag/param), then a simple regular expression might be enough, and could very well be faster.
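As a rough sketch of the regular-expression approach, the snippet below pulls `href` values out of `<a>` tags with `java.util.regex`. The class name `HrefExtractor`, the pattern, and the sample markup are illustrative assumptions rather than anything from a library. The pattern only handles quoted attribute values and will be fooled by things like comments, scripts, or a `>` inside an attribute, so it suits narrow, predictable input rather than arbitrary pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts quoted href values from <a> tags without a full parse.
class HrefExtractor {

    // Group 1 is the opening quote character, group 2 is the attribute value.
    // The back-reference \1 requires the closing quote to match the opening one.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            hrefs.add(matcher.group(2));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Assumed sample input mixing quote styles and tag case.
        String html = "<p><a class=\"x\" href=\"https://example.com/a\">A</a> "
                + "<A HREF='/b'>B</A></p>";
        System.out.println(extractHrefs(html));
    }
}
```

Whether this is actually faster than a parser depends on the input and the library, so it is worth measuring before relying on it.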
Source: https://stackoverflow.com/questions/26638/what-html-parsing-libraries-do-you-recommend-in-java