Library to query HTML with XPath in Java?

失恋的感觉 2020-12-03 16:31

Can anyone recommend a Java library that lets me run XPath queries against HTML pages fetched from URLs? I've tried JAXP without success.

Thank you.

5 answers
  •  暖寄归人
    2020-12-03 17:06

    There are several different approaches to this documented on the Web:

    Using HtmlCleaner

    • HtmlCleaner / Java DOM parser - Using XPath Contains against HTML in Java (This is the way I recommend)
    • HtmlCleaner itself has a built-in utility supporting XPath. See the Javadoc at http://htmlcleaner.sourceforge.net/doc/org/htmlcleaner/XPather.html or this example: http://thinkandroid.wordpress.com/2010/01/05/using-xpath-and-html-cleaner-to-parse-html-xml/

    Using Jericho

    • Jericho and Jaxen http://sujitpal.blogspot.com/2009/04/xpath-over-html-using-jericho-and-jaxen.html

    I have tried a few variations of these approaches, including HtmlParser plus the Java DOM parser and JSoup plus Jaxen. The combination that worked best was HtmlCleaner plus the Java DOM parser; the next best was Jericho plus Jaxen.
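
    To illustrate the "Java DOM parser" half of the recommended combination, here is a minimal sketch of running an XPath contains() query with the JDK's own javax.xml.xpath API. It assumes the HTML has already been turned into well-formed markup (the job a cleaner such as HtmlCleaner would do), so a hard-coded well-formed string stands in for the cleaned page. The class name, method names, and sample markup are made up for illustration and do not come from any of the libraries above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathOverCleanedHtml {

    // Parses already well-formed markup into a W3C DOM Document.
    // Assumption: in a real setup this Document would come from an HTML
    // cleaner, because raw HTML from the Web is rarely well-formed XML.
    public static Document parse(String wellFormedMarkup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(wellFormedMarkup)));
    }

    // Evaluates an XPath expression and returns the text content of each
    // matching node (for attribute nodes this is the attribute value).
    public static List<String> select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> results = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            results.add(nodes.item(i).getTextContent());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample standing in for cleaned HTML.
        String html = "<html><body>"
                + "<a href=\"/java\">Java XPath</a>"
                + "<a href=\"/py\">Python</a>"
                + "</body></html>";
        Document doc = parse(html);
        // Select the href of every link whose text contains "XPath".
        System.out.println(select(doc, "//a[contains(text(), 'XPath')]/@href"));
    }
}
```

    Feeding unclean HTML straight into DocumentBuilder will typically fail with a parse error, which is one likely reason plain JAXP does not work on real pages and why a cleaning step comes first.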
