Convert XPath to Jsoup query

Anonymous (unverified), submitted 2019-12-03 02:15:02

Question:

Does anyone know of an XPath to Jsoup converter? I get the following XPath from Chrome:

 //*[@id="docs"]/div[1]/h4/a 

and would like to change it into a Jsoup query. The path contains an href I'm trying to reference.

Answer 1:

This is very easy to convert manually.

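Something like this (not tested). XPath positions are 1-based and count only siblings with the same tag, so div[1] maps to div:nth-of-type(1). Jsoup's :eq(n) is 0-based, so div:eq(1) would instead match a div that is the second child element of its parent.

document.select("#docs > div:nth-of-type(1) > h4 > a").attr("href");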

Documentation:

http://jsoup.org/cookbook/extracting-data/selector-syntax
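
The manual conversion above follows a mechanical rule for simple paths, which can be sketched in plain Java. This is only an illustration, not a general converter: the class name XPathToCssSketch and its convert method are hypothetical, and it assumes a path that starts with //, uses only child steps, and contains only [@id="..."] and positional [n] predicates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: converts a small subset of XPath into a CSS selector usable with Jsoup. */
class XPathToCssSketch {

    // One step: a tag name or "*", optionally followed by [@id="x"] or a position [n].
    private static final Pattern STEP = Pattern.compile(
            "([A-Za-z][A-Za-z0-9]*|\\*)(?:\\[@id=[\"']([^\"']+)[\"']\\]|\\[(\\d+)\\])?");

    static String convert(String xpath) {
        String path = xpath.trim();
        if (!path.startsWith("//")) {
            throw new IllegalArgumentException("Only paths starting with // are handled: " + xpath);
        }
        List<String> parts = new ArrayList<>();
        for (String step : path.substring(2).split("/")) {
            Matcher m = STEP.matcher(step);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unsupported step: " + step);
            }
            String tag = m.group(1);
            String id = m.group(2);
            String position = m.group(3);
            if (id != null) {
                // *[@id="docs"] becomes #docs; div[@id="docs"] becomes div#docs
                parts.add(("*".equals(tag) ? "" : tag) + "#" + id);
            } else if (position != null && "*".equals(tag)) {
                // *[n] counts all element siblings, like CSS :nth-child(n)
                parts.add("*:nth-child(" + position + ")");
            } else if (position != null) {
                // tag[n] is 1-based and counts same-tag siblings, like CSS :nth-of-type(n)
                parts.add(tag + ":nth-of-type(" + position + ")");
            } else {
                parts.add(tag);
            }
        }
        // Each "/" in the XPath is a child step, which is ">" in CSS.
        return String.join(" > ", parts);
    }

    public static void main(String[] args) {
        System.out.println(convert("//*[@id=\"docs\"]/div[1]/h4/a"));
    }
}
```

The resulting string can be passed to document.select(...) as in the answer above. Anything outside this subset (descendant steps in the middle of a path, attribute or text() steps, other predicates) is rejected with an exception rather than guessed at.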


Related question from comment

Trying to get the href for the first result here: cbssports.com/info/search#q=fantasy%20tom%20brady

Code

Elements select = Jsoup.connect("http://solr.cbssports.com/solr/select/?q=fantasy%20tom%20brady")
        .get()
        .select("response > result > doc > str[name=url]");

for (Element element : select) {
    System.out.println(element.html());
}

Result

http://fantasynews.cbssports.com/fantasyfootball/players/playerpage/187741/tom-brady
http://www.cbssports.com/nfl/players/playerpage/187741/tom-brady
http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1825265/brady-lisoski
http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1766777/blake-brady
http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1851211/brady-foltz
http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1860955/brady-earnhardt
http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1673397/brady-amack
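
Because the Solr endpoint returns XML, the selector response > result > doc > str[name=url] corresponds to the XPath /response/result/doc/str[@name='url']. As a rough sketch of that correspondence, the following uses the JDK's built-in javax.xml.xpath package instead of Jsoup, so it only works on well-formed XML. The class name SolrXPathSketch, the extractUrls method, and the sample XML (including its example.com URLs) are made up for illustration and are not the real response.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: the same query as the Jsoup selector, written as XPath. */
class SolrXPathSketch {

    // Made-up sample that only imitates the shape of the Solr response.
    static final String SAMPLE_XML =
            "<response><result>"
            + "<doc><str name=\"title\">First</str><str name=\"url\">http://example.com/a</str></doc>"
            + "<doc><str name=\"title\">Second</str><str name=\"url\">http://example.com/b</str></doc>"
            + "</result></response>";

    static List<String> extractUrls(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // Select every <str name="url"> under response/result/doc.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/response/result/doc/str[@name='url']",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> urls = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                urls.add(nodes.item(i).getTextContent());
            }
            return urls;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        for (String url : extractUrls(SAMPLE_XML)) {
            System.out.println(url);
        }
    }
}
```

For real-world HTML, which is rarely well-formed XML, the Jsoup selector shown above remains the more forgiving option.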

Screenshot from Developer Console - grabbing urls



Answer 2:

I am using Google Chrome version 47.0.2526.73 m (64-bit), and I can now directly copy the selector path, which is compatible with Jsoup.



The selector copied for the element in the screenshot (span.com) is:
#question > table > tbody > tr:nth-child(1) > td.postcell > div > div.post-text > pre > code > span.com



Answer 3:

I have tested the following XPath expressions and their Jsoup equivalents, and they work.

example 1:

[XPath]

//*[@id="docs"]/div[1]/h4/a 

[JSoup]

document.select("#docs > div > h4 > a").attr("href"); 

example 2:

[XPath]

//*[@id="action-bar-container"]/div/div[2]/a[2] 

[JSoup]

document.select("#action-bar-container > div > div:eq(1) > a:eq(1)").attr("href");  


Answer 4:

Here is a working standalone snippet using Xsoup with Jsoup:

import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import us.codecraft.xsoup.Xsoup;

public class TestXsoup {
    public static void main(String[] args) {
        String html = "<html><div><a href='https://github.com'>github.com</a></div>" +
                "<table><tr><td>a</td><td>b</td></tr></table></html>";

        Document document = Jsoup.parse(html);

        List<String> filasFiltradas = Xsoup.compile("//tr/td/text()").evaluate(document).list();
        System.out.println(filasFiltradas);
    }
}

Output:

[a, b] 

Libraries included:

xsoup-0.3.1.jar
jsoup-1.10.3.jar



Answer 5:

It depends on what you want.

Document doc = Jsoup.connect(googleURL).get(); // fetch and parse the page at googleURL
doc.select("cite");       // all <cite> elements in the page
doc.select("li > cite");  // <cite> elements that are direct children of an <li>
doc.select("li.g cite");  // <cite> elements anywhere under an <li class="g">

// getHTML() is not shown in the original answer; it is assumed to return the page source.
public static void main(String[] args) throws IOException {
    String html = getHTML();
    Document doc = Jsoup.parse(html);
    Elements elems = doc.select("li.g > cite");
    for (Element elem : elems) {
        System.out.println(elem.toString());
    }
}


Answer 6:

You don't necessarily need to convert XPath to Jsoup-specific selectors.

Instead, you can use Xsoup, which is built on Jsoup and supports XPath.

https://github.com/code4craft/xsoup

Here is an example using Xsoup from the docs.

@Test
public void testSelect() {
    String html = "<html><div><a href='https://github.com'>github.com</a></div>" +
            "<table><tr><td>a</td><td>b</td></tr></table></html>";

    Document document = Jsoup.parse(html);

    String result = Xsoup.compile("//a/@href").evaluate(document).get();
    Assert.assertEquals("https://github.com", result);

    List<String> list = Xsoup.compile("//tr/td/text()").evaluate(document).list();
    Assert.assertEquals("a", list.get(0));
    Assert.assertEquals("b", list.get(1));
}

