How to extract absolute URL from relative HTML links using Jsoup?

野的像风 2020-12-05 18:36

I am using Jsoup to extract the URLs from a webpage. The href attributes of those links are relative, like:

example
2 Answers
  • 2020-12-05 18:56

    String url = dl.select("a").absUrl("href");

    This is not correct because dl.select("a") returns a collection (Elements), not a single item. You need to get an element by index.

    For example:

    Elements elems = dl.select("a");
    Element a1 = elems.get(0); // index 0 is the first element; valid indices run up to elems.size() - 1

    Now you can call:

    a1.absUrl("href");
    

    If you are sure the select above yields only one item, or that the item you want is the first one, you can write:

    String url = dl.select("a").get(0).absUrl("href"); 
    

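    For a non-empty result this is the same as the following (when nothing matches, first() returns null, while get(0) throws an IndexOutOfBoundsException):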

    String url = dl.select("a").first().absUrl("href");
    

    It doesn't have to be the first element: you can replace the 0 in dl.select("a").get(0) with the index of the element you want, or use a more specific selector that matches only one element.

  • 2020-12-05 19:08

    You need Element#absUrl().

    String url = dl.select("a").first().absUrl("href"); // absUrl is defined on Element, so pick one element first
    

    By the way, you can shorten the select and loop over all matching links:

    Document document = Jsoup.connect(url).get();
    Elements links = document.select("div.results dl a");
    for (Element link : links) {
        String absoluteUrl = link.absUrl("href"); // href resolved against the document's base URI
    }
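
To illustrate what absUrl("href") computes: jsoup resolves the relative href against the document's base URI. The block below is a minimal sketch of that resolution step using only java.net.URI from the standard library, not jsoup itself. The class name AbsoluteUrlSketch, the helper toAbsolute, and the example.com URLs are made up for illustration, and returning an empty string on failure is an assumption chosen to mirror absUrl's documented behavior.

```java
import java.net.URI;

public class AbsoluteUrlSketch {

    // Hypothetical helper: resolve one href against the page's base URL.
    // Returns "" when either string is not a valid URI (an assumption that
    // mirrors absUrl, which returns an empty string rather than null).
    static String toAbsolute(String baseUrl, String href) {
        try {
            return URI.create(baseUrl).resolve(href).toString();
        } catch (IllegalArgumentException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        // Illustrative base URL, standing in for the URL the page was fetched from.
        String base = "http://example.com/results/page.html";

        System.out.println(toAbsolute(base, "/item/1"));            // root-relative link
        System.out.println(toAbsolute(base, "detail.html"));        // link relative to the current directory
        System.out.println(toAbsolute(base, "../about"));           // link into the parent directory
        System.out.println(toAbsolute(base, "http://other.org/x")); // already absolute, returned unchanged
    }
}
```

In real code you would normally let jsoup do this through link.absUrl("href"), which works when the document was fetched with Jsoup.connect(...) or parsed with an explicit base URI.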
    