Web scraping with jsoup and Selenium

萌比男神i 2020-12-18 12:09

I want to extract some information from this dynamic website with Selenium and jsoup. To get that information, I have to click the button "Details öffnen

1 answer
  •  慢半拍i (original poster)
    2020-12-18 12:27

    It is true that jsoup can't handle dynamic content that is generated by JavaScript, but in your case the button only makes an Ajax request, and jsoup can reproduce that request quite well.

    I'd suggest making one call to retrieve the buttons and their ids, and then making successive calls (Ajax POSTs) to retrieve the details (comments or whatever else you need).

    The code could be:

        import org.jsoup.Jsoup;
        import org.jsoup.nodes.Document;
        import org.jsoup.nodes.Element;
        import org.jsoup.select.Elements;

        import java.io.IOException;

        // RatingDetails is an arbitrary class name used to make the snippet runnable
        public class RatingDetails {
            public static void main(String[] args) throws IOException {
                // Load the page and retrieve the buttons
                Document document = Jsoup.connect("http://www.seminarbewertung.de/seminar-bewertungen?id=3448").get();
                Elements buttons = document.select("input.rating_expand");
                // Take the first button and read its id
                Element button = buttons.get(0);
                String ratingId = button.attr("rating_id");

                // Replay the Ajax call that the button would make
                Document details = Jsoup.connect("http://www.seminarbewertung.de/bewertungs-details-abfragen")
                        .header("Accept", "*/*")
                        .header("X-Requested-With", "XMLHttpRequest")
                        .data("rating_id", ratingId)
                        .post();

                // Find the comment. This selector is only a demo; adjust it to your needs.
                Elements comment = details.select("div.ratingbox div.panel-body.text-center");
                System.out.println(comment.text());
            }
        }
    

    This code prints the desired comment:

    Das Eingehen auf individuelle Bedürfnisse eines jeden einzelnen Teilnehmer scheint mir ein Markenzeichen von Fromm zu sein. Bei einem früheren Seminar habe ich dies auch schon so erlebt!
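
    The snippet only handles the first button. The "successive calls" for every button could be sketched roughly as follows. This is a minimal sketch, assuming jsoup is on the classpath and that every button carries a rating_id attribute as in the code above; the class name AllRatingDetails, the helper names extractRatingIds and fetchComment, and the 500 ms pause are illustrative choices, not part of the site or of jsoup.

    ```java
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical class name, for illustration only
    public class AllRatingDetails {

        // Collects the rating_id attribute of every button matched by the selector.
        static List<String> extractRatingIds(Document page) {
            List<String> ids = new ArrayList<>();
            for (Element button : page.select("input.rating_expand")) {
                String id = button.attr("rating_id");
                if (!id.isEmpty()) { // attr() returns "" when the attribute is missing
                    ids.add(id);
                }
            }
            return ids;
        }

        // Replays the Ajax POST for one rating id and returns the comment text.
        static String fetchComment(String ratingId) throws IOException {
            Document details = Jsoup.connect("http://www.seminarbewertung.de/bewertungs-details-abfragen")
                    .header("Accept", "*/*")
                    .header("X-Requested-With", "XMLHttpRequest")
                    .data("rating_id", ratingId)
                    .post();
            // Same demo selector as above; adjust it to your needs.
            return details.select("div.ratingbox div.panel-body.text-center").text();
        }

        public static void main(String[] args) throws IOException, InterruptedException {
            Document page = Jsoup.connect("http://www.seminarbewertung.de/seminar-bewertungen?id=3448").get();
            for (String ratingId : extractRatingIds(page)) {
                System.out.println(ratingId + ": " + fetchComment(ratingId));
                Thread.sleep(500); // assumed pause between requests, to go easy on the server
            }
        }
    }
    ```

    Keeping the id extraction separate from the network call makes it possible to check the selector against a saved copy of the page (for example via Jsoup.parse) without hitting the site.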

    I hope this helps.

    Have a happy new year!
