Jsoup Delay due to Streaming Website

不羁岁月 submitted on 2019-12-22 05:09:33

Question


I have a question about Jsoup. I am building a tool that fetches prices from a website, but the site uses streaming content. When I browse it manually, I first see the prices from 20 minutes ago and have to wait about 3 seconds for the current price to appear. Is there a way to add some kind of delay in Jsoup so that I can read the prices from the streaming section? I am using this code:

conn = Jsoup.connect(link).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36");
// Request timeout in milliseconds; this limits how long the fetch may take, it does not delay it.
conn.timeout(5000);
doc = conn.get();

Answer 1:


You can use a JavaFX WebView with JavaScript enabled. After waiting the few seconds the page needs to update, you can extract the rendered contents and pass them to Jsoup.

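(After loading your URL into your WebView and waiting for the page to update)
// executeScript returns Object, so cast the result; call it on the JavaFX Application Thread.
String html = (String) view.getEngine().executeScript("document.documentElement.outerHTML");
// Hand the rendered markup to Jsoup for parsing.
Document doc = Jsoup.parse(html);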



Answer 2:


As mentioned in the comments, the site is most likely using some kind of scripting that simply won't work with Jsoup, because Jsoup only fetches the initial HTML response and does not execute any JavaScript.

I wanted to give you some more guidance on where to go from here, though. The best bet in these cases is to move to another tool for these types of sites. You can migrate to HtmlUnit, which is a headless browser, or to Selenium, which can drive HtmlUnit or a real browser such as Firefox or Chrome. I would recommend Selenium if you think you will ever need to move past HtmlUnit, since HtmlUnit can sometimes be less stable than the consumer browsers Selenium supports. Using Selenium with the HtmlUnit driver leaves you the option of switching to another browser later with little change.
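Whichever JavaScript-capable browser you end up with, a fixed sleep is fragile, since the streamed price may arrive sooner or later than expected. One option is to poll the rendered page until it stops showing the stale value. The following is only a minimal sketch of that idea using the standard library: the class name PollUntil, the method poll, and the simulated page source are all hypothetical, and the Supplier stands in for whatever returns the rendered HTML in your setup (for example Selenium's driver.getPageSource()).

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper: re-reads a value until it is acceptable or a timeout elapses.
final class PollUntil {

    static <T> Optional<T> poll(Supplier<T> fetch, Predicate<T> accept,
                                Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T value = fetch.get();
            if (value != null && accept.test(value)) {
                return Optional.of(value);      // fresh content has arrived
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty();        // gave up: timeout reached
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Simulated page source (an assumption for the demo):
        // stale for the first two reads, updated from the third read on.
        AtomicInteger reads = new AtomicInteger();
        Supplier<String> pageSource = () ->
                reads.incrementAndGet() < 3
                        ? "<span id=\"price\">stale</span>"
                        : "<span id=\"price\">101.25</span>";

        Optional<String> html = poll(pageSource, s -> !s.contains("stale"),
                Duration.ofSeconds(5), Duration.ofMillis(200));

        // In a real tool the returned HTML could be passed to Jsoup.parse(...).
        System.out.println(html.orElse("timed out"));
    }
}
```

The predicate is where the site-specific knowledge goes: how to tell a 20-minute-old price from a current one depends on the page's markup, which is not shown in the question. This helper blocks the calling thread, so it should not be run on the JavaFX Application Thread.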



Source: https://stackoverflow.com/questions/19432242/jsoup-delay-due-to-streaming-website
