Trying to parse HTML hidden by JavaScript

攒了一身酷 2020-12-12 01:35

I've created a simple Java program that uses Jsoup to parse a page of data. The site creators have changed the page, however, so much that if there is a certain amount of dat

2 Answers
  • 2020-12-12 01:57

    Use Firefox's or Chrome's developer tools. When you click on the link, there is probably an AJAX call firing. On the Network tab, you can see which URL the JavaScript actually requests and how the result is structured (probably JSON). Then you can request that URL directly to load the rest of the results.

    Or something along those lines ^^
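
A minimal sketch of that idea, using only the JDK's built-in HTTP client (Java 11+). The endpoint URL, the `page` query parameter, and the request headers are assumptions for illustration; the real ones have to be copied from whatever the Network tab shows for this site.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class AjaxEndpointFetcher {

    // Builds the URL the page's JavaScript would request.
    // "page" is a hypothetical parameter name; use the one seen in the Network tab.
    static URI buildPageUri(String endpoint, int page) {
        return URI.create(endpoint + "?page=" + page);
    }

    // Requests the endpoint directly and returns the raw response body
    // (probably JSON, possibly an HTML fragment).
    static String fetchBody(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
                // Some sites only answer AJAX-style requests; these headers are a guess.
                .header("Accept", "application/json")
                .header("X-Requested-With", "XMLHttpRequest")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling `fetchBody(buildPageUri("https://example.com/api/results", 2))` would then return the second batch of results, with `example.com/api/results` standing in for the real endpoint. If the body turns out to be an HTML fragment rather than JSON, it can be handed to `Jsoup.parse` as before.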

  • 2020-12-12 02:08

    Try something that drives a real web browser, such as Selenium. It's the only one I have used, and I've never needed anything else, though there are other tools that may suit you better, so it may be worth testing a few. Once Selenium (or whichever web driver you choose) has loaded the JavaScript-generated elements, parse them into Jsoup Elements. That way you don't have to change libraries completely; you just add one.
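
The Selenium-then-Jsoup hand-off could look roughly like the sketch below. It assumes the Selenium Java bindings, a local Chrome installation, and Jsoup are on the classpath; the class name, method names, and CSS selector are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class RenderedPageParser {

    // Loads the page in a real browser so its JavaScript runs,
    // then returns the resulting HTML as a string.
    static String renderedHtml(String url) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get(url);
            return driver.getPageSource();
        } finally {
            driver.quit(); // always close the browser, even on failure
        }
    }

    // Parses HTML with Jsoup and collects the text of every element
    // matching the given CSS query.
    static List<String> extractTexts(String html, String baseUrl, String cssQuery) {
        Document doc = Jsoup.parse(html, baseUrl);
        List<String> texts = new ArrayList<>();
        for (Element element : doc.select(cssQuery)) {
            texts.add(element.text());
        }
        return texts;
    }
}
```

Typical use would be `extractTexts(renderedHtml(url), url, "li.row")`, where `li.row` is a placeholder selector. If the hidden data only appears after a click or a delay, the driver would also need to perform that click and wait for the elements before `getPageSource()` is called.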

    Also, you can sometimes work around the JavaScript by watching how the URL in the browser's address bar changes as you interact with the page.
