Is it possible to scrape a “dynamical webpage” with beautifulsoup?

Submitted by 孤人 on 2021-02-08 17:00:51

Question


I am just beginning to use BeautifulSoup to scrape websites. I think I have the basics, but I lack theoretical knowledge about web pages, so I will do my best to formulate my question.

What I mean by a dynamical webpage is the following: a site whose HTML changes based on user action. In my case, it's collapsible tables.

I want to obtain the data inside a "div" tag, but when the page loads, the data seems to be unavailable in the HTML code. When you click on the table it expands, and the "class" of this "div" changes from something like "something blabla collapsible" to "something blabla collapsible active". That expanded state I can scrape with my knowledge.

Can I get this data using BeautifulSoup? If I can't, I thought of using something like Selenium to click on all the tables and then download the HTML, which I could scrape. Is there an easier way?

Thank you very much.


Answer 1:


It depends. If the data is already loaded when the page loads, then it is available to scrape; it is just in a different element or hidden. If the click event triggers loading of the data in some way, then no: you will need Selenium or another browser automation tool (for example, one driving a headless browser) to perform the click first.

Beautiful Soup is only an HTML parser, so the only data it can access is whatever comes back when you request the page.
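
One way to tell which case applies is to parse the raw HTML and see whether the collapsed "div" already contains the rows. The following is a minimal sketch, not the asker's actual page: the two HTML strings, the class names, and the helper `collapsible_cells` are all made up for illustration, and in practice the HTML would come from whatever request you make for the page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the data is in the HTML but hidden until clicked.
HTML_PRELOADED = """
<div class="something collapsible" style="display: none">
  <table><tr><td>alpha</td><td>1</td></tr></table>
</div>
"""

# Hypothetical markup: the div is empty until a click loads the data.
HTML_LAZY = """
<div class="something collapsible"></div>
"""


def collapsible_cells(html):
    """Return the text of every <td> inside any div that has the 'collapsible' class."""
    soup = BeautifulSoup(html, "html.parser")
    cells = []
    # class_ matches a single class among several, so both
    # "something collapsible" and "something collapsible active" are found.
    for div in soup.find_all("div", class_="collapsible"):
        cells.extend(td.get_text(strip=True) for td in div.find_all("td"))
    return cells


preloaded = collapsible_cells(HTML_PRELOADED)
lazy = collapsible_cells(HTML_LAZY)
print(preloaded)  # cells are found even though the div is hidden by CSS
print(lazy)       # nothing to parse: the data was never in the HTML
```

If the list comes back empty for the real page, the data is most likely fetched after the click, and parsing alone will not reach it.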



Source: https://stackoverflow.com/questions/40732906/is-it-possible-to-scrape-a-dynamical-webpage-with-beautifulsoup
