Get value of span tag using BeautifulSoup

Submitted by 断了今生、忘了曾经 on 2020-05-26 19:53:51

Question


I have a number of Facebook groups whose member counts I would like to get. One example is https://www.facebook.com/groups/347805588637627/ and, using Inspect Element on the page, I can see that the count is stored like so:

<span id="count_text">9,413 members</span>

I am trying to get "9,413 members" out of the page. I have tried using BeautifulSoup but cannot work it out.

Thanks

Edit:

from bs4 import BeautifulSoup
import requests

url = "https://www.facebook.com/groups/347805588637627/"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")
span = soup.find("span", id="count_text")
print(span.text)

Answer 1:


If the page contains more than one span tag, select the one you want by its id:

from bs4 import BeautifulSoup
soup = BeautifulSoup(your_html_input, 'html.parser')
span = soup.find("span", id="count_text")
print(span.text)
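
As a minimal sketch of the id lookup above, the snippet below runs it against a small, made-up HTML fragment with several span tags. The sample markup and the helper name get_span_text are assumptions for illustration only. The helper returns None when no matching span exists, since calling .text on a missing tag would raise an AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with several span tags; only one has the target id.
SAMPLE_HTML = """
<div>
  <span class="title">Example group</span>
  <span id="count_text">9,413 members</span>
  <span>Public group</span>
</div>
"""


def get_span_text(html, span_id):
    """Return the stripped text of the span with the given id, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", id=span_id)
    if span is None:
        # find() returns None when nothing matches, so guard before reading text.
        return None
    return span.get_text(strip=True)


print(get_span_text(SAMPLE_HTML, "count_text"))  # 9,413 members
print(get_span_text(SAMPLE_HTML, "missing_id"))  # None
```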



Answer 2:


You can use the text attribute of the parsed span:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<span id="count_text">9,413 members</span>', 'html.parser')
>>> soup.span
<span id="count_text">9,413 members</span>
>>> soup.span.text
'9,413 members'
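
If the goal is the count as a number rather than the string, the extracted text can be parsed further. This is a small sketch using only the standard library; the function name parse_member_count is hypothetical, and it assumes the count uses commas as thousands separators, as in the example span.

```python
import re


def parse_member_count(text):
    """Extract the first number from text such as '9,413 members' as an int."""
    # Match a digit followed by any run of digits and commas, e.g. "9,413".
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return int(match.group().replace(",", ""))


print(parse_member_count("9,413 members"))  # 9413
```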



Answer 3:


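Facebook uses JavaScript to prevent bots from scraping, so the HTML that requests downloads may not contain the span at all. You need a tool such as Selenium, which drives a real browser, to extract the data in Python.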
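
A rough sketch of the browser-based approach follows, under several assumptions: the helper fetch_count_text is hypothetical, the driver argument is expected to behave like a Selenium WebDriver (a get(url) method and a page_source attribute), and the span id is taken from the question. Whether Facebook actually serves the span to an automated, logged-out browser is not something this sketch can guarantee.

```python
from bs4 import BeautifulSoup


def fetch_count_text(driver, url, span_id="count_text"):
    """Load url in a browser driver and return the span's text, or None if absent."""
    # The browser executes the page's JavaScript before we read the HTML.
    driver.get(url)
    soup = BeautifulSoup(driver.page_source, "html.parser")
    span = soup.find("span", id=span_id)
    return span.get_text(strip=True) if span is not None else None
```

With Selenium installed, a driver could be created with `from selenium import webdriver` and `driver = webdriver.Chrome()`, passed to fetch_count_text along with the group URL, and closed afterwards with `driver.quit()`. Passing the driver in keeps the parsing logic separate from the browser setup.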




Answer 4:


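If you have more than one span tag, you can loop over all of them:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')

# soup('span') is shorthand for soup.find_all('span')
tags = soup('span')

for tag in tags:
    # get_text() also works for empty spans, where tag.contents[0] would raise IndexError
    print(tag.get_text())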


Source: https://stackoverflow.com/questions/42175190/get-value-of-span-tag-using-beautifulsoup
