Beautiful Soup find() returns None?

Posted by 我的未来我决定 on 2021-01-29 11:00:40

Question


I am trying to parse the HTML on this website.

I would like to get the text from all these span elements with class = "post-subject"

Examples:

<span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>

<span class="post-subject">Firestick/Old xbox games</span>

When I run the code below, soup.find() returns None, and I'm not sure why.

import requests
from bs4 import BeautifulSoup


page = requests.get('https://trashnothing.com/washington-dc-freecycle?page=1')
soup = BeautifulSoup(page.text, 'html.parser')

soup.find('span', {'class': 'post-subject'})

Answer 1:


To help you get started, the script further down should load the page. You will need to install the correct geckodriver for Firefox, and then you can drive the browser with Selenium. I do not see a post-subject class on the page you linked, but you can automate button clicks for the login, for example:

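from selenium.webdriver.common.by import By  # Selenium 4 removed find_element_by_id

# 'buttonAvailability_1' is only an example id; `driver` is created in the script below.
availbutton = driver.find_element(By.ID, 'buttonAvailability_1')
availbutton.click()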


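from bs4 import BeautifulSoup
from selenium import webdriver

# Needs Firefox and a matching geckodriver available to Selenium.
driver = webdriver.Firefox()
driver.get('https://trashnothing.com/washington-dc-freecycle?page=1')

# page_source is the page as currently loaded in the browser.
html = driver.page_source

# The 'lxml' parser requires the lxml package to be installed.
soup = BeautifulSoup(html, 'lxml')
print(soup.find('span', {'class': 'post-subject'}))

driver.quit()  # close the browser when finished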
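
Before reaching for a browser, it can help to confirm that the element is really missing from the HTML the server returned, rather than being mis-selected. The following is a minimal sketch of that check. The helper name count_matches and both HTML strings are hypothetical stand-ins (for page.text from requests and driver.page_source from Selenium), not content taken from the real site:

```python
from bs4 import BeautifulSoup


def count_matches(html, tag, css_class):
    """Return how many <tag> elements carrying css_class exist in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    return len(soup.find_all(tag, class_=css_class))


# Hypothetical stand-in for page.text: the markup we want is not present.
static_html = '<div id="app"></div>'

# Hypothetical stand-in for driver.page_source: the span is present.
rendered_html = (
    '<div id="app">'
    '<span class="post-subject">Firestick/Old xbox games</span>'
    '</div>'
)

static_count = count_matches(static_html, 'span', 'post-subject')
rendered_count = count_matches(rendered_html, 'span', 'post-subject')
print(static_count, rendered_count)
```

If the count is zero for the requests response but non-zero for the browser's page source, the markup is probably added after the initial response (for example by scripts or after login), and no change of selector will make find() succeed on page.text.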



Answer 2:


I had the same issue. Changing the parser from html.parser to html5lib fixed it for me. Also, since you want every matching span, it is better to use soup.find_all() instead of soup.find(): find() returns only the first match (or None if there is none), while find_all() returns a list of all matches.
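
The difference between find() and find_all() can be sketched on the two example spans from the question. The inline HTML string here is an assumption standing in for the downloaded page, and html.parser is used so the snippet has no extra dependency; 'html5lib' can be passed instead if that package is installed:

```python
from bs4 import BeautifulSoup

# Inline sample built from the two spans quoted in the question.
html = """
<span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>
<span class="post-subject">Firestick/Old xbox games</span>
"""

soup = BeautifulSoup(html, 'html.parser')

# find() returns the first matching tag, or None when nothing matches.
first = soup.find('span', {'class': 'post-subject'})
missing = soup.find('span', {'class': 'no-such-class'})

# find_all() returns every matching tag (an empty list when nothing matches).
subjects = [
    span.get_text(strip=True)
    for span in soup.find_all('span', {'class': 'post-subject'})
]

print(first.get_text())
print(missing)
print(subjects)
```

Because find_all() returns an empty list rather than None, looping over its result is safe even when the page contains no matching elements.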



Source: https://stackoverflow.com/questions/51529482/beautiful-soup-find-returns-none
