Beautiful Soup find() returns None?

Question

I am trying to parse the HTML on this website.

I would like to get the text from all the span elements with class="post-subject".

Examples:

<span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>

<span class="post-subject">Firestick/Old xbox games</span>

When I run my code below, soup.find() returns None. What is going on?

import requests
from bs4 import BeautifulSoup


page = requests.get('https://trashnothing.com/washington-dc-freecycle?page=1')
soup = BeautifulSoup(page.text, 'html.parser')

soup.find('span', {'class': 'post-subject'})

Answer 1:

To help you get started, the second snippet below should load the page. You will need the correct geckodriver for your Firefox version, and then you can drive the browser with Selenium. I do not see a post-subject class on the page you linked, but you can automate button clicks for the login like this:

from selenium.webdriver.common.by import By  # find_element_by_id was removed in Selenium 4

availbutton = driver.find_element(By.ID, 'buttonAvailability_1')
availbutton.click()

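from bs4 import BeautifulSoup
from selenium import webdriver

# Start Firefox and let the browser load and render the page
driver = webdriver.Firefox()
driver.get('https://trashnothing.com/washington-dc-freecycle?page=1')

# Parse the rendered HTML; the 'lxml' parser requires the lxml package
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
print(soup.find('span', {'class': 'post-subject'}))

driver.quit()  # close the browser when done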
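Before reaching for a browser, it may help to check whether the class appears in the raw HTML at all. The following is only a rough sketch: diagnose() is a hypothetical helper, not part of Beautiful Soup, and the hints it returns are guesses about likely causes rather than a definitive diagnosis. You would pass it page.text from the requests call in the question.

```python
from bs4 import BeautifulSoup


def diagnose(html, class_name="post-subject"):
    """Return a rough hint about why find() might return None (hypothetical helper)."""
    # If the class name is not in the downloaded text, no parser can find it.
    if class_name not in html:
        return "class not in raw HTML: possibly rendered by JavaScript or behind a login"
    soup = BeautifulSoup(html, "html.parser")
    # The class name appears somewhere, but not on a <span> the parser can see.
    if soup.find("span", class_=class_name) is None:
        return "class present but no matching span: try another parser or selector"
    return "element found"


if __name__ == "__main__":
    # Made-up inputs for illustration only
    print(diagnose("<html><body><div id='app'></div></body></html>"))
    print(diagnose('<span class="post-subject">Firestick/Old xbox games</span>'))
```

If the first hint comes back for the real page, a browser-driven approach like the Selenium code above is a reasonable next step.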

Answer 2:

I had the same issue. I just changed the parser from html.parser to html5lib and it started working. Also, it's good practice to use soup.find_all() instead of soup.find() here, since you expect more than one matching element: find() returns only the first match, while find_all() returns all of them.
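As a small illustration of the find() versus find_all() difference, here is a sketch that runs on an inline HTML string built from the two example spans in the question (the surrounding div is made up). It uses html.parser so it runs without extra packages; to try the parser change described above, replace "html.parser" with "html5lib", which has to be installed separately (pip install html5lib).

```python
from bs4 import BeautifulSoup

# Inline sample built from the spans quoted in the question; the wrapper div is an assumption
html = """
<div>
  <span class="post-subject">Set of 20 moving boxes (20009 or 20011)</span>
  <span class="post-subject">Firestick/Old xbox games</span>
</div>
"""

# Swap "html.parser" for "html5lib" to test the other parser
soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag, or None if nothing matches
first = soup.find("span", class_="post-subject")

# find_all() returns every matching tag; an empty list if nothing matches
subjects = [tag.get_text(strip=True) for tag in soup.find_all("span", class_="post-subject")]

print(first.get_text(strip=True))
print(subjects)
```

Because find_all() returns an empty list rather than None when there are no matches, looping over its result is safe even when the page does not contain the elements.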

Source: https://stackoverflow.com/questions/51529482/beautiful-soup-find-returns-none

Tags

python

web-scraping

beautifulsoup