Basically, I want to use BeautifulSoup to grab strictly the visible text on a webpage. For instance, this webpage is my test case. And I mainly want to just get the
The simplest way to handle this case is by using getattr()
. You can adapt this example to your needs:
from bs4 import BeautifulSoup
source_html = """
"""
soup = BeautifulSoup(source_html, "lxml")
my_ratings = getattr(soup.find('span', {"class": "ratingsContent"}), "text", None)
print(my_ratings)
This will find the text element,"3.7"
, within the tag object when it exists, however, default to
NoneType
when it does not.
getattr(object, name[, default])
Return the value of the named attribute of object. name must be a string. If the string is the name of one of the object’s attributes, the result is the value of that attribute. For example, getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise, AttributeError is raised.