BeautifulSoup Grab Visible Webpage Text

北恋 2020-11-22 07:35

Basically, I want to use BeautifulSoup to grab strictly the visible text on a webpage. For instance, this webpage is my test case. And I mainly want to just get the

10 Answers
  •  孤独总比滥情好
    2020-11-22 08:17

    The simplest way to handle this case is by using getattr(). You can adapt this example to your needs:

    from bs4 import BeautifulSoup
    
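    # Sample markup; the original tags were stripped from the post, so this
    # span is reconstructed from the find() call below.
    source_html = """
    <span class="ratingsContent">3.7</span>
    """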
    
    soup = BeautifulSoup(source_html, "lxml")
    my_ratings = getattr(soup.find('span', {"class": "ratingsContent"}), "text", None)
    print(my_ratings)
    

    This returns the text, "3.7", of the tag <span class="ratingsContent">3.7</span> when the tag exists, and falls back to None when it does not (find() returns None for a missing tag, and None has no text attribute).

    getattr(object, name[, default])

    Return the value of the named attribute of object. name must be a string. If the string is the name of one of the object’s attributes, the result is the value of that attribute. For example, getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise, AttributeError is raised.
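
    The fallback behaviour described above can be checked without any HTML at all. The following is a minimal sketch using only the standard library; FakeTag and safe_text are hypothetical names invented for illustration (FakeTag merely stands in for a bs4 tag object with a text attribute), and None plays the role of what find() returns when nothing matches.

```python
class FakeTag:
    """Hypothetical stand-in for a bs4 tag: it only carries a .text attribute."""

    def __init__(self, text):
        self.text = text


def safe_text(node, default=None):
    """Return node.text if the attribute exists, otherwise the default."""
    # getattr with a third argument never raises AttributeError;
    # it returns the default instead.
    return getattr(node, "text", default)


found = FakeTag("3.7")   # the tag was located
missing = None           # the tag was not located (find() gives None)

print(safe_text(found))                   # text of the located tag
print(safe_text(missing))                 # falls back to None
print(safe_text(missing, default="n/a"))  # falls back to a custom default

# Without a default, a missing attribute raises AttributeError.
try:
    getattr(missing, "text")
except AttributeError as exc:
    print("AttributeError:", exc)
```

    Passing an explicit default keeps the lookup on one line; the alternative is to test the result of find() for None before touching .text.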
