Complex Beautiful Soup query

盖世英雄少女心 2020-12-17 06:25

Here is a snippet of an HTML file I'm exploring with Beautiful Soup.


    

        
3 Answers
  •  渐次进展
    2020-12-17 06:50

    BeautifulSoup's search methods accept a callable, which the docs appear to recommend for your case: "If you need to impose complex or interlocking restrictions on a tag's attributes, pass in a callable object for name,...". (Granted, they are talking about attributes specifically, but the advice reflects the underlying spirit of the BeautifulSoup API.)

    If you want a one-liner:

    soup.findAll(lambda tag: tag.name == 'a' and
                 tag.findParent('strong', 'sans') and
                 tag.findParent('strong', 'sans').findParent('td', attrs={'width': '50%'}))
    

    I've used a lambda in this example, but in practice you may want to define a named function if you have multiple chained requirements. This lambda has to call findParent('strong', 'sans') twice to avoid raising an exception when a tag has no strong parent; a proper function could store the result and run the test more efficiently.
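
    A minimal sketch of what such a named function might look like, assuming bs4 (Beautiful Soup 4) and its find_parent / find_all spellings. The sample markup and the name is_target_link are hypothetical, inferred only from the selectors in the one-liner above, since the question's original snippet is not shown here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first link satisfies all three conditions.
SAMPLE_HTML = """
<table><tr>
  <td width="50%"><strong class="sans"><a href="/wanted">wanted</a></strong></td>
  <td width="30%"><strong class="sans"><a href="/wrong-td">wrong td</a></strong></td>
  <td width="50%"><a href="/no-strong">no strong</a></td>
</tr></table>
"""


def is_target_link(tag):
    """Match <a> tags under <strong class="sans"> under <td width="50%">."""
    if tag.name != "a":
        return False
    # Look up the <strong class="sans"> ancestor once and reuse the result.
    strong = tag.find_parent("strong", class_="sans")
    if strong is None:
        return False
    return strong.find_parent("td", attrs={"width": "50%"}) is not None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
links = soup.find_all(is_target_link)
hrefs = [a["href"] for a in links]
print(hrefs)  # ['/wanted']
```

    Note that find_parent (like findParent) matches any ancestor, not only the immediate parent, so this accepts links nested more deeply inside the strong or td elements as well.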
