Get contents by class names using Beautiful Soup

有刺的猬 2020-12-28 19:43

Using the Beautiful Soup module, how can I get the contents of a div tag whose class attribute is feeditemcontent cxfeeditemcontent? Is it:

soup.cla

6 answers
  •  既然无缘
    2020-12-28 19:58

    Check this bug report: https://bugs.launchpad.net/beautifulsoup/+bug/410304

    As you can see, Beautiful Soup cannot really understand class="a b" as two classes, a and b.

    However, as the first comment there suggests, a simple regular expression should suffice. In your case:

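    import re
    from bs4 import BeautifulSoup  # 3.x series: from BeautifulSoup import BeautifulSoup
    soup = BeautifulSoup(html_doc)  # html_doc holds the page's HTML
    # \b stops the pattern from matching inside a longer name such as cxfeeditemcontent
    for x in soup.findAll("div", {"class": re.compile(r"\bfeeditemcontent\b")}):
        print("result: %s" % x)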
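
To see why the word boundaries matter, here is a minimal sketch that runs the same pattern against a few class attribute strings using only the standard library. The first string is the one from the question; the other three are made-up values for illustration.

```python
import re

# Same pattern as in the answer: the class name as a whole word.
pattern = re.compile(r"\bfeeditemcontent\b")

class_values = [
    "feeditemcontent cxfeeditemcontent",  # attribute value from the question
    "feeditemcontent",                    # hypothetical: the class on its own
    "cxfeeditemcontent",                  # hypothetical: only the longer name
    "other",                              # hypothetical: unrelated class
]

# Keep the attribute strings in which the pattern is found.
matches = [value for value in class_values if pattern.search(value)]
print(matches)
```

Only the first two strings match, because in "cxfeeditemcontent" there is no word boundary before the "f". One caveat: \b also treats a hyphen as a boundary, so a class such as "feeditemcontent-wide" would match as well.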
    

    Note: This has been fixed in the recent beta. I haven't gone through the docs for the recent versions; maybe you could do that. Or, if you want to get it working with the older version, you can use the code above.
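
For readers on a current release, here is a minimal sketch of how the lookup might be done, assuming Beautiful Soup 4 (the bs4 package) is installed. The sample html_doc is invented for illustration; only the two class names come from the question.

```python
from bs4 import BeautifulSoup

# Made-up markup; only the class names are taken from the question.
html_doc = """
<div class="feeditemcontent cxfeeditemcontent">first post</div>
<div class="other">unrelated</div>
<div class="feeditemcontent">second post</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# class_ matches any div that has feeditemcontent among its classes.
by_class = soup.find_all("div", class_="feeditemcontent")

# A CSS selector can require both classes on the same tag.
both = soup.select("div.feeditemcontent.cxfeeditemcontent")

texts_by_class = [div.get_text() for div in by_class]
texts_both = [div.get_text() for div in both]
print(texts_by_class)
print(texts_both)
```

In Beautiful Soup 4 the class attribute is treated as a list of values, so searching for a single class name finds tags that carry additional classes too, and no regular expression is needed.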
