I have a website that I'm scraping that has a structure similar to the following. I'd like to be able to grab the info out of the CDATA block.
I'm using BeautifulSoup:
    import re
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(content)
    for x in soup.find_all('item'):
        print re.sub('[\[CDATA\]]', '', x.string)