I would like to parse the HD price from the following snippet of HTML. I only have fragments of the HTML code, so I cannot use an HTML parser for this.
The current BeautifulSoup answers only show how to grab every matching tag. This version picks out just the price that belongs to the HD version:
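from bs4 import BeautifulSoup

# The tags were stripped from the original post. The markup below is an
# assumption, reconstructed from the selectors used in the loop.
soup = BeautifulSoup("""
<a href="#">View In iTunes</a>
<span class="price">£19.99</span>
<ul>
  <li>HD Version</li>
</ul>
""", "html.parser")
# For each <li> labelled "HD Version", read the price <span> that precedes its parent list.
for HD_Version in (tag for tag in soup('li') if tag.text.lower() == 'hd version'):
    price = HD_Version.parent.find_previous_sibling('span', attrs={'class': 'price'}).text
    print(price)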
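In general, using regular expressions to parse HTML is asking for trouble: HTML is not a regular language, so a pattern tends to break as soon as the nesting or attribute order changes. Stick with an established parser. BeautifulSoup is built to cope with incomplete or malformed markup, so having only fragments of the page does not rule it out.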
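If installing BeautifulSoup is not possible, the same idea can probably be approximated with the standard library's `html.parser`, which also accepts incomplete fragments. The following is only a rough sketch: the class name `HDPriceParser` and the fragment's markup (a `span` with class `price` followed by an `li` reading "HD Version") are assumptions for illustration, not the asker's real page.

```python
from html.parser import HTMLParser


class HDPriceParser(HTMLParser):
    """Collect the price that precedes each "HD Version" list item."""

    def __init__(self):
        super().__init__()
        self._in_price = False   # currently inside <span class="price">
        self._in_li = False      # currently inside <li>
        self._last_price = None  # most recent price text seen so far
        self.hd_prices = []      # prices paired with an "HD Version" item

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "price" in classes:
            self._in_price = True
        elif tag == "li":
            self._in_li = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_price:
            self._last_price = text
        elif self._in_li and text.lower() == "hd version" and self._last_price:
            self.hd_prices.append(self._last_price)


# Hypothetical fragment: the enclosing <div> and <ul> are never closed.
fragment = """
<div class="row">
<span class="price">£19.99</span>
<ul><li>HD Version</li>
"""

parser = HDPriceParser()
parser.feed(fragment)
parser.close()
print(parser.hd_prices)
```

This pairs each "HD Version" item with the last price seen before it, which only holds if the price comes first in the markup, as assumed here. BeautifulSoup remains the more convenient choice when it is available.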