Trying to scrape some HTML from something like this. Sometimes the data I need is in div[0], sometimes div[1], etc.
Imagine everyone takes 3-5 classes. One of them is al
You can extract them by searching for any <div> element that has score as its class attribute value, and use a regular expression to extract its biology score:
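from bs4 import BeautifulSoup
import sys
import re
# Parse the HTML file named on the command line.
with open(sys.argv[1], 'r') as f:
    soup = BeautifulSoup(f, 'html.parser')
# Check every <div class="score">, whatever its position in the document.
for div in soup.find_all('div', attrs={'class': 'score'}):
    # get_text() also works for divs with nested tags, where div.string is None.
    t = re.search(r'Biology\s+(\S+)', div.get_text())
    if t:
        print(t.group(1))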
Run it like:
python3 script.py htmlfile
That yields:
A+
B
B
B
B
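If you want to try the idea without an input file, here is a minimal, self-contained sketch of the same approach. Note that it swaps BeautifulSoup for the standard library's html.parser so it runs with no extra packages; the SAMPLE markup, the ScoreDivParser class and the biology_scores helper are all made up for illustration and are not taken from the question's real HTML.

```python
import re
from html.parser import HTMLParser

# Same pattern as in the answer: the token that follows "Biology".
BIOLOGY = re.compile(r'Biology\s+(\S+)')


class ScoreDivParser(HTMLParser):
    """Collect the text of every <div class="score"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside a score div; counts nested divs
        self.chunks = []  # text pieces of the score div being read
        self.texts = []   # finished text of each score div

    def handle_starttag(self, tag, attrs):
        if tag != 'div':
            return
        classes = (dict(attrs).get('class') or '').split()
        if self.depth or 'score' in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append(''.join(self.chunks))
                self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def biology_scores(html):
    """Return the Biology score from every score div that mentions Biology."""
    parser = ScoreDivParser()
    parser.feed(html)
    parser.close()
    matches = (BIOLOGY.search(text) for text in parser.texts)
    return [m.group(1) for m in matches if m]


# Assumed markup: the Biology div sits at a different position per student.
SAMPLE = """
<div class="student">
  <div class="score">Biology A+</div>
  <div class="score">Chemistry B</div>
</div>
<div class="student">
  <div class="score">Physics C</div>
  <div class="score">Biology B</div>
</div>
"""

print(biology_scores(SAMPLE))  # ['A+', 'B']
```

Because every score div is checked against the pattern, it does not matter whether Biology is in the first div, the second, or any other.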