Here's the deal. Is there a way to tokenize the strings in a line based on multiple regexes?
One example:
I have to get all href tags, their corresponding
Sounds like you really just want to parse HTML. I recommend looking at any of the wonderful packages for doing so:
Or! You can use a parser like one of the following:
This example is from the BeautifulSoup documentation:
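# BeautifulSoup 3 API; `doc` is assumed to hold the HTML string to parse
from BeautifulSoup import BeautifulSoup, SoupStrainer
import re

# Parse only the <a> tags and skip the rest of the document
links = SoupStrainer('a')
[tag for tag in BeautifulSoup(doc, parseOnlyThese=links)]
# three <a> tags, with link texts: success, experiments, BoogaBooga

# Restrict further to <a> tags whose href matches a regex
linksToBob = SoupStrainer('a', href=re.compile('bob.com/'))
[tag for tag in BeautifulSoup(doc, parseOnlyThese=linksToBob)]
# two <a> tags, with link texts: success, experiments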
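
If you would rather not install anything, the same idea can be sketched with the standard library's `html.parser`. This is only a rough illustration, not the BeautifulSoup approach itself: the `LinkCollector` class, the `extract_links` helper, its `href_pattern` parameter, and the `sample` string are all made up for this example, and it assumes `<a>` tags are not nested.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self, href_pattern=None):
        super().__init__()
        self.href_pattern = href_pattern  # optional regex to filter hrefs
        self.links = []                   # finished (href, text) pairs
        self._href = None                 # href of the <a> currently open
        self._text = []                   # text fragments inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        # Keep the link only if no pattern was given or the href matches it
        if self.href_pattern is None or re.search(self.href_pattern, href):
            self._href = href
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, href_pattern=None):
    """Return a list of (href, link text) pairs found in the HTML string."""
    parser = LinkCollector(href_pattern)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample input, not taken from the question
sample = (
    '<p>See <a href="http://example.com/a">first</a> and '
    '<a href="http://example.org/b">second</a>.</p>'
)
for href, text in extract_links(sample):
    print(href, text)
```

Passing a pattern, e.g. `extract_links(sample, r"example\.com/")`, plays roughly the same role as the `href=re.compile(...)` argument to `SoupStrainer` above: the regex only filters attribute values, while the parser takes care of finding the tags.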