Question
I have this markup:
<a title="Next Page - Results 1 to 60 " href="bla bla" class="smallfont" rel="next">></a>
I want to grab the a element and get its href.
How can I match the title attribute against "Next Page"?
I want to partially match the text in the title attribute of the a element.
There are many similar a tags on the page; the only difference is that this one's title attribute contains "Next Page", or its text is >.
Answer 1:
You can use a regular expression to partially match the title attribute.
First take the entire markup as a string and make a BeautifulSoup object from it.
Then call the object's .findAll method (named find_all in BeautifulSoup 4) as follows:
import re
from bs4 import BeautifulSoup  # BeautifulSoup 4; a bare "import BeautifulSoup" binds the module, not the class

html = '<a title="Next Page - Results 1 to 60 " href="bla bla" class="smallfont" rel="next">></a>'
soup = BeautifulSoup(html, 'html.parser')

# Get all 'a' elements whose 'title' attribute contains 'Next Page' into a list.
# A compiled regex is applied with search(), so it can match anywhere in the value.
elements = soup.find_all('a', {'title': re.compile('Next Page')})

for e in elements:
    if e.string == '>':  # check that the text inside the 'a' tag is '>'
        print(e['href'])
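The question describes the two conditions as alternatives (the title contains "Next Page" or the link text is >). If either one alone should count as a match, a function filter is one option. This is only a minimal sketch, assuming the bs4 package (BeautifulSoup 4) is installed; the helper name is_next_link and the sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup, invented for illustration: three links, two of which qualify.
html = (
    '<a title="Next Page - Results 1 to 60 " href="page2" rel="next">&gt;</a>'
    '<a title="Last Page" href="page9">&gt;</a>'
    '<a title="Previous Page" href="page0">&lt;</a>'
)
soup = BeautifulSoup(html, "html.parser")


def is_next_link(tag):
    """Hypothetical helper: True for <a> tags that satisfy either condition."""
    if tag.name != "a":
        return False
    title = tag.get("title") or ""    # empty string when there is no title
    text = tag.get_text(strip=True)   # visible link text, e.g. '>'
    return "Next Page" in title or text == ">"


# find_all accepts a function and calls it once per tag in the document.
hrefs = [a["href"] for a in soup.find_all(is_next_link)]
print(hrefs)
```

For the title-only case, a CSS attribute selector such as soup.select('a[title*="Next Page"]') is another option in BeautifulSoup 4.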
Source: https://stackoverflow.com/questions/14064186/how-can-i-grab-the-element-by-matching-text-in-its-attribute-in-beautifulsoup