Finding partial matches in an href tag

Submitted by ℡╲_俬逩灬 on 2019-12-21 06:05:02

Question


I am trying to use Beautiful Soup to find all <a> elements where the href attribute includes a certain string.

An example of the full element is:

<a href="/markets/NZSX/securities/ABA">ABA</a>

I am looking for all elements where href includes "/markets/NZSX/securities/".

I am looking to extract the text from this element. This would be ABA in the example.


Answer 1:


There are several ways to achieve that. With .find_all():

import re
soup.find_all("a", href=re.compile(r"^/markets/NZSX/securities/"))
soup.find_all("a", href=lambda href: href and href.startswith("/markets/NZSX/securities/"))

Or, with a CSS selector:

soup.select('a[href^="/markets/NZSX/securities/"]')

The above check that the href starts with /markets/NZSX/securities/. If you want to apply a "contains" check instead:

soup.find_all("a", href=re.compile(r"/markets/NZSX/securities/"))
soup.find_all("a", href=lambda href: href and "/markets/NZSX/securities/" in href)
soup.select('a[href*="/markets/NZSX/securities/"]')
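To get the link text the question asks for, call .get_text() on each matched element. Here is a minimal, self-contained sketch of how this could look. The sample markup (including the example.com link) and the variable names are made up for illustration, and the built-in "html.parser" is assumed as the parser:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<ul>
  <li><a href="/markets/NZSX/securities/ABA">ABA</a></li>
  <li><a href="/markets/NZSX/securities/AIR">AIR</a></li>
  <li><a href="https://example.com/markets/NZSX/securities/XYZ">XYZ</a></li>
  <li><a href="/about">About</a></li>
  <li><a name="no-href">Anchor without href</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
prefix = "/markets/NZSX/securities/"

# "Starts with": only hrefs that begin with the prefix.
starts_with = [a.get_text(strip=True)
               for a in soup.select(f'a[href^="{prefix}"]')]

# "Contains": the prefix may appear anywhere in the href.
contains = [a.get_text(strip=True)
            for a in soup.find_all("a", href=re.compile(re.escape(prefix)))]

# Same "contains" check with a function; the `href and` guard skips
# <a> tags that have no href attribute (href is None for those).
contains_fn = [a.get_text(strip=True)
               for a in soup.find_all("a", href=lambda href: href and prefix in href)]

print(starts_with)  # ['ABA', 'AIR']
print(contains)     # ['ABA', 'AIR', 'XYZ']
```

The absolute URL is matched only by the "contains" variants, which is the practical difference between the two checks.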


Source: https://stackoverflow.com/questions/34759129/finding-partial-matches-in-an-href-tag
