Extracting href with Beautiful Soup

Submitted by 只谈情不闲聊 on 2019-12-30 06:39:15

Question


I use this code to get access to my link:

links = soup.find("span", { "class" : "hsmall" })
links.findNextSiblings('a')
for link in links:
  print link['href']
  print link.string

The link has no ID, class, or anything else; it's just a plain link with an href attribute.

The output of my script is:

print link['href']
TypeError: string indices must be integers

Can you help me get the href value? Thanks!


Answer 1:


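links still refers to the result of your soup.find call, which is the single span tag. findNextSiblings returns a new list and does not modify links, so the loop iterates over the span's children, which include plain strings; indexing a string with 'href' raises the TypeError. Chain the call so that links holds the sibling list. You could do something like: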

links = soup.find("span", { "class" : "hsmall" }).findNextSiblings('a')
for link in links:
    print link['href']
    print link.string
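
For readers on Python 3 with the bs4 package (Beautiful Soup 4), the same fix might look like the sketch below. The HTML snippet and its paths are made up for illustration, and find_next_siblings is the bs4 spelling of findNextSiblings.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (not from the question): a span with class "hsmall"
# followed by sibling links.
html = """
<div>
  <span class="hsmall">Downloads</span>
  <a href="/files/a.zip">First</a>
  <a href="/files/b.zip">Second</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span", class_="hsmall")

# Iterating over a Tag walks its children. Here the only child is a text
# node, which is a str subclass, so indexing it with "href" fails.
children = list(span)
try:
    children[0]["href"]
    raised = False
except TypeError:
    raised = True

# find_next_siblings returns a new list, so its result must be assigned.
links = span.find_next_siblings("a")
hrefs = [link["href"] for link in links]
texts = [link.get_text() for link in links]

for href, text in zip(hrefs, texts):
    print(href, text)
```

The try/except only reproduces the failure from the question; in real code the assignment to links is the part that matters.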



Answer 2:


Okay, it works now with the following code:

linkSpan = soup.find("span", { "class" : "hsmall" })
link = [tag.attrMap['href'] for tag in linkSpan.findAll('a', {'href': True})]
for lien in link:
  print "LINK = " + lien
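
Note that this version searches inside the span (findAll looks at descendants) rather than at its siblings, and the {'href': True} filter skips anchors without an href. As far as I know, attrMap exists only in Beautiful Soup 3, so a rough bs4 equivalent, with invented sample markup, could be:

```python
from bs4 import BeautifulSoup

# Hypothetical markup (not from the question): links nested inside the span,
# one of them without an href attribute.
html = """
<span class="hsmall">
  <a href="/one">One</a>
  <a name="anchor-only">No href here</a>
  <a href="/two">Two</a>
</span>
"""

soup = BeautifulSoup(html, "html.parser")
link_span = soup.find("span", class_="hsmall")

# href=True keeps only <a> tags that actually carry an href attribute,
# so tag["href"] cannot raise KeyError here.
hrefs = [tag["href"] for tag in link_span.find_all("a", href=True)]

for href in hrefs:
    print("LINK = " + href)
```

If the links are siblings of the span rather than children, this search returns an empty list and the sibling-based approach is the one to use.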


Source: https://stackoverflow.com/questions/7183922/extracting-href-with-beautiful-soup
