Extract title with BeautifulSoup

Submitted by 我是研究僧i on 2020-08-27 05:54:31

Question


I have this code:

from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]

from bs4 import BeautifulSoup
raw = BeautifulSoup(html, 'html.parser').get_text()
raw.find_all('title', limit=1)
print (raw.find_all("title"))
'<!doctype html public "-//W3C//DTD HTML 4.0 Transitional//EN'

I want to extract the title of the page using BeautifulSoup, but I am getting this error:

Traceback (most recent call last):
  File "C:\Users\Passanova\AppData\Local\Programs\Python\Python35-32\test.py", line 8, in <module>
    raw.find_all('title', limit=1)
AttributeError: 'str' object has no attribute 'find_all'

Any suggestions, please?


Answer 1:


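To navigate the soup, you need a BeautifulSoup object, not a string. get_text() returns a plain str, which has no find_all() method, so drop that call and keep the BeautifulSoup object itself.

Moreover, you can replace raw.find_all('title', limit=1) with soup.find('title'). The two find the same tag: find_all with limit=1 returns a list holding at most one tag, while find returns that tag directly (or None if nothing matches).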

Try this:

from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
title = soup.find('title')

print(title) # Prints the tag
print(title.string) # Prints the tag string content
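
To make the difference between the failing and the working version concrete, here is a minimal offline sketch. It assumes a small inline HTML string (sample_html is a hypothetical stand-in, not the real BBC page) so that it runs without a network connection.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document standing in for the downloaded page.
sample_html = (
    "<html><head><title>Example page</title></head>"
    "<body><p>Hi</p></body></html>"
)

soup = BeautifulSoup(sample_html, "html.parser")

# get_text() flattens the document to a plain str, which cannot be searched
# with find_all(); this is what caused the AttributeError in the question.
text_only = soup.get_text()
print(type(text_only).__name__)
print(hasattr(text_only, "find_all"))

# find_all(..., limit=1) returns a list with at most one Tag;
# find(...) returns that Tag directly, or None when nothing matches.
matches = soup.find_all("title", limit=1)
first = soup.find("title")
print(len(matches), matches[0] == first)
print(first.string)
```

The point is only that searching has to happen on the soup object; call get_text() afterwards if the flattened text is needed.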



Answer 2:


You can use soup.title directly instead of soup.find_all('title', limit=1) or soup.find('title'); it gives you the first title tag.

from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
title = soup.title
print(title)
print(title.string)
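
One thing to watch for with this shortcut: soup.title is None when the page has no title tag, so title.string would then raise an AttributeError. A possible guard is sketched below; extract_title is a hypothetical helper name, not part of BeautifulSoup, and the inline HTML strings are made up for illustration.

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the page title as a stripped string, or None if there is no usable <title>."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.title  # first <title> Tag in the document, or None
    if title_tag is None or title_tag.string is None:
        return None
    return title_tag.string.strip()


# A page with a title, and one without.
print(extract_title("<html><head><title> Hello </title></head></html>"))
print(extract_title("<html><body><p>No title here</p></body></html>"))
```

Returning None rather than raising is just one design choice; raising a descriptive exception would work equally well if a missing title should be treated as an error.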



Answer 3:


It can be as simple as this:

soup = BeautifulSoup(htmlString, 'html.parser')
title = soup.title.text

Here, soup.title returns the title element as a Tag object, and .text gives the text it contains.
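
The answers above use both .string and .text. For a simple title they give the same result, but they differ once a tag has nested children. The sketch below illustrates this on a hypothetical inline document (html_string here is made up for the example):

```python
from bs4 import BeautifulSoup

# Hypothetical document: a plain title plus a paragraph with a nested <b> tag.
html_string = (
    "<html><head><title>Plain title</title></head>"
    "<body><p>Hello <b>world</b></p></body></html>"
)
soup = BeautifulSoup(html_string, "html.parser")

# A tag with a single text child: .string and .text agree.
title_tag = soup.title
print(title_tag.string == title_tag.text)

# A tag with several children: .string is None, .text joins all nested text.
paragraph = soup.p
print(paragraph.string)
print(paragraph.text)
```

For a title tag, which normally holds only text, either attribute should work; .text is the more forgiving choice if the markup might be unusual.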



Source: https://stackoverflow.com/questions/35956045/extract-title-with-beautifulsoup
