How can I retrieve the page title of a webpage using Python?

南笙 2020-12-07 08:55

How can I retrieve the page title of a webpage (title html tag) using Python?

11 answers
  •  广开言路
    2020-12-07 09:28

    Using HTMLParser:

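    from urllib.request import urlopen
    from html.parser import HTMLParser
    
    
    class TitleParser(HTMLParser):
        """Remembers the text that directly follows the opening <title> tag."""
    
        def __init__(self):
            super().__init__()
            self.match = False  # True only while the last start tag seen was <title>
            self.title = ''
    
        def handle_starttag(self, tag, attributes):
            self.match = tag == 'title'
    
        def handle_data(self, data):
            if self.match:
                self.title = data
                self.match = False
    
    url = "http://example.com/"
    # Decode the response bytes; str(bytes) would yield the "b'...'" repr, not the markup.
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or 'utf-8'
        html_string = response.read().decode(charset, errors='replace')
    
    parser = TitleParser()
    parser.feed(html_string)
    print(parser.title)  # prints: Example Domain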
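
    To check the parsing step without a network request, the same idea can be run on an HTML string. The following is a minimal sketch, not part of the original answer: the names TitleCollector and extract_title and the sample markup are hypothetical. It assumes the title may arrive in more than one handle_data call, so it collects every chunk between the opening and closing title tags and keeps only the first title element.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Hypothetical helper: collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False  # True between <title> and </title>
        self.done = False      # True once the first title has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        # Data may be delivered in several pieces, so accumulate it.
        if self.in_title:
            self.chunks.append(data)


def extract_title(html):
    """Return the stripped text of the first <title>, or '' if there is none."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Made-up sample markup, used only for illustration.
sample = (
    "<html><head><title> Hello &amp; welcome </title></head>"
    "<body><h1>Heading</h1></body></html>"
)
print(extract_title(sample))  # prints: Hello & welcome
```

    A page without a title element gives an empty string, so the caller can tell "no title" apart from a parsing error.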
    
