Convert gzipped data fetched by urllib2 to HTML

鱼传尺愫 2020-12-05 22:22

I currently use mechanize to read a gzipped web page, as shown below:

import mechanize

br = mechanize.Browser()
br.set_handle_gzip(True)
response = br.open(url)
data = response.read()


        
2 Answers
  • 2020-12-05 22:24

    Try this:

    import gzip
    import StringIO

    # Wrap the gzipped string in a file-like object so GzipFile can read it
    buf = StringIO.StringIO(data)
    gzipper = gzip.GzipFile(fileobj=buf)
    html = gzipper.read()
    

    html should now hold the decompressed HTML (print it to check). See here for more info.
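
For reference, the same idea can be sketched in a form that also runs on Python 3, where the StringIO module no longer exists and io.BytesIO takes its place. This is only an illustration: the sample page is made up and compressed locally so the snippet runs without a network connection, and gunzip_bytes is a hypothetical helper name, not something from mechanize or urllib2.

```python
import gzip
import io


def gunzip_bytes(raw):
    """Decompress a gzip-compressed byte string and return the raw bytes."""
    # GzipFile needs a file-like object, so wrap the bytes in an in-memory buffer.
    buf = io.BytesIO(raw)
    gz = gzip.GzipFile(fileobj=buf, mode='rb')
    try:
        return gz.read()
    finally:
        gz.close()


# Made-up sample standing in for the gzipped body returned by response.read().
sample_html = b"<html><body>hello</body></html>"
out = io.BytesIO()
writer = gzip.GzipFile(fileobj=out, mode='wb')
writer.write(sample_html)
writer.close()
data = out.getvalue()

html = gunzip_bytes(data)
print(html.decode('utf-8'))
```

In real use, data would be whatever response.read() returned; the result is bytes, so decode it with the page's charset if you need text.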

  • 2020-12-05 22:34
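    import gzip

    def ungzip(r, b):
        # r: response returned by b.open(); b: the mechanize.Browser that produced it
        headers = r.info()
        # Check both spellings of the header name
        if ('Content-Encoding' in headers.keys() and headers['Content-Encoding'] == 'gzip') or \
           ('content-encoding' in headers.keys() and headers['content-encoding'] == 'gzip'):
            gz = gzip.GzipFile(fileobj=r, mode='rb')
            html = gz.read()
            gz.close()
            # Replace the body with the decompressed HTML and hand it back to the browser
            headers['Content-type'] = 'text/html; charset=utf-8'
            r.set_data(html)
            b.set_response(r)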
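
The header check can also be tried in isolation, without mechanize. The sketch below assumes the headers are available as a plain dict and the body as bytes; maybe_gunzip is a hypothetical helper, and the payloads are made up and compressed locally.

```python
import gzip
import io


def maybe_gunzip(headers, body):
    """Return body decompressed if the headers declare gzip, else unchanged."""
    # HTTP header names are case-insensitive, so normalise them before the lookup.
    lowered = dict((k.lower(), v) for k, v in headers.items())
    if lowered.get('content-encoding', '').strip().lower() != 'gzip':
        return body
    gz = gzip.GzipFile(fileobj=io.BytesIO(body), mode='rb')
    try:
        return gz.read()
    finally:
        gz.close()


# Made-up payload, compressed locally so the example needs no network access.
buf = io.BytesIO()
writer = gzip.GzipFile(fileobj=buf, mode='wb')
writer.write(b"<p>compressed page</p>")
writer.close()
compressed = buf.getvalue()

decoded = maybe_gunzip({'Content-Encoding': 'gzip'}, compressed)
untouched = maybe_gunzip({'Content-Type': 'text/html'}, b"<p>plain page</p>")
```

Returning the body unchanged when the header is absent means the helper is safe to call on every response, whether or not the server compressed it.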
    