Download and unzip file with Python

|▌冷眼眸甩不掉的悲伤 提交于 2019-12-07 16:02:23

问题


I am trying to download and open a zipped file and seem to be having trouble using a file type handle with zipfile. I'm getting the error "AttributeError: addinfourl instance has no attribute 'seek'" when running this:

import zipfile
import urllib2

def download(url,directory,name):
 webfile = urllib2.urlopen('http://www.sec.gov'+url)
 webfile2 = zipfile.ZipFile(webfile)
 content = zipfile.ZipFile.open(webfile2).read()
 localfile = open(directory+name, 'w')
 localfile.write(content)
 localfile.close()
 return()

download(link.get("href"),'./fails_data', link.text)

回答1:


You can't seek on a urllib2.urlopened file. The methods it supports are listed here: http://docs.python.org/library/urllib.html#urllib.urlopen.

You'll have to retrieve the file (possibly with urllib.urlretrieve, http://docs.python.org/library/urllib.html#urllib.urlretrieve), then use zipfile on it.

Alternatively, you could read() the urlopened file, then put it into a StringIO, then use zipfile on that, if you wanted the zipped data in memory. Also check out the extract and extract_all methods of zipfile if you just want to extract the file, instead of using read.




回答2:


Putting things together, the following retrieves the content of the first file within a zipped file from a website:

import urllib
import zipfile

url = 'http://www.gutenberg.lib.md.us/4/8/8/2/48824/48824-8.zip'
filehandle, _ = urllib.urlretrieve(url)
zip_file_object = zipfile.ZipFile(filehandle, 'r')
first_file = zip_file_object.namelist()[0]
file = zip_file_object.open(first_file)
content = file.read()


来源:https://stackoverflow.com/questions/6861323/download-and-unzip-file-with-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!