Download Returned Zip file from URL

余生长醉 提交于 2019-11-27 06:22:52
senderle

Use urllib2.urlopen. The return value is a file-like object that you can read(), pass to zipfile and so on.

yoavram

As far as I can tell, the proper way to do this is:

import requests, zipfile, StringIO
r = requests.get(zip_file_url, stream=True)
z = zipfile.ZipFile(StringIO.StringIO(r.content))
z.extractall()

of course you'd want to check that the GET was successful with r.ok.

For python 3+, sub the StringIO module with the io module and use BytesIO instead of StringIO: Here are release notes that mention this change.

import requests, zipfile, io
r = requests.get(zip_file_url)
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall()

Here's what I got to work in Python 3:

import zipfile, urllib.request, shutil

url = 'http://www....myzipfile.zip'
file_name = 'myzip.zip'

with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
    with zipfile.ZipFile(file_name) as zf:
        zf.extractall()

With the help of this blog post, I've got it working with just requests. The point of the weird stream thing is so we don't need to call content on large requests, which would require it to all be processed at once, clogging the memory. The stream avoids this by iterating through the data one chunk at a time.

url = 'https://www2.census.gov/geo/tiger/GENZ2017/shp/cb_2017_02_tract_500k.zip'
target_path = 'alaska.zip'

response = requests.get(url, stream=True)
handle = open(target_path, "wb")
for chunk in response.iter_content(chunk_size=512):
    if chunk:  # filter out keep-alive new chunks
        handle.write(chunk)
handle.close()

Either use urllib2.urlopen, or you could try using the excellent Requests module and avoid urllib2 headaches:

import requests
results = requests.get('url')
#pass results.content onto secondary processing...

Thanks to @yoavram for the above solution, my url path linked to a zipped folder, and encounter an error of BADZipfile (file is not a zip file), and it was strange if I tried several times it retrieve the url and unzipped it all of sudden so I amend the solution a little bit. using the is_zipfile method as per here

r = requests.get(url, stream =True)
check = zipfile.is_zipfile(io.BytesIO(r.content))
while not check:
    r = requests.get(url, stream =True)
    check = zipfile.is_zipfile(io.BytesIO(r.content))
else:
    z = zipfile.ZipFile(io.BytesIO(r.content))
    z.extractall()

I came here searching how to save a .bzip2 file. Let me paste the code for others who might come looking for this.

url = "http://api.mywebsite.com"
filename = "swateek.tar.gz"

response = requests.get(url, headers=headers, auth=('myusername', 'mypassword'), timeout=50)
if response.status_code == 200:
with open(filename, 'wb') as f:
   f.write(response.content)

I just wanted to save the file as is.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!