Alternative to urllib.urlretrieve in Python 3.5

随声附和 submitted on 2019-12-02 04:54:01

Question


I am currently taking a machine learning course on Udacity. The course code is written for Python 2.7, but I am using Python 3.5, so I am getting an error. This is the code:

import urllib
url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
urllib.urlretrieve(url, filename="../enron_mail_20150507.tgz")
print ("download complete!") 

I tried urllib.request:

  import urllib
  url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
  urllib.request(url, filename="../enron_mail_20150507.tgz")
  print ("download complete!")

But it still gives me an error:

urllib.request(url, filename="../enron_mail_20150507.tgz")
TypeError: 'module' object is not callable

I am using PyCharm as my IDE.


Answer 1:


You'd use urllib.request.urlretrieve. Note that this function "may become deprecated at some point in the future", so you might be better off using the less likely to be deprecated interface:

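# Adapted from the source:
# https://hg.python.org/cpython/file/3.5/Lib/urllib/request.py#l170
import contextlib
import urllib.request

url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
filename = "../enron_mail_20150507.tgz"

with open(filename, 'wb') as out_file:
    with contextlib.closing(urllib.request.urlopen(url)) as fp:
        block_size = 1024 * 8  # copy in 8 KiB chunks
        while True:
            block = fp.read(block_size)
            if not block:  # an empty read means the response is finished
                break
            out_file.write(block)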

For small enough files, you could just read and write the whole thing and drop the loop entirely.
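
As a rough sketch of the two simpler options mentioned in this answer: the first function is the direct Python 3 spelling of the Python 2 call from the question, and the second reads the whole response at once instead of looping. The function names download_with_urlretrieve and download_small are made up for illustration and are not part of urllib.

```python
import urllib.request


def download_with_urlretrieve(url, filename):
    """Python 3 spelling of the Python 2 call urllib.urlretrieve(url, filename)."""
    # urlretrieve returns a (local_filename, headers) tuple.
    local_filename, _headers = urllib.request.urlretrieve(url, filename=filename)
    return local_filename


def download_small(url, filename):
    """Read the whole response into memory, then write it in one call.

    Only sensible when the file comfortably fits in memory.
    """
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with open(filename, "wb") as out_file:
        out_file.write(data)
    return len(data)
```

With the URL from the question, the call would be download_with_urlretrieve(url, "../enron_mail_20150507.tgz"). That archive is large, so the chunked loop is probably the safer choice there than download_small.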




Answer 2:


I know this question has long been answered but I'll contribute for any future viewer.

The proposed solution is good, but the main issue is that it can generate empty files if you are using invalid URLs.

As a workaround to this problem here is how I adapted the code:

import contextlib
from urllib.request import urlopen

def getfile(url, filename, timeout=45):
    with contextlib.closing(urlopen(url, timeout=timeout)) as fp:
        block_size = 1024 * 8
        block = fp.read(block_size)
        if block:
            # Create the output file only after the first block has arrived.
            with open(filename, 'wb') as out_file:
                out_file.write(block)
                while True:
                    block = fp.read(block_size)
                    if not block:
                        break
                    out_file.write(block)
        else:
            raise Exception('nonexisting file or connection error')

I hope this helps.
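
A different way to avoid leftover empty or partial files, which is not what this answer does, is to download into a temporary file and rename it into place only once the transfer has finished. The sketch below assumes that approach; download_atomic is a hypothetical name, and the 45-second default timeout is simply carried over from getfile above.

```python
import os
import shutil
import tempfile
import urllib.request


def download_atomic(url, filename, timeout=45):
    """Download url to filename, leaving no file behind if anything fails."""
    target_dir = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as out_file:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                # copyfileobj streams the response in chunks.
                shutil.copyfileobj(response, out_file)
        # Reached only when the whole download succeeded.
        os.replace(tmp_path, filename)
    except BaseException:
        # Remove the temporary file on any failure, then re-raise.
        os.remove(tmp_path)
        raise
```

If urlopen raises (for example urllib.error.HTTPError for a 404), the exception propagates to the caller and the target path is never created.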



Source: https://stackoverflow.com/questions/38358521/alternative-of-urllib-urlretrieve-in-python-3-5
