Downloading and unzipping a .zip file without writing to disk

囚心锁ツ 2020-12-02 04:42

I have managed to get my first Python script working: it downloads a list of .zip files from a URL, then extracts them and writes them to disk.

9 answers
  •  悲哀的现实
    2020-12-02 05:15

    Vishal's example, however great, is confusing when it comes to the file name, and I do not see the merit of redefining 'zipfile'.

    Here is my example that downloads a zip that contains some files, one of which is a csv file that I subsequently read into a pandas DataFrame:

    from StringIO import StringIO
    from zipfile import ZipFile
    from urllib import urlopen
    import pandas
    
    url = urlopen("https://www.federalreserve.gov/apps/mdrm/pdf/MDRM.zip")
    zf = ZipFile(StringIO(url.read()))
    for item in zf.namelist():
        print("File in zip: "+  item)
    # find the first matching csv file in the zip:
    match = [s for s in zf.namelist() if ".csv" in s][0]
    # the first line of the file contains a string - that line shall be ignored, hence skiprows
    df = pandas.read_csv(zf.open(match), low_memory=False, skiprows=[0])
    

    (Note, I use Python 2.7.13)

    This is the exact solution that worked for me. I just tweaked it slightly for Python 3 by replacing StringIO with BytesIO from the io library.

    Python 3 Version

    from io import BytesIO
    from zipfile import ZipFile
    import pandas
    import requests
    
    url = "https://www.nseindia.com/content/indices/mcwb_jun19.zip"
    content = requests.get(url)
    zf = ZipFile(BytesIO(content.content))
    
    for item in zf.namelist():
        print("File in zip: "+  item)
    
    # find the first matching csv file in the zip:
    match = [s for s in zf.namelist() if ".csv" in s][0]
    # the first line of the file contains a string - that line shall be ignored, hence skiprows
    df = pandas.read_csv(zf.open(match), low_memory=False, skiprows=[0])
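
    For anyone who wants to try the in-memory part without hitting a remote server, here is a minimal sketch of the same idea using only the standard library. It is an illustration under assumptions, not part of the answer above: the helper names build_demo_zip and first_csv_rows are hypothetical, the archive is built in memory as a stand-in for the downloaded bytes (content.content), and the csv module replaces pandas so the snippet runs anywhere. It also matches on the ".csv" suffix rather than a substring.

```python
import csv
import io
import zipfile


def build_demo_zip():
    """Build a small zip archive in memory (stand-in for downloaded bytes)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("readme.txt", "not a csv")
        # The first line is a banner, mirroring the skiprows=[0] case above.
        zf.writestr("data/table.csv", "generated report\nname,value\na,1\nb,2\n")
    return buffer.getvalue()


def first_csv_rows(zip_bytes, skip_lines=1):
    """Return the parsed rows of the first .csv member, never touching disk."""
    # ZipFile needs a seekable binary file object; BytesIO provides one.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        csv_names = [n for n in zf.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no .csv member in archive")
        # Assumes the member is UTF-8 text; adjust the encoding if needed.
        text = zf.read(csv_names[0]).decode("utf-8")
    lines = text.splitlines()[skip_lines:]
    return list(csv.reader(lines))


rows = first_csv_rows(build_demo_zip())
print(rows)
```

    With a real download, the bytes passed to first_csv_rows would be the response body, for example content.content from the requests call above.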
    
