Read .csv file from URL into Python 3.x - _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

傲寒 2020-12-01 03:28

I've been struggling with this simple problem for too long, so I thought I'd ask for help. I am trying to read a list of journal articles from National Library of Medicine

4 answers
  •  一整个雨季
    2020-12-01 03:58

    urlopen will return a urllib.response.addinfourl instance for an ftp request.

    For ftp, file, and data URLs and requests explicitly handled by legacy URLopener and FancyURLopener classes, this function returns a urllib.response.addinfourl object which can work as a context manager...

    >>> import urllib.request
    >>> urllib.request.urlopen(url)
    

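    At this point ftpstream is a file-like object. Calling .read() would return the entire contents at once, but csv.reader requires an iterable that yields one line at a time, and in Python 3 those lines must be str rather than bytes.

    Define a generator that yields the lines like so: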

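    def to_lines(f):
        # Yield the stream one line at a time. On Python 3 readline()
        # returns bytes, so decode to str (UTF-8 is an assumption here).
        line = f.readline()
        while line:
            yield line.decode('utf-8')
            line = f.readline()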
    

    We can create our csv reader like so:

    reader = csv.reader(to_lines(ftpstream))
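
On Python 3 the object returned by urlopen produces bytes, while csv.reader expects str lines, which is what the error in the title complains about. As a minimal sketch of one way to bridge that gap, the block below wraps a binary stream in io.TextIOWrapper. The in-memory raw_stream and its sample data are hypothetical stand-ins for the real response, and UTF-8 is an assumed encoding; whether every response type can be wrapped this way has not been checked here.

```python
import csv
import io

# Stand-in for the bytes stream that urlopen() returns (hypothetical sample data).
raw_stream = io.BytesIO(b"id,name\r\n1,alpha\r\n2,beta\r\n")

# Wrap the binary stream so that iterating it yields str lines.
# UTF-8 is an assumption; newline="" leaves line endings for csv to handle.
text_stream = io.TextIOWrapper(raw_stream, encoding="utf-8", newline="")

# csv.reader now receives str lines and parses them without the bytes error.
rows = list(csv.reader(text_stream))
print(rows)
```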
    

    And with a URL:

    url = "http://pic.dhe.ibm.com/infocenter/tivihelp/v41r1/topic/com.ibm.ismsaas.doc/reference/CIsImportMinimumSample.csv"
    

    The code:

    for row in reader:
        print(row)
    

    Prints:

    ['simpleci']
    ['SCI.APPSERVER']
    ['SRM_SaaS_ES', 'MXCIImport', 'AddChange', 'EN']
    ['CI_CINUM']
    ['unique_identifier1']
    ['unique_identifier2']
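
Putting the pieces together for Python 3, a minimal sketch might look like the following. The helper name read_csv_rows and the UTF-8 default are assumptions, not part of the original answer, and codecs.iterdecode is used here in place of a hand-written generator to turn the bytes lines into str.

```python
import codecs
import csv
import urllib.request


def read_csv_rows(url, encoding="utf-8"):
    """Return every row of the CSV at `url` as a list of strings.

    `read_csv_rows` and the default encoding are illustrative assumptions.
    """
    with urllib.request.urlopen(url) as response:
        # Iterating the response yields bytes lines; decode them lazily to str.
        lines = codecs.iterdecode(response, encoding)
        # Consume the reader inside the with block, before the stream closes.
        return list(csv.reader(lines))
```

Calling read_csv_rows(url) with the url defined above should then return rows like those printed earlier, assuming the file is still reachable and decodes as UTF-8.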
    
