Adapt an iterator to behave like a file-like object in Python

Happy的楠姐 2020-12-13 07:00

I have a generator producing a list of strings. Is there a utility/adapter in Python that could make it look like a file?

For example,

>>> d         


        
8 Answers
  • 2020-12-13 07:43

    The "correct" way to do this is to inherit from one of the standard Python io abstract base classes. However, Python does not appear to let you provide a raw text class and wrap it with a buffered reader of any kind.

    The best class to inherit from is TextIOBase. Here's such an implementation, handling readline and read while being mindful of performance. (gist)

    import io
    
    class StringIteratorIO(io.TextIOBase):
    
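        def __init__(self, iterable):
            # iter() accepts any iterable of str (generator, list, ...) and
            # the parameter no longer shadows the built-in iter.
            self._iter = iter(iterable)
            self._left = ''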
    
        def readable(self):
            return True
    
        def _read1(self, n=None):
            while not self._left:
                try:
                    self._left = next(self._iter)
                except StopIteration:
                    break
            ret = self._left[:n]
            self._left = self._left[len(ret):]
            return ret
    
        def read(self, n=None):
            l = []
            if n is None or n < 0:
                while True:
                    m = self._read1()
                    if not m:
                        break
                    l.append(m)
            else:
                while n > 0:
                    m = self._read1(n)
                    if not m:
                        break
                    n -= len(m)
                    l.append(m)
            return ''.join(l)
    
        def readline(self):
            l = []
            while True:
                i = self._left.find('\n')
                if i == -1:
                    l.append(self._left)
                    try:
                        self._left = next(self._iter)
                    except StopIteration:
                        self._left = ''
                        break
                else:
                    l.append(self._left[:i+1])
                    self._left = self._left[i+1:]
                    break
            return ''.join(l)
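
A minimal sketch of how such an adapter might be consumed. To stay self-contained it defines a cut-down class, `LineIteratorIO` (a hypothetical name, not part of the answer above), that implements only `readline`; iterating over an `io.TextIOBase` object calls `readline` repeatedly, which is all `csv.reader` needs. The sample data is made up for illustration.

```python
import csv
import io


class LineIteratorIO(io.TextIOBase):
    """Cut-down adapter (hypothetical name): only readline() is implemented."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._left = ''

    def readable(self):
        return True

    def readline(self):
        parts = []
        while True:
            i = self._left.find('\n')
            if i != -1:
                # A full line is available in the leftover text.
                parts.append(self._left[:i + 1])
                self._left = self._left[i + 1:]
                break
            # No newline yet: keep what we have and pull the next chunk.
            parts.append(self._left)
            try:
                self._left = next(self._chunks)
            except StopIteration:
                self._left = ''
                break
        return ''.join(parts)


def chunks():
    # Chunk boundaries deliberately do not line up with line boundaries.
    yield 'name,qty\nap'
    yield 'ple,3\npe'
    yield 'ar,5\n'


rows = list(csv.reader(LineIteratorIO(chunks())))
print(rows)
```

The rows come out line by line even though the generator yields arbitrary fragments, because the adapter stitches fragments together until it sees a newline.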
    
  • 2020-12-13 07:46

    Looking at Matt's answer, I can see that it's not always necessary to implement all the read methods. read1 alone may be sufficient. The documentation describes it as:

    Read and return up to size bytes, with at most one call to the underlying raw stream’s read()...

    It can then be wrapped with io.TextIOWrapper, which, for instance, provides an implementation of readline. As an example, here is streaming of a CSV file from S3 (Amazon Simple Storage Service) through boto.s3.key.Key, which implements the iterator protocol for reading.

    import io
    import csv
    
    from boto import s3
    
    
    class StringIteratorIO(io.TextIOBase):
    
        def __init__(self, iterable):
            # Avoid shadowing the built-in iter; accept any iterable.
            self._iterator = iter(iterable)
            self._buffer = ''
    
        def readable(self):
            return True
    
        def read1(self, n=None):
            while not self._buffer:
                try:
                    self._buffer = next(self._iterator)
                except StopIteration:
                    break
            result = self._buffer[:n]
            self._buffer = self._buffer[len(result):]
            return result
    
    
    conn = s3.connect_to_region('some_aws_region')
    bucket = conn.get_bucket('some_bucket')
    key = bucket.get_key('some.csv')

    fp = io.TextIOWrapper(StringIteratorIO(key))
    reader = csv.DictReader(fp, delimiter=';')
    for row in reader:
        print(row)
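
The same read1 idea can be tried without an S3 connection. The following is a sketch for Python 3 under a few assumptions: the class name `BytesIteratorIO` and the sample chunks are hypothetical, the class derives from `io.BufferedIOBase` rather than `io.TextIOBase`, and the iterator yields bytes, since `io.TextIOWrapper` in Python 3 expects the wrapped object's read1 to return a bytes-like object.

```python
import csv
import io


class BytesIteratorIO(io.BufferedIOBase):
    """Hypothetical name: exposes an iterable of bytes chunks through read1()."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._buffer = b''

    def readable(self):
        return True

    def read1(self, size=-1):
        # Refill from the iterator only when the local buffer is empty.
        while not self._buffer:
            try:
                self._buffer = next(self._chunks)
            except StopIteration:
                break
        if size is None or size < 0:
            size = len(self._buffer)
        result = self._buffer[:size]
        self._buffer = self._buffer[size:]
        return result


# Stand-in for the S3 key: any iterable of bytes chunks (made-up data).
chunks = [b'id;name\n1;al', b'ice\n2;bob\n']

fp = io.TextIOWrapper(BytesIteratorIO(chunks), encoding='utf-8', newline='')
rows = list(csv.DictReader(fp, delimiter=';'))
print(rows)
```

Only line-oriented reads are exercised here. The sketch does not implement read(), so a bare fp.read() would likely need that method added as well.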
    

    Update

    Here's an answer to a related question which looks a little better. It inherits from io.RawIOBase and overrides readinto. In Python 3 that is sufficient, so instead of wrapping IterStream in io.BufferedReader one can wrap it directly in io.TextIOWrapper. In Python 2 read1 is also needed, but it can be expressed simply through readinto.
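
The linked answer is not reproduced here, so the following is only a sketch of the readinto approach for Python 3, not that answer's code. The class body and the sample chunks are assumptions made for illustration; the name IterStream is borrowed from the paragraph above.

```python
import io


class IterStream(io.RawIOBase):
    """Raw binary stream over an iterable of bytes chunks (illustrative sketch)."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b''

    def readable(self):
        return True

    def readinto(self, buf):
        # Skip empty chunks; returning 0 is how a raw stream signals EOF.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0
        # Copy as much as fits into the caller's buffer, keep the rest.
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


# Made-up sample data; the two bytes of 'é' are split across chunks on purpose.
chunks = [b'first line\nsecond ', b'', b'line \xc3', b'\xa9\n']

with io.TextIOWrapper(IterStream(chunks), encoding='utf-8') as fp:
    lines = list(fp)

print(lines)
```

io.RawIOBase builds read() and readall() on top of readinto, so the wrapper gets both sized and unsized reads from this single method, and the incremental decoder inside io.TextIOWrapper takes care of multi-byte characters that straddle chunk boundaries.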
