How to pipe binary data into numpy arrays without tmp storage?

一向 2021-01-05 10:59

There are several similar questions but none of them answers this simple question directly:

How can I catch a command's output and stream that content into numpy arrays

2 answers
  •  日久生厌
    2021-01-05 11:30

    You can use Popen with stdout=subprocess.PIPE. Read in the header, then load the rest into a bytearray to use with np.frombuffer.
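
    A minimal sketch of that approach, assuming a made-up stream layout (a 4-byte little-endian element count followed by that many float32 values). The producer command and the `read_array` helper are hypothetical stand-ins for the real command and its real header format:

```python
import struct
import subprocess
import sys

import numpy as np

# Hypothetical producer: writes a 4-byte little-endian element count,
# then that many float32 values. It stands in for the real command.
PRODUCER = (
    "import struct, sys;"
    "vals = [0.5, 1.5, 2.5, 3.5];"
    "sys.stdout.buffer.write(struct.pack('<I', len(vals)));"
    "sys.stdout.buffer.write(struct.pack('<4f', *vals))"
)


def read_array(cmd):
    """Read header + payload from cmd's stdout into one bytearray."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        header = proc.stdout.read(4)             # fixed-size header (assumed)
        (count,) = struct.unpack("<I", header)
        data = bytearray(count * 4)              # preallocate the payload
        view = memoryview(data)
        filled = 0
        while filled < len(data):
            n = proc.stdout.readinto(view[filled:])
            if not n:                            # EOF before the payload is complete
                raise EOFError("stream ended early")
            filled += n
    # frombuffer wraps the bytearray's memory; the payload is not copied.
    return np.frombuffer(data, dtype="<f4")


arr = read_array([sys.executable, "-c", PRODUCER])
print(arr)
```

    Because the header gives the payload size up front, the bytearray can be allocated once at the right size and never needs to grow or shrink.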

    Additional comments based on your edit:

    If you call proc.stdout.read(), that is equivalent to using check_output(): both build a temporary bytes object holding the whole output. If you preallocate data, you can use proc.stdout.readinto(data) instead. If the number of bytes read into data is less than len(data), free the excess memory; otherwise extend data with whatever is left to be read.

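    data = bytearray(2**32)  # preallocate 4 GiB
    n = proc.stdout.readinto(data)  # number of bytes actually read
    if n < len(data):
        del data[n:]  # drop the unused tail (data[n:] = '' fails on Python 3)
    else:
        data += proc.stdout.read()  # buffer was too small; append the rest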
    

    You could also start from a preallocated ndarray ndata and take a writable buffer over it with buf = memoryview(ndata) (np.getbuffer(ndata) is available only under Python 2), then call readinto(buf) as above.
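
    A rough sketch of the ndarray variant, assuming a uint8 array and a hypothetical producer command that writes ten bytes. The array is deliberately oversized so the unused tail can be sliced off afterwards:

```python
import subprocess
import sys

import numpy as np

# Hypothetical producer: emits the bytes 0..9. It stands in for the real command.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(bytes(range(10)))"]

ndata = np.empty(16, dtype=np.uint8)   # deliberately larger than the output
buf = memoryview(ndata)                # writable view on the array's memory

with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
    filled = 0
    while filled < len(buf):
        n = proc.stdout.readinto(buf[filled:])
        if not n:                      # 0 bytes read means EOF
            break
        filled += n

result = ndata[:filled]                # view on the filled part, no copy
print(result)
```

    For a dtype other than uint8, memoryview(ndata).cast('B') should give the byte-wise view that readinto fills, provided the array is C-contiguous.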

    Here's an example to show that the memory is shared between the bytearray and the np.ndarray:

    >>> import numpy as np
    >>> data = bytearray(b'\x01')
    >>> ndata = np.frombuffer(data, np.int8)
    >>> ndata
    array([1], dtype=int8)
    >>> ndata[0] = 2
    >>> data
    bytearray(b'\x02')
    
