How to pipe binary data into numpy arrays without tmp storage?

一向 2021-01-05 10:59

There are several similar questions but none of them answers this simple question directly:

How can I catch a command's output and stream that content into numpy arra

2 Answers
  •  慢半拍i (OP)
    2021-01-05 11:31

    Since your data can easily fit in RAM, I think the easiest way to load the data into a numpy array is to use a ramfs.

    On Linux,

    sudo mkdir /mnt/ramfs
    sudo mount -t ramfs -o size=5G ramfs /mnt/ramfs
    sudo chmod 777 /mnt/ramfs
    

    Then, for example, if this is the producer of the binary data:

    writer.py:

    from __future__ import print_function
    import random
    import struct
    N = random.randrange(100)
    print('a b')
    for i in range(2*N):
        print(struct.pack('
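
A minimal sketch of a producer with this output shape, written for Python 3. The format code `=d`, the helper names `make_payload` and `write_payload`, and the use of `random.random()` for the values are assumptions for illustration, chosen so that each value is a native-endian 8-byte float that a numpy 'f8' field can read; the original writer.py may differ.

```python
import random
import struct


def make_payload(n_rows, names=("a", "b")):
    """Build a header line followed by n_rows records of 8-byte floats."""
    header = (" ".join(names) + "\n").encode("ascii")
    # '=d' packs a native-endian double (8 bytes), which numpy reads as 'f8'
    values = [
        struct.pack("=d", random.random())
        for _ in range(n_rows * len(names))
    ]
    return header + b"".join(values)


def write_payload(stream, n_rows):
    """Write the payload to a binary stream such as sys.stdout.buffer."""
    stream.write(make_payload(n_rows))
```

Calling `write_payload(sys.stdout.buffer, random.randrange(100))` at the end of the script sends the bytes to standard output. Under Python 3, `print()` on a bytes object would emit its repr rather than the raw bytes, which is why the sketch writes to the binary stream instead.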

    Then you could load it into a numpy array like this:

    reader.py:

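    import subprocess
    import sys
    import numpy
    
    def parse_header(f):
        # Reads the header line, which moves the file pointer past it,
        # and returns a dictionary keyed by the column names.
        header = f.readline().decode('ascii')
        d = dict.fromkeys(header.split())
        return d
    
    filename = '/mnt/ramfs/data.out'
    # Binary mode: everything after the header line is raw bytes.
    with open(filename, 'wb') as f:
        cmd = 'writer.py'
        # Run the script with the current interpreter, so it does not
        # need to be executable or on PATH.
        proc = subprocess.Popen([sys.executable, cmd], stdout=f)
        proc.communicate()
    with open(filename, 'rb') as f:
        header = parse_header(f)
        # One 8-byte float field per column name, in header order.
        dt = numpy.dtype([(key, 'f8') for key in header.keys()])
        data = numpy.fromfile(f, dt)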
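
The reader above still goes through a file, even though that file lives in RAM. As an alternative sketch (not the approach this answer takes), the command's output can be captured through a pipe and parsed with `numpy.frombuffer`, assuming the same layout of one header line followed by 'f8' records. The function name `read_piped_records` and the inline stand-in producer are hypothetical and only there to make the example runnable.

```python
import subprocess
import sys

import numpy


def read_piped_records(cmd):
    """Run cmd and parse its stdout: one header line, then 'f8' records."""
    # Capture all of stdout in memory; nothing is written to a filesystem
    out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
    # Split at the first newline only; later 0x0a bytes belong to the payload
    header, _, payload = out.partition(b"\n")
    names = header.decode("ascii").split()
    dt = numpy.dtype([(name, "f8") for name in names])
    # frombuffer returns a read-only view of the bytes; copy() makes it writable
    return numpy.frombuffer(payload, dtype=dt).copy()


# Hypothetical stand-in for the real producer: header, then four doubles
producer_code = (
    "import sys, struct; "
    "sys.stdout.buffer.write(b'a b\\n' + struct.pack('=4d', 1.0, 2.0, 3.0, 4.0))"
)
data = read_piped_records([sys.executable, "-c", producer_code])
```

This holds the whole output in memory twice for a moment (the captured bytes and the copied array), which is acceptable when the data easily fits in RAM. `numpy.frombuffer` raises `ValueError` if the payload length is not a multiple of the record size.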
    
