Is it possible to map discontinuous data on disk to an array with Python?


I want to map a big Fortran record (12 GB) on hard disk to a numpy array. (Mapping instead of loading, to save memory.)

The data stored in the Fortran record is not contiguous.
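One common source of this discontinuity: Fortran sequential unformatted files bracket each record's payload with 4-byte length markers. Under that assumption (the file name, record sizes, and dtype below are placeholders), a minimal sketch that maps each record while skipping the markers:

import numpy as np

# Assumed layout: each record is <4-byte marker><payload><4-byte marker>,
# as written by gfortran for sequential unformatted I/O.
sizes = [1000, 2000, 1500]           # element count of each record (assumed)
itemsize = np.dtype('i4').itemsize   # 4 bytes per int32

records, offset = [], 0
for n in sizes:
    offset += 4                      # skip the leading record marker
    records.append(np.memmap('data.bin', dtype='i4', mode='r',
                             offset=offset, shape=(n,)))
    offset += n * itemsize + 4       # payload plus trailing marker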

1 Answer

    I posted another answer because, for the example given here, numpy.memmap worked:

    import numpy as np

    # size1, size2, size3 are the element counts of the three records
    byte_size = np.dtype('i').itemsize   # 4 bytes for int32

    offset = 0
    data1 = np.memmap('tmp', dtype='i', mode='r+', order='F',
                      offset=offset, shape=(size1,))
    offset += size1*byte_size
    data2 = np.memmap('tmp', dtype='i', mode='r+', order='F',
                      offset=offset, shape=(size2,))
    offset += size2*byte_size            # advance past data2, not data1 again
    data3 = np.memmap('tmp', dtype='i', mode='r+', order='F',
                      offset=offset, shape=(size3,))
    

    For int32, byte_size = 32/8 = 4; for int16, byte_size = 16/8 = 2, and so forth.
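    Rather than computing it by hand, numpy can report the element size directly (a small sketch):

    import numpy as np

    print(np.dtype('int32').itemsize)    # 4
    print(np.dtype('int16').itemsize)    # 2
    print(np.dtype('float64').itemsize)  # 8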

    If the record sizes are constant, you can map the data as a 2D array:

    shape = (total_length//size, size)
    data = np.memmap('tmp', dtype='i', mode='r+', order='F', shape=shape)
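    A short usage sketch (total_length, size, and the file name are assumed; with the default C order, row i is record i):

    import numpy as np

    total_length, size = 1_000_000, 1000     # assumed values
    data = np.memmap('tmp', dtype='i', mode='r+',
                     shape=(total_length // size, size))
    print(data[42])     # read one record without loading the whole file
    data[42] += 1       # modifications are written back to the file
    data.flush()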
    

    You can modify the memmap arrays as much as you like. It is even possible to create arrays that share the same elements; in that case, changes made to one are automatically reflected in the other.
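    A minimal sketch of that sharing behavior (file name and values are placeholders; on typical platforms both memmaps map the same file pages):

    import numpy as np

    a = np.memmap('demo.bin', dtype='i4', mode='w+', shape=(10,))
    a[:] = np.arange(10)

    # second view over the same bytes of the same file
    b = np.memmap('demo.bin', dtype='i4', mode='r+', shape=(5,), offset=0)
    b[0] = 999
    print(a[0])   # 999 -- the change shows up in both views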

    Other references:

    • Working with big data in python and numpy, not enough ram, how to save partial results on disc?

    • The numpy.memmap documentation.
