I want to map a big Fortran record (12G) on hard disk to a numpy array (mapping instead of loading, to save memory). The data stored in the Fortran record is not contiguous.
I posted another answer because, for the example given here, numpy.memmap worked:
import numpy as np

offset = 0
data1 = np.memmap('tmp', dtype='i', mode='r+', order='F',
                  offset=offset, shape=(size1,))
offset += size1 * byte_size
data2 = np.memmap('tmp', dtype='i', mode='r+', order='F',
                  offset=offset, shape=(size2,))
offset += size2 * byte_size   # advance by the size of the segment just mapped
data3 = np.memmap('tmp', dtype='i', mode='r+', order='F',
                  offset=offset, shape=(size3,))
where byte_size is the element size in bytes: for int32, byte_size = 32 // 8 = 4; for int16, byte_size = 16 // 8 = 2, and so forth. The offset must be a whole number of bytes, so use integer division (or simply np.dtype('int32').itemsize).
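As a minimal, self-contained sketch of that layout (the segment lengths size1, size2, size3, the int32 element type, and the scratch file name 'tmp' are all assumptions, not values from the question), the item size can be taken straight from the dtype instead of hard-coding the bit width:

import numpy as np

# Assumed segment lengths, in elements (not bytes).
size1, size2, size3 = 5, 3, 4
dtype = np.dtype('int32')
byte_size = dtype.itemsize            # 4 for int32, 2 for int16, ...

# mode='r+' expects an existing file, so create a scratch file to map.
np.arange(size1 + size2 + size3, dtype=dtype).tofile('tmp')

offset = 0
data1 = np.memmap('tmp', dtype=dtype, mode='r+', offset=offset, shape=(size1,))
offset += size1 * byte_size
data2 = np.memmap('tmp', dtype=dtype, mode='r+', offset=offset, shape=(size2,))
offset += size2 * byte_size
data3 = np.memmap('tmp', dtype=dtype, mode='r+', offset=offset, shape=(size3,))

print(data1[:], data2[:], data3[:])   # three adjacent, non-overlapping views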
If the sizes are constant, you can map the data as a 2D array like:
shape = (total_length // size, size)
data = np.memmap('tmp', dtype='i', mode='r+', order='F', shape=shape)
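As a small usage sketch (the record length, total length, element type, and the 'tmp' file are assumed for illustration); note that with order='F' consecutive elements of the file fill the array column by column:

import numpy as np

size = 4                               # assumed fixed record length, in elements
total_length = 12                      # assumed total number of elements
np.arange(total_length, dtype='i').tofile('tmp')   # scratch file to map

shape = (total_length // size, size)   # integer division: shape entries must be ints
data = np.memmap('tmp', dtype='i', mode='r+', order='F', shape=shape)
print(data.shape)                      # (3, 4)
print(data[:, 0])                      # first column: elements 0, 1, 2 of the file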
You can modify the memmap object as much as you like. It is even possible to create arrays that share the same elements; in that case, changes made in one are automatically reflected in the other.
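As a hedged illustration of that sharing (the 'tmp' file and the sizes are assumptions): two memmaps opened on overlapping regions of the same file see each other's writes on typical platforms, and a plain slice of a memmap is itself a view of the same elements:

import numpy as np

np.zeros(8, dtype='i').tofile('tmp')   # small scratch file

a = np.memmap('tmp', dtype='i', mode='r+', shape=(8,))
b = np.memmap('tmp', dtype='i', mode='r+', shape=(4,),
              offset=4 * a.dtype.itemsize)      # maps elements 4..7 of the file

a[4] = 99          # write through the first mapping
print(b[0])        # 99 -- the second mapping covers the same bytes

view = a[2:6]      # slicing a memmap gives a view, not a copy
view[0] = 7
print(a[2])        # 7 -- the change shows up in the parent array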
Other references:
Working with big data in python and numpy, not enough ram, how to save partial results on disc?
The numpy.memmap documentation.