发表新帖

发表新帖

Efficient Python array with 100 million zeros?

后端未结

关注

 10  710

挽巷 2020-12-08 19:56

What is an efficient way to initialize and access elements of a large array in Python?

I want to create an array in Python with 100 million entries, unsigned 4-byte

10条回答

我在风中等你 (楼主)

2020-12-08 20:26
Just a reminder how Python's integers work: if you allocate a list by saying
```
a = [0] * K
```
you need the memory for the list (sizeof(PyListObject) + K * sizeof(PyObject*)) and the memory for the single integer object 0. As long as the numbers in the list stay below the magic number V that Python uses for caching, you are fine because those are shared, i.e. any name that points to a number n < V points to the exact same object. You can find this value by using the following snippet:
```
>>> i = 0
>>> j = 0
>>> while i is j:
...    i += 1
...    j += 1
>>> i # on my system!
257 
```
This means that as soon as the counts go above this number, the memory you need is sizeof(PyListObject) + K * sizeof(PyObject*) + d * sizeof(PyIntObject), where d < K is the number of integers above V (== 256). On a 64 bit system, sizeof(PyIntObject) == 24 and sizeof(PyObject*) == 8, i.e. the worst case memory consumption is 3,200,000,000 bytes.

With numpy.ndarray or array.array, memory consumption is constant after initialization, but you pay for the wrapper objects that are created transparently, as Thomas Wouters said. Probably, you should think about converting the update code (which accesses and increases the positions in the array) to C code, either by using Cython or scipy.weave.
0 讨论(0)

查看其它10个回答
发布评论:

提交评论
- 加载中...

热议问题