Pythonic way to create a numpy array from a list of numpy arrays

执笔经年 2020-12-13 06:24

I generate a list of one-dimensional numpy arrays in a loop and later convert this list to a 2d numpy array. I would've preallocated a 2d numpy array if I knew the number o

6 answers
  •  醉酒成梦
    2020-12-13 07:03

    What you are doing is the standard way. A numpy array lives in a single, fixed-size block of memory. The only way to get "holes" that I can think of is through the strides member of PyArrayObject, but that doesn't affect the discussion here. Because the block is allocated up front and cannot grow in place, adding a new row or column means allocating new memory, copying the data, and then freeing the old memory. If you do that a lot, it is not very efficient.
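    As a rough illustration of the two points above (strided views and copy-on-grow), here is a minimal sketch. The array contents and the names a, view and bigger are made up for the example; only the numpy calls themselves are real.

```python
import numpy as np

# A small C-ordered array; float64 is chosen so the item size is 8 bytes.
a = np.arange(12, dtype=np.float64).reshape(3, 4)

# Taking every other column gives a view with "holes": it reuses a's
# memory and skips elements through its strides instead of copying.
view = a[:, ::2]
view_shares_memory = np.shares_memory(view, a)
view_is_contiguous = view.flags["C_CONTIGUOUS"]

# Adding a row cannot happen in place: vstack allocates a new block
# and copies the old data into it.
bigger = np.vstack((a, np.zeros(4)))
bigger_shares_memory = np.shares_memory(bigger, a)

print(a.strides, view.strides)
print(view_shares_memory, view_is_contiguous, bigger_shares_memory)
```

    The view shares memory with the original array and is not contiguous, while the array returned by vstack() is a separate allocation.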

    One case where someone might not want to create a list and then convert it to a numpy array at the end is when the list contains a lot of numbers: a numpy array of numbers takes much less space than a native Python list of numbers (since the native Python list stores references to full Python objects). For your typical array sizes, I don't think that is an issue.
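    The space difference can be estimated with a short sketch like the one below. It assumes CPython, where sys.getsizeof reports per-object sizes; the element count n and the variable names are arbitrary choices for the example, and the exact list figure varies by interpreter version and platform.

```python
import sys
import numpy as np

n = 100_000

# A native list holds one reference per element, and each element is a
# separate Python float object with its own header.
values = [float(i) for i in range(n)]
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# A float64 array stores the raw 8-byte values back to back.
arr = np.array(values)
array_bytes = arr.nbytes

print(f"list:  about {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
```

    Note that this compares a flat list of numbers with an array. A list of one-dimensional arrays, as in the question, already stores the numbers compactly inside each array, so the overhead there is only one array object per row.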

    When you create your final array from a list of arrays, you are copying all the data to a new location for the new (2-d in your example) array. This is still much more efficient than having a numpy array and doing next = numpy.vstack((next, new_row)) every time you get new data. vstack() copies all the data accumulated so far each time a "row" is added, so the total amount of copying grows quadratically with the number of rows.
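    The two approaches compared above can be sketched side by side. The helper make_row and the sizes are hypothetical stand-ins for whatever produces each one-dimensional array in the real loop.

```python
import numpy as np

def make_row(i, width=4):
    # Hypothetical placeholder for the code that produces one 1-d array.
    return np.arange(width, dtype=np.float64) + i

n_rows = 5

# Approach 1: collect the rows in a list and convert once at the end.
# The data is copied a single time, when the 2-d array is built.
rows = []
for i in range(n_rows):
    rows.append(make_row(i))
result = np.array(rows)  # np.vstack(rows) gives the same array here

# Approach 2: grow the array with vstack on every iteration.
# Each call allocates a new array and copies all earlier rows again.
grown = np.empty((0, 4))
for i in range(n_rows):
    grown = np.vstack((grown, make_row(i)))

print(result.shape, grown.shape)
```

    Both approaches produce the same 2-d array; the difference is only in how many times the data gets copied along the way.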

    There was a thread on the numpy-discussion mailing list some time ago which discussed the possibility of adding a new numpy array type that allows efficient extending/appending. It seems there was significant interest in this at that time, although I don't know whether anything came out of it. You might want to look at that thread.

    I would say that what you're doing is very Pythonic, and efficient, so unless you really need something else (more space efficiency, maybe?), you should be okay. That is how I create my numpy arrays when I don't know the number of elements in the array in the beginning.
