Irregular Numpy matrix

Submitted by 馋奶兔 on 2019-12-25 16:47:43

Question


In NumPy, it appears that a matrix can be built from a nested list of anything, not just numbers. For example:

import numpy as np

a = [[1,2,5],[3,'r']]
b = np.matrix(a)

generates no complaints.

What is the purpose of this tolerance, when a plain list can already hold objects that do not form a matrix in the strict mathematical sense?


Answer 1:


What you've created is an object dtype array:

In [302]: b=np.array([[1,2,5],[3,'r']])
In [303]: b
Out[303]: array([[1, 2, 5], [3, 'r']], dtype=object)
In [304]: b.shape
Out[304]: (2,)
In [305]: b[0]
Out[305]: [1, 2, 5]
In [306]: b[1]=None
In [307]: b
Out[307]: array([[1, 2, 5], None], dtype=object)

The elements of this array are pointers, referring to objects elsewhere in memory. Like any other array, it has a data buffer. In this case the buffer holds 2 pointers of 4 bytes each (this session ran on a 32-bit build), 8 bytes in total:

In [308]: b.__array_interface__
Out[308]: 
{'data': (169809984, False),
 'descr': [('', '|O')],
 'shape': (2,),
 'strides': None,
 'typestr': '|O',
 'version': 3}
In [309]: b.nbytes
Out[309]: 8
In [310]: b.itemsize
Out[310]: 4
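The pointer behaviour can be checked directly. The following is a minimal sketch, not part of the original session; the names inner, arr, same_object and pointer_size are illustrative, and the array is built with np.empty so that it runs regardless of how a given NumPy version treats ragged input.

```python
import struct
import numpy as np

inner = [1, 2, 5]

# Build a 1-d object array explicitly and store two lists in it.
arr = np.empty(2, dtype=object)
arr[0] = inner
arr[1] = [3, 'r']

# The array holds a reference to the very same list, not a copy.
same_object = arr[0] is inner

# Mutating the list through its original name is visible through the array.
inner.append(99)

# Each slot is one pointer wide: 4 bytes on a 32-bit build, 8 on a 64-bit one.
pointer_size = struct.calcsize("P")
print(same_object, arr[0], arr.itemsize, arr.nbytes, pointer_size)
```

On a 64-bit build, itemsize and nbytes should come out as 8 and 16 rather than the 4 and 8 shown in the session above.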

It is very much like a list, which also stores object pointers in a buffer. It differs in that it has no append method, but it does have all the array methods, such as .reshape.
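A short sketch of that difference, using illustrative names (arr, has_append, column, grown) that are not in the original answer:

```python
import numpy as np

arr = np.empty(2, dtype=object)
arr[0] = [1, 2, 5]
arr[1] = [3, 'r']

# Unlike a list, an ndarray has no in-place append method.
has_append = hasattr(arr, "append")

# Array methods such as reshape still work on the pointers.
column = arr.reshape(2, 1)

# np.append exists as a function, but it returns a new, longer array
# instead of growing the original in place.
grown = np.append(arr, None)
print(has_append, column.shape, grown.shape)
```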

And for many operations, numpy treats such an array like a list - iterating over the pointers, etc. Many of the math operations that work with numeric values fail with object dtypes.
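As a rough illustration (the variable names are made up for this sketch), arithmetic on an object array is carried out by calling each element's own Python operators, so it works only as far as the elements support it:

```python
import numpy as np

nums = np.array([1, 2, 3], dtype=object)

# Addition and multiplication are delegated to each Python int.
total = nums.sum()
doubled = nums * 2

# np.sqrt has no numeric loop for object dtype; it looks for a .sqrt()
# method on each element, which a Python int does not have.
try:
    np.sqrt(nums)
    sqrt_failed = False
except (TypeError, AttributeError):  # the exception type depends on the NumPy version
    sqrt_failed = True

# With mixed contents, the operation fails at the first element
# that does not support it ('r' + 1 is not defined for str).
mixed = np.array([1, 'r'], dtype=object)
try:
    mixed + 1
    mixed_failed = False
except TypeError:
    mixed_failed = True

print(total, doubled, sqrt_failed, mixed_failed)
```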

Why allow this? Partly it's just a generalization, expanding the concept of element values/dtypes beyond the simple numeric and string ones. numpy also allows compound dtypes (structured arrays). MATLAB expanded its matrix class to include cells, which are similar.
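For comparison, here is a minimal structured-array sketch. The field names count and tag are hypothetical, chosen only for this example. Unlike an object array, each record stores its values inline with a fixed layout rather than as pointers.

```python
import numpy as np

# A compound dtype: a 32-bit integer field and a one-character string field.
rec = np.array([(1, 'r'), (3, 'x')],
               dtype=[('count', 'i4'), ('tag', 'U1')])

counts = rec['count']   # a regular int32 array, usable in numeric operations
tags = rec['tag']       # a fixed-width unicode array
print(rec.shape, counts, tags)
```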

I see a lot of questions on SO about object arrays. Sometimes they are produced in error, as in the question "Creating numpy array from list gives wrong shape".

Sometimes they are created intentionally. pandas readily changes a data series to object dtype to accommodate a mix of values (string, nan, int).

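np.array() tries to create as high-dimensional an array as it can, resorting to a 1-d object dtype array only when it can't, for example when the sublists differ in length. (Newer NumPy releases, 1.24 and later, raise a ValueError for such ragged input unless dtype=object is passed explicitly.) In fact, you have to resort to special construction methods to create a 1-d object array of lists when the sublists are all the same length.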

This is still an object array, but the dimension is higher:

In [316]: np.array([[1,2,5],[3,'r',None]])
Out[316]: 
array([[1, 2, 5],
       [3, 'r', None]], dtype=object)
In [317]: _.shape
Out[317]: (2, 3)
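One way to read the "special construction methods" mentioned above is to preallocate an object array and fill it slot by slot. This is a sketch under that assumption, with illustrative names (rows, auto, boxed, ragged):

```python
import numpy as np

rows = [[1, 2, 5], [3, 'r', None]]

# Left to itself, np.array builds the highest-dimensional array it can.
auto = np.array(rows)                  # shape (2, 3), dtype object

# To keep each sublist as a single element, preallocate and assign.
boxed = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    boxed[i] = row                     # shape stays (2,)

# For sublists of unequal length, passing dtype=object explicitly
# gives a 1-d array of lists.
ragged = np.array([[1, 2, 5], [3, 'r']], dtype=object)

print(auto.shape, boxed.shape, ragged.shape)
```

Assigning with boxed[:] = rows would not work here, because NumPy would try to broadcast a (2, 3) value into a (2,) array.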


Source: https://stackoverflow.com/questions/40666695/irregular-numpy-matrix
