Why is it that the numpy array column data type does not get updated?

£可爱£侵袭症+ submitted on 2019-12-11 15:15:20

Question


nd2values[:,[1]]=nd2values[:,[1]].astype(int)
nd2values

outputs

array([['021fd159b55773fba8157e2090fe0fe2', '1',
        '881f83d2dee3f18c7d1751659406144e',
        '012059d397c0b7e5a30a5bb89c0b075e', 'A'],
       ['021fd159b55773fba8157e2090fe0fe2', '1',
        'cec898a1d355dbfbad8c760615fde1af',
        '012059d397c0b7e5a30a5bb89c0b075e', 'A'],
       ['021fd159b55773fba8157e2090fe0fe2', '1',
        'a99f44bbff39e352191a870e17f04537',
        '012059d397c0b7e5a30a5bb89c0b075e', 'A'],
       ...,
       ['fdeb2950c4d5209d449ebd2d6afac11e', '4',
        '4f4e47023263931e1445dc97f7dae941',
        '3cd0b15957ceb80f5125bef8bd1bbea7', 'A'],
       ['fdeb2950c4d5209d449ebd2d6afac11e', '4',
        '021dabc5d7a1404ec8ad34fe8ca4b5e3',
        '3cd0b15957ceb80f5125bef8bd1bbea7', 'A'],
       ['fdeb2950c4d5209d449ebd2d6afac11e', '4',
        'f79a2b5e6190ac3c534645e806f1b611',
        '3cd0b15957ceb80f5125bef8bd1bbea7', 'A']], dtype='<U32')

The data type of the second column is still str. Is it because this particular numpy array has a dtype restriction? How would I change the second column to int? Thanks.

np.array(nd2values,dtype=[str,int,str,str,str])

gives

TypeError: data type not understood

Answer 1:


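The assignment casts your ints back to the dtype of the array, which is a fixed-width string (<U32). A NumPy array has a single dtype shared by all of its elements, so one column cannot be converted on its own. To hold different kinds of objects in one array, set the dtype to object.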

nd2values = nd2values.astype(object)

then

nd2values[:,[1]]=nd2values[:,[1]].astype(int)
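
The effect described above can be checked on a small array. This is only a minimal sketch: the three-column sample data and the name obj_values are made up for illustration and are not the asker's data.

```python
import numpy as np

# Hypothetical sample standing in for the asker's nd2values (all strings).
nd2values = np.array([["a1", "1", "x"],
                      ["b2", "4", "y"]])

# Assigning ints into a string array casts them straight back to strings.
nd2values[:, [1]] = nd2values[:, [1]].astype(int)
print(nd2values.dtype.kind)      # 'U' (still a unicode string array)
print(type(nd2values[0, 1]))     # a str subclass, not an int

# With object dtype each element keeps its own Python type.
obj_values = nd2values.astype(object)
obj_values[:, [1]] = obj_values[:, [1]].astype(int)
print(type(obj_values[0, 1]))    # an integer type
print(obj_values[1, 1] + 1)      # arithmetic now works on the column
```

Note that an object array stores references to Python objects, so it gives up most of NumPy's speed and memory advantages; it is a convenient fix rather than an efficient one.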



Answer 2:


A structured array alternative:

A copy-n-paste from the question gives me a (6,5) array with U32 dtype:

In [96]: arr.shape
Out[96]: (6, 5)

Define a compound dtype:

In [99]: dt = np.dtype([('f0','U32'),('f1',int),('f2','U32'),('f3','U32'),('f4','U1')])

Input to a structured array should be a list of tuples:

In [100]: arrS = np.array([tuple(x) for x in arr], dt)
In [101]: arrS
Out[101]: 
array([('021fd159b55773fba8157e2090fe0fe2', 1, '881f83d2dee3f18c7d1751659406144e', '012059d397c0b7e5a30a5bb89c0b075e', 'A'),
       ('021fd159b55773fba8157e2090fe0fe2', 1, 'cec898a1d355dbfbad8c760615fde1af', '012059d397c0b7e5a30a5bb89c0b075e', 'A'),
       ('021fd159b55773fba8157e2090fe0fe2', 1, 'a99f44bbff39e352191a870e17f04537', '012059d397c0b7e5a30a5bb89c0b075e', 'A'),
       ('fdeb2950c4d5209d449ebd2d6afac11e', 4, '4f4e47023263931e1445dc97f7dae941', '3cd0b15957ceb80f5125bef8bd1bbea7', 'A'),
       ('fdeb2950c4d5209d449ebd2d6afac11e', 4, '021dabc5d7a1404ec8ad34fe8ca4b5e3', '3cd0b15957ceb80f5125bef8bd1bbea7', 'A'),
       ('fdeb2950c4d5209d449ebd2d6afac11e', 4, 'f79a2b5e6190ac3c534645e806f1b611', '3cd0b15957ceb80f5125bef8bd1bbea7', 'A')],
      dtype=[('f0', '<U32'), ('f1', '<i8'), ('f2', '<U32'), ('f3', '<U32'), ('f4', '<U1')])

One field can be accessed by name:

In [102]: arrS['f1']
Out[102]: array([1, 1, 1, 4, 4, 4])
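
For readers who want to try the structured-array approach without the full data, here is a self-contained sketch. The two-row array and the field widths are hypothetical, and the int() call on the second column is written out explicitly here rather than left to NumPy. It also hints at why the dtype=[str,int,str,str,str] attempt in the question fails: a list passed as a dtype has to contain field specifications such as (name, type) pairs, not bare types.

```python
import numpy as np

# Hypothetical small all-string array, standing in for the (6,5) array above.
arr = np.array([["ab", "1", "A"],
                ["cd", "4", "A"]])

# Each field is given as a (name, type) pair.
dt = np.dtype([("f0", "U2"), ("f1", int), ("f2", "U1")])

# One tuple per row; the middle value is converted to int explicitly.
arrS = np.array([(r[0], int(r[1]), r[2]) for r in arr], dtype=dt)

print(arrS.shape)          # one-dimensional: one record per row
print(arrS["f1"])          # integer field, accessed by name
print(arrS["f1"].sum())    # numeric operations work on that field
```

The result is a 1-D array of records rather than a 2-D array, so columns are selected by field name (arrS["f1"]) instead of by a column index.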


Source: https://stackoverflow.com/questions/51291797/why-is-it-that-the-numpy-array-column-data-type-does-not-get-updated
