Question
nd2values[:,[1]]=nd2values[:,[1]].astype(int)
nd2values
Output:
array([['021fd159b55773fba8157e2090fe0fe2', '1',
'881f83d2dee3f18c7d1751659406144e',
'012059d397c0b7e5a30a5bb89c0b075e', 'A'],
['021fd159b55773fba8157e2090fe0fe2', '1',
'cec898a1d355dbfbad8c760615fde1af',
'012059d397c0b7e5a30a5bb89c0b075e', 'A'],
['021fd159b55773fba8157e2090fe0fe2', '1',
'a99f44bbff39e352191a870e17f04537',
'012059d397c0b7e5a30a5bb89c0b075e', 'A'],
...,
['fdeb2950c4d5209d449ebd2d6afac11e', '4',
'4f4e47023263931e1445dc97f7dae941',
'3cd0b15957ceb80f5125bef8bd1bbea7', 'A'],
['fdeb2950c4d5209d449ebd2d6afac11e', '4',
'021dabc5d7a1404ec8ad34fe8ca4b5e3',
'3cd0b15957ceb80f5125bef8bd1bbea7', 'A'],
['fdeb2950c4d5209d449ebd2d6afac11e', '4',
'f79a2b5e6190ac3c534645e806f1b611',
'3cd0b15957ceb80f5125bef8bd1bbea7', 'A']], dtype='<U32')
The data type of the second column is still str. Is it because this particular numpy array has a dtype restriction? How would you change the second column to int? Thanks.
np.array(nd2values,dtype=[str,int,str,str,str])
gives
TypeError: data type not understood
Answer 1:
The assignment casts your ints back to the dtype of the array, which is a string type (<U32). To hold different kinds of objects in one array, set the dtype to object:
nd2values = nd2values.astype(object)
then
nd2values[:,[1]]=nd2values[:,[1]].astype(int)
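The effect can be reproduced on a small array. The following is a minimal sketch, assuming a hypothetical two-row stand-in (`arr`) with shortened strings in place of the `nd2values` data from the question:

```python
import numpy as np

# Hypothetical two-row stand-in for nd2values (shortened strings).
arr = np.array([['021f', '1', 'A'],
                ['fdeb', '4', 'A']])

# Assigning ints into a string array converts them back to strings,
# because every element must share the array's single dtype.
arr[:, 1] = arr[:, 1].astype(int)
print(arr.dtype, type(arr[0, 1]))

# An object array stores references to arbitrary Python objects,
# so the converted column keeps its integer values.
obj = arr.astype(object)
obj[:, 1] = obj[:, 1].astype(int)
print(obj.dtype, type(obj[0, 1]), type(obj[0, 0]))
```

The trade-off is that an object array gives up NumPy's compact storage and most vectorized speed, so it is best suited to small or genuinely mixed data.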
Answer 2:
A structured array alternative:
Copying and pasting the sample from the question gives me a (6,5) array with U32 dtype:
In [96]: arr.shape
Out[96]: (6, 5)
Define a compound dtype:
In [99]: dt = np.dtype([('f0','U32'),('f1',int),('f2','U32'),('f3','U32'),('f4','U1')])
Input to a structured array should be a list of tuples:
In [100]: arrS = np.array([tuple(x) for x in arr], dt)
In [101]: arrS
Out[101]:
array([('021fd159b55773fba8157e2090fe0fe2', 1, '881f83d2dee3f18c7d1751659406144e', '012059d397c0b7e5a30a5bb89c0b075e', 'A'),
('021fd159b55773fba8157e2090fe0fe2', 1, 'cec898a1d355dbfbad8c760615fde1af', '012059d397c0b7e5a30a5bb89c0b075e', 'A'),
('021fd159b55773fba8157e2090fe0fe2', 1, 'a99f44bbff39e352191a870e17f04537', '012059d397c0b7e5a30a5bb89c0b075e', 'A'),
('fdeb2950c4d5209d449ebd2d6afac11e', 4, '4f4e47023263931e1445dc97f7dae941', '3cd0b15957ceb80f5125bef8bd1bbea7', 'A'),
('fdeb2950c4d5209d449ebd2d6afac11e', 4, '021dabc5d7a1404ec8ad34fe8ca4b5e3', '3cd0b15957ceb80f5125bef8bd1bbea7', 'A'),
('fdeb2950c4d5209d449ebd2d6afac11e', 4, 'f79a2b5e6190ac3c534645e806f1b611', '3cd0b15957ceb80f5125bef8bd1bbea7', 'A')],
dtype=[('f0', '<U32'), ('f1', '<i8'), ('f2', '<U32'), ('f3', '<U32'), ('f4', '<U1')])
One field can be accessed by name:
In [102]: arrS['f1']
Out[102]: array([1, 1, 1, 4, 4, 4])
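The same steps can be collected into one self-contained script. This is only a sketch: the rows are hypothetical shortened stand-ins for the 32-character hashes, and the three-field dtype is a reduced version of the one above.

```python
import numpy as np

# Hypothetical shortened rows standing in for the question's data.
arr = np.array([['021f', '1', 'A'],
                ['fdeb', '4', 'A']])

# Each field is described by a (name, type) pair. A bare list of types,
# as tried in the question, is not accepted as a dtype.
dt = np.dtype([('f0', 'U32'), ('f1', int), ('f2', 'U1')])

# One tuple per row marks the record boundaries; the string in the
# second position is converted to an integer for field 'f1'.
arrS = np.array([tuple(x) for x in arr], dtype=dt)

# The field is a genuine integer array, so arithmetic works on it.
total = arrS['f1'].sum()
print(arrS['f1'], total)
```

Unlike the object-dtype approach, each field here keeps a fixed native type, but the result is a 1-d array of records, so columns are selected by field name rather than by a second index.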
Source: https://stackoverflow.com/questions/51291797/why-is-it-that-the-numpy-array-column-data-type-does-not-get-updated