Question
The following is a simplified version of my problem. I want to create a NumPy array of shape (N, 1) whose values are strings. However, when I try to insert a string, only its first character gets stored.
What am I doing wrong here?
>>> import numpy as np
>>> N = 23000
>>> Y = np.empty((N, 1), dtype=str)
>>> Y.shape
(23000, 1)
>>> for i in range(N):
...     Y[i] = "random string"
...
>>> Y[10]
array(['r'], dtype='<U1')
Answer 1:
By default, dtype=str here produces strings of length 1 (dtype '<U1'), so only one character is stored. You can set the maximum length explicitly with a dtype such as np.dtype('U100'). In Un, U stands for Unicode and n is the maximum number of characters per element.
Try the code below:
>>> import numpy as np
>>> N = 23000
>>> Y = np.empty((N, 1), dtype=np.dtype('U100'))
>>> Y.shape
(23000, 1)
>>> for i in range(N):
...     Y[i] = "random string"
...
>>> Y[10]
array(['random string'], dtype='<U100')
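One thing worth keeping in mind with a fixed-width dtype is that strings longer than the declared width are truncated silently, the same way the original U1 array kept only the first character. The sketch below illustrates this with a deliberately small width; the width U5 and the sample values are chosen only for illustration.

```python
import numpy as np

# A fixed-width Unicode array that holds at most 5 characters per element.
Y = np.empty((2, 1), dtype="U5")

Y[0] = "random string"  # longer than 5 characters: cut to fit, no error raised
Y[1] = "abc"            # shorter than 5 characters: stored unchanged

truncated = Y[0, 0]
short = Y[1, 0]

print(repr(truncated))
print(repr(short))
print(Y.dtype)
```

If the longest string is not known in advance, it is safer to pick a generous width or to compute it from the data before allocating the array.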
Answer 2:
Even though you specify dtype=str in np.empty, when you check Y you will see that it is not a general, variable-length string type:
import numpy as np
N = 23000
Y = np.empty((N, 1), dtype=str)
Y
Output:
array([[''],
       [''],
       [''],
       ...,
       [''],
       [''],
       ['']], dtype='<U1')
The dtype is "<U1", which means a Unicode string of length 1.
You can change it to a wider type, for example:
Y = np.empty((N, 1), dtype='U25')
Output for Y[10] after filling the array as in the question:
array(['random string'], dtype='<U25')
I picked 25 arbitrarily for "U25"; you can use any number there. The 25 in U25 means a Unicode string of at most 25 characters, so anything longer is truncated.
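If choosing a width by hand feels fragile, there are a few ways to avoid it. The following is a minimal sketch, not part of the original answers: it assumes the values are either all identical, of arbitrary length, or already available as a Python list. The names Y_full, Y_obj, Y_list, and labels are made up for this example.

```python
import numpy as np

N = 5

# Option 1: np.full infers the width from the fill value when dtype is omitted.
Y_full = np.full((N, 1), "random string")

# Option 2: an object array stores references to ordinary Python str objects,
# so there is no fixed width (at the cost of losing the compact string layout).
Y_obj = np.empty((N, 1), dtype=object)
for i in range(N):
    Y_obj[i, 0] = "random string number %d" % i

# Option 3: build the array from existing data; NumPy sizes the dtype to the
# longest string in the list.
labels = ["a", "random string", "xyz"]
Y_list = np.array(labels).reshape(-1, 1)

print(Y_full.dtype, Y_obj.dtype, Y_list.dtype)
```

Options 1 and 3 still produce fixed-width arrays, so assigning a longer string to them later is truncated as well; only the object array accepts strings of any length afterwards.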
Source: https://stackoverflow.com/questions/55377213/numpy-taking-only-first-character-of-string