Numpy taking only first character of string

こ雲淡風輕ζ submitted on 2021-02-07 07:52:50

Question


The following is a simplified version of my problem. I want to create a NumPy array of shape (N, 1) whose values are strings. However, when I try to insert a string, only its first character gets stored.

What am I doing wrong here?

>>> import numpy as np
>>> N = 23000
>>> Y = np.empty((N, 1), dtype=str)
>>> Y.shape
(23000, 1)
>>> for i in range(N):
...     Y[i] = "random string"
...
>>> Y[10]
array(['r'], dtype='<U1')

Answer 1:


With dtype=str, np.empty creates a fixed-width Unicode array with a length of 1 (shown as '<U1'), so only one character is stored per element. You can set the maximum length explicitly with np.dtype('U100'). In Un, U stands for Unicode and n is the maximum number of characters per element.

Try the code below:

>>> import numpy as np
>>> N = 23000
>>> Y = np.empty((N, 1), dtype=np.dtype('U100'))
>>> Y.shape
(23000, 1)
>>> for i in range(N):
...     Y[i] = "random string"
...
>>> Y[10]
array(['random string'], dtype='<U100')
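
One thing worth keeping in mind with a fixed width such as U100: strings longer than the width are cut off silently rather than raising an error. The sketch below illustrates this with a deliberately small width. It also shows dtype=object as a possible alternative, which is not part of the answer above and is only an assumption about what may suit your use case. The names short and flexible are made up for illustration.

```python
import numpy as np

# A fixed-width Unicode array keeps at most n characters per element.
short = np.empty((3, 1), dtype="U5")
short[:] = "random string"   # longer than 5 characters, so it is truncated
print(short[0, 0])           # rando

# Assumption: an object array is acceptable here. It stores references to
# ordinary Python str objects, so there is no per-element length limit,
# at the cost of losing NumPy's compact fixed-width storage.
flexible = np.empty((3, 1), dtype=object)
flexible[:] = "random string"
print(flexible[0, 0])        # random string
```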



Answer 2:


Even though you specify dtype=str in np.empty, the resulting array does not hold arbitrary-length strings, as you can see when you inspect Y:

import numpy as np
N = 23000
Y = np.empty((N, 1), dtype=str)
Y

Output:

array([[''],
       [''],
       [''],
       ...,
       [''],
       [''],
       ['']], dtype='<U1')

The dtype is "<U1".

This means each element is a Unicode string with a maximum length of 1, which is why only the first character is kept.
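
To check this yourself, you can inspect the dtype directly. A small sketch (the array size of 4 and the name wide are arbitrary choices for illustration): NumPy stores each character of a U dtype as a 4-byte code point, so itemsize grows with the declared width.

```python
import numpy as np

Y = np.empty((4, 1), dtype=str)
print(Y.dtype)              # <U1 on a little-endian machine
print(Y.dtype.itemsize)     # 4 (bytes per element: 1 character x 4 bytes)

wide = np.empty((4, 1), dtype="U25")
print(wide.dtype.itemsize)  # 100 (25 characters x 4 bytes)
```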

You can change it to:

Y = np.empty((N, 1), dtype='U25')

After running the same assignment loop, the output for Y[10] is:

array(['random string'], dtype='<U25')

The 25 in "U25" is an arbitrary choice; it means a Unicode string of at most 25 characters. You can use any number there, as long as it is at least as large as the longest string you want to store.
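
If you would rather not guess the width, you can derive it from the data. The sketch below assumes all strings are known up front; the sample list values and the names width and column are hypothetical. It shows two ways: computing the width explicitly, and letting np.array infer it.

```python
import numpy as np

# Hypothetical sample data; in practice this would be your own strings.
values = ["random string", "short", "a considerably longer string"]

# Option 1: size the dtype from the longest string, then fill the array.
width = max(len(v) for v in values)
Y = np.empty((len(values), 1), dtype=f"U{width}")
Y[:, 0] = values

# Option 2: let NumPy infer the width, then reshape to (N, 1).
column = np.array(values).reshape(-1, 1)

print(Y.dtype == column.dtype)  # True
print(Y[2, 0])                  # a considerably longer string
```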



Source: https://stackoverflow.com/questions/55377213/numpy-taking-only-first-character-of-string
