Map a NumPy array of strings to integers

前端未结


Problem:

Given an array of string data:

dataSet = np.array(['kevin', 'greg', 'george', 'kevin'], dtype='U21')


2 Answers

长发绾君心

2020-12-07 01:42

np.searchsorted does the trick:

import numpy as np
dataSet = np.array(['kevin', 'greg', 'george', 'kevin'], dtype='U21')  # no trailing comma, or dataSet becomes a tuple
lut = np.unique(dataSet)             # array(['george', 'greg', 'kevin'], dtype='<U21'); np.unique already sorts
ind = np.searchsorted(lut, dataSet)  # array([2, 1, 0, 2])
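
One caveat with this approach: np.searchsorted returns an insertion point even for a string that is not in the lookup table, so unknown values are silently mapped to a wrong index. Below is a minimal sketch of a guarded version. The helper name `encode` and the choice to raise `ValueError` are my own assumptions, not part of the answer above.

```python
import numpy as np

dataSet = np.array(['kevin', 'greg', 'george', 'kevin'], dtype='U21')

# np.unique returns its result sorted, which np.searchsorted requires.
lut = np.unique(dataSet)

def encode(values, lut):
    """Hypothetical helper: map each string in `values` to its index in sorted `lut`."""
    values = np.asarray(values)
    idx = np.searchsorted(lut, values)
    # For a string missing from lut, searchsorted still returns an insertion
    # point (possibly len(lut)), so clip before checking that every value matched.
    clipped = np.clip(idx, 0, len(lut) - 1)
    if not np.array_equal(lut[clipped], values):
        raise ValueError("some values are not present in the lookup table")
    return idx

ind = encode(dataSet, lut)
print(ind)  # [2 1 0 2]
```

With the check in place, a call such as encode(['harry'], lut) raises instead of returning the index of 'kevin'.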


粉色の甜心

2020-12-07 01:46
You can use np.unique with the return_inverse argument:
```
>>> lookupTable, indexed_dataSet = np.unique(dataSet, return_inverse=True)
>>> lookupTable
array(['george', 'greg', 'kevin'], 
      dtype='<U21')
>>> indexed_dataSet
array([2, 1, 0, 2])
```
If you like, you can reconstruct your original array from these two arrays:
```
>>> lookupTable[indexed_dataSet]
array(['kevin', 'greg', 'george', 'kevin'], 
      dtype='<U21')
```
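If you use pandas, indexed_dataSet, lookupTable = pd.factorize(dataSet) will achieve much the same thing (and may be more efficient for large arrays). Note that pd.factorize returns the integer codes first and the unique values second, and by default it numbers the values in order of first appearance; pass sort=True to get the same sorted lookup table as np.unique.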
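
A short sketch of the pandas route, assuming pandas is installed. It shows the return order of pd.factorize (codes, then uniques) and the effect of sort=True; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

dataSet = np.array(['kevin', 'greg', 'george', 'kevin'], dtype='U21')

# Default: codes are assigned in order of first appearance.
codes, uniques = pd.factorize(dataSet)
print(codes)    # [0 1 2 0]

# sort=True: uniques are sorted, so the codes match
# np.unique(dataSet, return_inverse=True).
codes_sorted, uniques_sorted = pd.factorize(dataSet, sort=True)
print(codes_sorted)  # [2 1 0 2]

# Either way, indexing the uniques with the codes rebuilds the input.
restored = uniques_sorted[codes_sorted]
```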