I would like to translate arbitrary integers in a numpy array to a contiguous range 0...n, like this:
source: [2 3 4 5 4 3]
translating [2 3 4 5] -> [0 1
IIUC you can simply use np.unique's optional argument return_inverse, like so -
np.unique(source,return_inverse=True)[1]
Sample run -
In [44]: source
Out[44]: array([2, 3, 4, 5, 4, 3])
In [45]: np.unique(source,return_inverse=True)[1]
Out[45]: array([0, 1, 2, 3, 2, 1])
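If the mapping itself is also needed, the first element returned by np.unique holds the sorted unique values, and indexing it with the inverse array reconstructs the input. A minimal sketch (the names uniques and codes are mine, not from the question):

```python
import numpy as np

source = np.array([2, 3, 4, 5, 4, 3])

# uniques: sorted distinct values; codes: position of each element in uniques
uniques, codes = np.unique(source, return_inverse=True)

print(uniques)         # [2 3 4 5]
print(codes)           # [0 1 2 3 2 1]
print(uniques[codes])  # [2 3 4 5 4 3], i.e. the original array
```

So uniques[k] is the original integer that was translated to k.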
pandas.factorize is one method:
import pandas as pd
lst = [2, 3, 4, 5, 4, 3]
res = pd.factorize(lst, sort=True)[0]
# [0 1 2 3 2 1]
Note: pd.factorize returns a (codes, uniques) tuple, hence the [0]; the codes are an np.ndarray, just like the inverse array returned by np.unique.
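The sort=True argument matters when the values do not first appear in ascending order: by default pd.factorize numbers values in order of first appearance, whereas np.unique always sorts. A small sketch with made-up data (I pass an np.ndarray rather than a list because I believe newer pandas versions warn about plain-list input; treat that as an assumption):

```python
import numpy as np
import pandas as pd

arr = np.array([5, 2, 5, 3])  # made-up data, not from the question

# Default: codes follow the order in which values first appear
codes_first_seen, _ = pd.factorize(arr)

# sort=True: codes follow the sorted order of the unique values
codes_sorted, uniques = pd.factorize(arr, sort=True)

print(codes_first_seen)  # [0 1 0 2]
print(codes_sorted)      # [2 0 2 1]
print(np.unique(arr, return_inverse=True)[1])  # [2 0 2 1]
```

With sort=True the codes match the np.unique result; without it they only agree when the input happens to introduce its values in ascending order, as in the question's example.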