How do I factorize a list of tuples?

春和景丽 2020-12-06 10:43

Definition
factorize: Map each unique object to a unique integer. Typically, the integers mapped to range from zero to n - 1, where n is
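
To make the definition concrete, here is a minimal plain-Python sketch. It is an illustration only, not one of the methods from the answers; the name `factorize_first_seen` is hypothetical, and it assumes the items are hashable and that ids are handed out in order of first appearance.

```python
def factorize_first_seen(items):
    """Map each distinct hashable item to an integer id (assumed: first-seen order)."""
    ids = {}
    # setdefault stores len(ids) only the first time an item is seen,
    # so repeated items get back the id they were given earlier.
    return [ids.setdefault(item, len(ids)) for item in items]


tups = [(1, 2), ('a', 'b'), (3, 4), ('c', 5), (6, 'd'), ('a', 'b'), (3, 4)]
codes = factorize_first_seen(tups)
print(codes)  # [0, 1, 2, 3, 4, 1, 2]
```

The five distinct tuples get the ids 0 through 4, and the two repeated tuples reuse the ids of their first occurrences.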

6 answers
  •  忘掉有多难
    2020-12-06 10:58

    Approach #1

    Convert each tuple to a row of a 2D array, view each row as a single scalar using NumPy ndarray views, and finally use np.unique(..., return_inverse=True) to factorize -

    np.unique(get_row_view(np.array(tups)), return_inverse=True)[1]
    

    get_row_view is a helper function taken from a separate answer.
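
Since get_row_view is not defined in this answer, here is a minimal sketch of what such a helper could look like. The body is an assumption, not the linked code: it supposes the helper reinterprets the raw bytes of each row as one np.void scalar, so that np.unique can treat a whole row as a single item.

```python
import numpy as np


def get_row_view(a):
    # Assumed implementation: view each row's bytes as a single void scalar.
    a = np.ascontiguousarray(a)        # each row must be contiguous in memory
    a2d = a.reshape(a.shape[0], -1)    # flatten any trailing dimensions
    void_dt = np.dtype((np.void, a2d.dtype.itemsize * a2d.shape[1]))
    return a2d.view(void_dt).ravel()   # one void element per row


tups = [(1, 2), ('a', 'b'), (3, 4), ('c', 5), (6, 'd'), ('a', 'b'), (3, 4)]
arr = np.array(tups)                   # mixed ints and strings are all stored as strings
ids = np.unique(get_row_view(arr), return_inverse=True)[1]
print(ids)
```

Note that np.array(tups) converts the integers to strings here, so a tuple such as (1, 2) and a tuple ('1', '2') would receive the same id under this approach.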

    Sample run -

    In [23]: tups
    Out[23]: [(1, 2), ('a', 'b'), (3, 4), ('c', 5), (6, 'd'), ('a', 'b'), (3, 4)]
    
    In [24]: np.unique(get_row_view(np.array(tups)), return_inverse=1)[1]
    Out[24]: array([0, 3, 1, 4, 2, 3, 1])
    

    Approach #2

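    import numpy as np
    
    def argsort_unique(idx):
        # Original idea : https://stackoverflow.com/a/41242285/3293881 
        # Invert the permutation idx without sorting again
        n = idx.size
        sidx = np.empty(n,dtype=int)
        sidx[idx] = np.arange(n)
        return sidx
    
    def unique_return_inverse_tuples(tups):
        a = np.array(tups)
        # Sort the rows; np.lexsort uses the last key (second column) as primary
        sidx = np.lexsort(a.T)
        b = a[sidx]
        # True where a sorted row differs from the previous one (2-tuples only)
        mask0 = ~((b[1:,0] == b[:-1,0]) & (b[1:,1] == b[:-1,1]))
        # The running count of changes gives each sorted row its integer id
        ids = np.concatenate(([0], mask0  ))
        np.cumsum(ids, out=ids)
        # Undo the sort so the ids line up with the input order
        return ids[argsort_unique(sidx)]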
    

    Sample run -

    In [69]: tups
    Out[69]: [(1, 2), ('a', 'b'), (3, 4), ('c', 5), (6, 'd'), ('a', 'b'), (3, 4)]
    
    In [70]: unique_return_inverse_tuples(tups)
    Out[70]: array([0, 3, 1, 2, 4, 3, 1])
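
The two sample runs return different arrays because the rows are sorted in a different order (np.lexsort uses the last key, here the second tuple element, as the primary sort key), yet both are valid factorizations. A small sketch to check that two labelings describe the same grouping follows; `same_grouping` is a hypothetical helper name, not part of NumPy or of the answer.

```python
def same_grouping(ids_a, ids_b):
    # The labelings agree up to renaming iff pairing them position by position
    # yields a one-to-one mapping between the labels of a and the labels of b.
    pairs = set(zip(ids_a, ids_b))
    return len(pairs) == len(set(ids_a)) == len(set(ids_b))


out_1 = [0, 3, 1, 4, 2, 3, 1]   # Approach #1 sample output
out_2 = [0, 3, 1, 2, 4, 3, 1]   # Approach #2 sample output
print(same_grouping(out_1, out_2))  # True
```

If the integer ids must follow a particular order, such as first appearance, neither approach guarantees it; both assign ids in sorted order.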
    
