multi-column factorize in pandas

长发绾君心 2020-12-28 09:35

The pandas factorize function assigns each unique value in a series to a sequential, 0-based index, and calculates which index each series entry belongs to.
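
To make that behaviour concrete, here is a minimal sketch of `pd.factorize` on a single Series. The sample values are hypothetical and chosen only for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the behaviour.
s = pd.Series(["b", "a", "b", "c", "a"])

# uniques holds the distinct values in order of first appearance;
# codes[i] is the 0-based position of s[i] within uniques.
codes, uniques = pd.factorize(s)

print(codes.tolist())    # [0, 1, 0, 2, 1]
print(uniques.tolist())  # ['b', 'a', 'c']
```

The codes follow order of first appearance, so "b" gets 0 even though it sorts after "a".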

4 answers
  •  自闭症患者
    2020-12-28 10:00

    I am not sure this is an efficient solution, and there may be better ones. The idea is to collect the unique rows first, then look up each row's position among them.

    arr = []  # unique rows of the dataframe, in order of first appearance
    for i in range(len(df)):  # iterate by position, because iloc is positional
        row = df.iloc[i].tolist()
        if row not in arr:
            arr.append(row)
    

    Printing arr then gives:

    >>> print(arr)
    [[1, 1], [1, 2], [2, 2]]
    

    To hold the indices, I would declare a second list, ind:

    ind = []  # for each row, its position in arr
    for i in range(len(df)):
        ind.append(arr.index(df.iloc[i].tolist()))
    

    Printing ind gives:

    >>> print(ind)
    [0, 1, 2, 2, 1, 0]
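
The loops above rescan arr for every row, so a vectorized version may scale better on large frames. The following is a minimal sketch, not the answerer's method. It assumes a DataFrame with two columns named x and y (hypothetical names), with values chosen so the rows match the arr and ind output above. It shows two options: factorizing a Series of row tuples with `pd.factorize`, and numbering groups with `groupby(..., sort=False).ngroup()`.

```python
import pandas as pd

# Assumed input: the column names "x" and "y" are hypothetical, and the
# values are chosen to reproduce the rows implied by the answer's output.
df = pd.DataFrame({"x": [1, 1, 2, 2, 1, 1], "y": [1, 2, 2, 2, 2, 1]})

# Option 1: pack each row into a tuple, then factorize the tuples.
row_keys = pd.Series(list(zip(df["x"], df["y"])), index=df.index)
codes, _ = pd.factorize(row_keys)
print(codes.tolist())      # [0, 1, 2, 2, 1, 0]

# Option 2: number each (x, y) group; sort=False keeps the groups in
# order of first appearance instead of sorted order.
group_ids = df.groupby(["x", "y"], sort=False).ngroup()
print(group_ids.tolist())  # [0, 1, 2, 2, 1, 0]
```

Both options give the same codes as ind for this data. Which one is faster is likely to depend on the size of the frame and the column dtypes, so it is worth timing them on real data.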
    
