multi-column factorize in pandas

长发绾君心 2020-12-28 09:35

The pandas factorize function assigns each unique value in a series to a sequential, 0-based index, and calculates which index each series entry belongs to.
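
For reference, here is a minimal sketch of that single-column behaviour. The sample values are made up for illustration and do not come from the question; the input is passed as a Series.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
s = pd.Series(["b", "a", "b", "c", "a"])

# codes[i] is the 0-based position of s[i] in uniques;
# uniques lists the distinct values in order of first appearance.
codes, uniques = pd.factorize(s)

print(codes.tolist())    # [0, 1, 0, 2, 1]
print(uniques.tolist())  # ['b', 'a', 'c']
```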

4 answers

自闭症患者 (original poster)

2020-12-28 10:00
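I am not sure this is an efficient solution, since every membership test scans the list linearly. There might be better solutions for this.
```
arr = []  # unique rows of the DataFrame, in order of first appearance
for i in range(len(df)):  # loop by position, because iloc is position-based
    row = df.iloc[i].tolist()  # plain Python values, so the list prints cleanly
    if row not in arr:
        arr.append(row)
```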
Printing arr would then give you:
```
>>> print(arr)
[[1, 1], [1, 2], [2, 2]]
```
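To hold the indices, I would declare an ind list:
```
ind = []  # for each row of df, the position of that row in arr
for i in range(len(df)):  # loop by position, because iloc is position-based
    ind.append(arr.index(df.iloc[i].tolist()))
```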
Printing ind would give:
```
>>> print(ind)
[0, 1, 2, 2, 1, 0]
```
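
If the loops turn out to be too slow, one possible alternative is to let pandas do the grouping. This is only a sketch, not something from the answer above: it swaps the hand-written loops for groupby(...).ngroup(), and both the DataFrame and the column names x and y are assumptions, chosen so that they reproduce the arr and ind shown above.

```python
import pandas as pd

# Assumed input: the thread does not show df, so this frame is
# reconstructed to reproduce the arr and ind printed above.
df = pd.DataFrame({"x": [1, 1, 2, 2, 1, 1], "y": [1, 2, 2, 2, 2, 1]})

# sort=False numbers the groups in order of first appearance,
# which matches what factorize does for a single column.
codes = df.groupby(["x", "y"], sort=False).ngroup()

# One row per distinct (x, y) combination, in the same order as the codes.
uniques = df.drop_duplicates().reset_index(drop=True)

print(codes.tolist())           # [0, 1, 2, 2, 1, 0]
print(uniques.values.tolist())  # [[1, 1], [1, 2], [2, 2]]
```

Leaving sort at its default would number the groups in sorted key order instead, which happens to give the same result for this particular data but not in general.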