Convert categorical data in pandas dataframe

予麋鹿 2020-11-27 10:01

I have a DataFrame with this kind of data (and a large number of columns):

col1        int64
col2        int64
col3        category
col4        category
col5        categ         


        
10 answers
  •  -上瘾入骨i
    2020-11-27 10:34

    If your only concern is that you are creating an extra column and deleting it later, just don't create a new column in the first place.

    import pandas as pd
    dataframe = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':list('abcab'),  'col3':list('ababb')})
    dataframe.col3 = pd.Categorical.from_array(dataframe.col3).codes
    

    That's it. Since Categorical.from_array is deprecated, use the Categorical constructor directly:

    dataframe.col3 = pd.Categorical(dataframe.col3).codes
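
To make the effect of `.codes` concrete, here is a minimal sketch on the same toy frame. It assumes you want to keep the labels around, so it holds on to the `Categorical` object; the names `cat` and `code_to_label` are illustrative and not part of the original answer.

```python
import pandas as pd

dataframe = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                          'col2': list('abcab'),
                          'col3': list('ababb')})

# Build the Categorical once so both the codes and the labels stay available.
cat = pd.Categorical(dataframe['col3'])

# Overwrite the column in place with its integer codes (no extra column needed).
dataframe['col3'] = cat.codes

# cat.categories[i] is the label that code i stands for.
code_to_label = dict(enumerate(cat.categories))

print(dataframe['col3'].tolist())
print(code_to_label)
```

Categories are sorted for plain string data, so the codes here follow alphabetical order of the labels rather than order of appearance.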
    

    If you also need the mapping between integer codes and labels, factorize is an even better way to do the same thing:

    dataframe.col3, mapping_index = pd.Series(dataframe.col3).factorize()
    

    Check the result:

    print(dataframe)
    print(mapping_index.get_loc("b"))  # label -> integer code
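
As a sketch of the factorize round trip described above, the example below encodes `col2` of the same toy frame (chosen here because it contains the label "c") and then maps codes back to labels. The variable names `label_to_code`, `code_to_label` and `restored` are illustrative only.

```python
import pandas as pd

dataframe = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                          'col2': list('abcab'),
                          'col3': list('ababb')})

# factorize returns (codes, uniques); codes are assigned in order of first appearance.
codes, mapping_index = dataframe['col2'].factorize()
dataframe['col2'] = codes

# Label -> code: position of the label inside the uniques Index.
label_to_code = mapping_index.get_loc('c')

# Code -> label: plain positional indexing into the uniques Index.
code_to_label = mapping_index[2]

# Recover the full original column by indexing the uniques with the codes.
restored = mapping_index[codes]

print(dataframe)
print(label_to_code, code_to_label, list(restored))
```

Unlike `pd.Categorical(...).codes`, factorize numbers the labels in order of first appearance unless `sort=True` is passed, so the two approaches can assign different codes to the same data.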
    
