How to generate pandas DataFrame column of Categorical from string column?

臣服心动 2021-01-05 00:41

I can convert a pandas string column to a Categorical, but when I try to insert it as a new DataFrame column, it seems to get converted right back to a Series of str:

2 Answers
  •  梦毁少年i
    2021-01-05 01:25

    The only workaround for pandas pre-0.15 I found is as follows:

    • The column must be converted to a Categorical for the classifier, but NumPy will immediately coerce the levels back to int, losing the factor information.
    • So store the factor in a global variable outside the DataFrame, and insert only its integer labels into the DataFrame.


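    # pandas pre-0.15 API: build the factor and keep it outside the DataFrame
    train_LocationNFactor = pd.Categorical.from_array(train['LocationNormalized'])  # default order: alphabetical

    # insert only the integer labels into the DataFrame
    train['LocationNFactor'] = train_LocationNFactor.labels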
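
The same idea can be sketched with the attribute names used since pandas 0.15, where `.labels` was renamed to `.codes` and `.levels` to `.categories`. This is only an illustration, not the answerer's original code: the sample data and the names `location_factor` and `recovered` are hypothetical, and only the column names come from the snippet above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
train = pd.DataFrame({"LocationNormalized": ["London", "Leeds", "London", "Bath"]})

# Keep the Categorical outside the DataFrame, as in the workaround.
# Categories are sorted by default, so here they are Bath, Leeds, London.
location_factor = pd.Categorical(train["LocationNormalized"])

# Store only the integer codes in the DataFrame.
train["LocationNFactor"] = location_factor.codes

# Map the stored codes back to the original strings via the categories.
recovered = location_factor.categories[train["LocationNFactor"].to_numpy()]
```

Because the DataFrame holds only integers, the externally stored `location_factor` is the sole record of which code means which location, so it has to be kept alongside the data.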
    

    [UPDATE: pandas 0.15+ added decent support for Categorical]
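
With pandas 0.15 or later, the workaround should no longer be needed, because a DataFrame column can hold the categorical dtype directly. A minimal sketch, assuming a reasonably recent pandas and using hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({"LocationNormalized": ["London", "Leeds", "London", "Bath"]})

# Convert the string column and store it as a categorical column.
df["LocationNFactor"] = df["LocationNormalized"].astype("category")

# The column keeps its categorical dtype inside the DataFrame.
is_categorical = isinstance(df["LocationNFactor"].dtype, pd.CategoricalDtype)

# Integer codes and category labels are available through the .cat accessor.
codes = df["LocationNFactor"].cat.codes
categories = list(df["LocationNFactor"].cat.categories)
```

The `.cat.codes` values can be passed to a classifier that expects integers, while the column itself retains the labels.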
