RandomForestClassifier.fit(): ValueError: could not convert string to float

礼貌的吻别 2020-12-23 09:16

Given is a simple CSV file:

A,B,C
Hello,Hi,0
Hola,Bueno,1

Obviously the real dataset is far more complex than this, but this one reproduces the error.

8 answers
  •  攒了一身酷
    2020-12-23 09:40

    I had a similar issue and found that pandas.get_dummies() solved the problem. Specifically, it one-hot encodes categorical data: each string column is split into a set of indicator columns, one new column for each unique value in that input column, while numeric columns are left unchanged. In your case, you would replace train_x = test[cols] with:

    train_x = pandas.get_dummies(test[cols])
    

    This transforms the train_x DataFrame into the following form, which RandomForestClassifier can accept:

       C  A_Hello  A_Hola  B_Bueno  B_Hi
    0  0        1       0        0     1
    1  1        0       1        1     0
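
For reference, here is a minimal, self-contained sketch of that transformation on the two-row CSV from the question. It reads the data from an in-memory string rather than a file, and the names `csv_text`, `df`, and `encoded` are placeholders for illustration, not names from the original code. It also passes `dtype=int` as an assumption about what you want to see: recent pandas versions return `True`/`False` indicator columns by default, and `dtype=int` gives 0/1 values like the table above.

```python
import io

import pandas as pd

# The sample CSV from the question, held in memory instead of a file.
csv_text = "A,B,C\nHello,Hi,0\nHola,Bueno,1\n"
df = pd.read_csv(io.StringIO(csv_text))

# One-hot encode the string columns A and B; the numeric column C is
# passed through unchanged. dtype=int is an assumption made here so the
# indicator columns hold 0/1 instead of booleans.
encoded = pd.get_dummies(df, dtype=int)

print(encoded)
```

Every column in `encoded` is numeric, so it no longer contains the strings that caused the "could not convert string to float" error. One thing to keep in mind: if you encode training and test data separately, the two frames may end up with different columns when a category appears in only one of them.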
    
