Linear regression analysis with string/categorical features (variables)?

面向向阳花 2020-11-30 18:43

Regression algorithms seem to be working on features represented as numbers. For example:

This data set doesn't contain categorical features/variables. It

4 answers
  •  感动是毒
    2020-11-30 19:14

    One way to run a regression with categorical independent variables is, as mentioned above, to encode them. Another way is to use an R-style statistical formula with the statsmodels library. Here is a code snippet:

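    import seaborn as sns
    from statsmodels.formula.api import ols

    # Load the example "tips" dataset that seaborn provides
    tips = sns.load_dataset("tips")

    # C(...) marks a column as categorical, so it is dummy-coded automatically
    model = ols('tip ~ total_bill + C(sex) + C(day) + size', data=tips)
    fitted_model = model.fit()
    print(fitted_model.summary())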
    

    Dataset

       total_bill   tip     sex smoker  day    time  size
    0       16.99  1.01  Female     No  Sun  Dinner     2
    1       10.34  1.66    Male     No  Sun  Dinner     3
    2       21.01  3.50    Male     No  Sun  Dinner     3
    3       23.68  3.31    Male     No  Sun  Dinner     2
    4       24.59  3.61  Female     No  Sun  Dinner     4
    

    Summary of regression
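
For comparison, the encoding route mentioned at the start of this answer might look roughly like the sketch below. It is only an illustration, not the method from the snippet above: it assumes pandas and NumPy are available, uses just the five preview rows from the dataset listing (far too few for a meaningful fit), and the names `tips_head`, `X`, and `coefficients` are made up for this example.

```python
import numpy as np
import pandas as pd

# The five preview rows from the dataset listing (illustration only).
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "size": [2, 3, 3, 2, 4],
})

# One-hot encode the string column. drop_first=True drops one level
# ("Female" here) so the dummies are not collinear with the intercept.
X = pd.get_dummies(
    tips_head[["total_bill", "sex", "size"]],
    columns=["sex"],
    drop_first=True,
    dtype=float,
)
X.insert(0, "intercept", 1.0)
y = tips_head["tip"].to_numpy()

# Ordinary least squares on the now fully numeric design matrix.
coef, *_ = np.linalg.lstsq(X.to_numpy(dtype=float), y, rcond=None)
coefficients = dict(zip(X.columns, coef))
print(coefficients)
```

The `sex_Male` coefficient is then read as the shift in predicted tip relative to the dropped `Female` baseline. The formula interface does the same dummy coding behind the scenes for every `C(...)` term.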
