Can't do linear regression in scikit-learn due to “reshaping” issue

故里飘歌 2021-01-06 15:36

I have a simple CSV with two columns:

  1. ErrorWeek (a number for the week number in the year)
  2. ErrorCount (for the number of errors in a given week)
4 Answers
  •  长情又很酷
    2021-01-06 16:06

    sklearn expects x to be two-dimensional, with shape (n_samples, n_features), because from a one-dimensional input it cannot distinguish between a single feature with n samples and n features with one sample. A pandas.core.frame.DataFrame satisfies this requirement. y, by contrast, can be a single column, that is, a pandas.core.series.Series. Therefore, in your example, you should select x as a pandas.core.frame.DataFrame.
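    To make the difference concrete, here is a minimal sketch of how the two selection styles change the shape. The DataFrame contents are made-up values standing in for the CSV from the question, and the reshape line is an alternative I am assuming is acceptable here, not something the original answer proposes.

```python
import pandas as pd

# Hypothetical data standing in for the CSV described in the question.
df = pd.DataFrame({"ErrorWeek": [1, 2, 3, 4], "ErrorCount": [10, 12, 9, 15]})

series_x = df["ErrorWeek"]    # single brackets: 1-D Series
frame_x = df[["ErrorWeek"]]   # double brackets: 2-D DataFrame with one column

# Alternative (assumption): turn the 1-D values into a 2-D NumPy column.
reshaped_x = df["ErrorWeek"].to_numpy().reshape(-1, 1)

print(series_x.shape)    # (4,)
print(frame_x.shape)     # (4, 1)
print(reshaped_x.shape)  # (4, 1)
```

    Either of the two-dimensional forms should be accepted as x by the estimator; the Series form is the one that triggers the reshaping complaint.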

    As already pointed out by @MaxU:

    from sklearn.model_selection import train_test_split

    x = df[['ErrorWeek']]   # double brackets: 2-D DataFrame
    y = df['ErrorCount']    # single brackets: 1-D Series
    X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)
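    For completeness, a rough end-to-end sketch of the split followed by the actual fit. The data below is synthetic (a perfectly linear ErrorCount = 2 * ErrorWeek + 3, invented purely for illustration), and the use of LinearRegression is my assumption based on the question title.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV: 12 weeks, ErrorCount = 2 * ErrorWeek + 3.
weeks = list(range(1, 13))
df = pd.DataFrame({"ErrorWeek": weeks, "ErrorCount": [2 * w + 3 for w in weeks]})

x = df[["ErrorWeek"]]   # 2-D: shape (12, 1)
y = df["ErrorCount"]    # 1-D: shape (12,)

# Default test_size is 0.25, so 9 rows go to training and 3 to testing.
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)          # no reshape error: X_train is 2-D
predictions = model.predict(X_test)  # one prediction per test row

# On this noise-free data the fit should recover slope 2 and intercept 3
# up to floating-point error.
print(model.coef_, model.intercept_)
```

    With real data the coefficients will of course not be exact; the point is only that fit and predict accept the DataFrame-shaped x without any reshaping.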
    
