python linear regression predict by date

旧时难觅i 2021-01-01 22:34

I want to predict a value at a future date with simple linear regression, but I can't because of the date format.

This is the dataframe I have:



        
5 Answers
  •  盖世英雄少女心
    2021-01-01 22:37

    It is really important to differentiate the data types that you want to use for regression/classification.

    Time series modelling is a separate case. If you want to use the date as a numerical input, convert it from datetime to float. If data_df['conv_date'] is not already a datetime column, convert it first with data_df['conv_date'] = pd.to_datetime(data_df.date, format="%Y-%m-%d").
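The format codes are case-sensitive, which is easy to get wrong. As a small illustration using only the standard library (pandas accepts the same strftime-style codes), the sketch below parses a hypothetical date string with the month code `%m` and then with the minute code `%M`:

```python
from datetime import datetime

text = "2020-06-01"  # hypothetical date string, not taken from the question

# Correct: %m is the zero-padded month, %d the day of the month.
good = datetime.strptime(text, "%Y-%m-%d")

# Mistake: %M means minutes, so "06" lands in the minute field and the
# month silently falls back to its default of 1 (January).
bad = datetime.strptime(text, "%Y-%M-%d")

print(good)  # 2020-06-01 00:00:00
print(bad)   # 2020-01-01 00:06:00
```

No error is raised in the second case, so the wrong month would only show up later as a poor fit.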

    I agree with Thomas Vetterli's answer: be careful about what kind of time data you are using.

    If you only need the calendar date (no time of day), datetime.datetime.toordinal is enough:

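    >>> import datetime
    >>> import pandas as pd
    >>> # lowercase %m = month, %d = day (uppercase %M means minutes)
    >>> data_df['conv_date'] = pd.to_datetime(data_df.date, format="%Y-%m-%d")
    >>> data_df['conv_date'] = data_df['conv_date'].map(datetime.datetime.toordinal)
    737577  # example value: the ordinal of 2020-06-01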
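Once the dates are ordinals, the regression itself is ordinary least squares on one numeric feature. The following is a minimal sketch in plain Python, standing in for whatever regression library you use; the observations, `fit_line` and `predict` are hypothetical names and data made up for illustration, not taken from the question.

```python
from datetime import date


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: the value grows by 2.0 per day.
observations = [
    (date(2020, 6, 1), 10.0),
    (date(2020, 6, 2), 12.0),
    (date(2020, 6, 3), 14.0),
    (date(2020, 6, 4), 16.0),
]

# Convert each date to its ordinal so it can be used as a number.
xs = [d.toordinal() for d, _ in observations]
ys = [value for _, value in observations]
slope, intercept = fit_line(xs, ys)


def predict(day):
    """Predict the value at a date by converting it the same way."""
    return slope * day.toordinal() + intercept


print(predict(date(2020, 6, 10)))  # 28.0 for this perfectly linear data
```

The key point is that the future date must go through the same conversion as the training dates before it is passed to the model.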
    

    But if you also want to use the hour, minute and second information, time.mktime() is a better fit:

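    >>> import time
    >>> # no date-only format here, so the time-of-day part of the strings is kept
    >>> data_df['conv_date'] = pd.to_datetime(data_df.date)
    >>> data_df['conv_date'] = data_df['conv_date'].apply(lambda var: time.mktime(var.timetuple()))
    1591016041.0  # example value, in seconds since the epoch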
    

    1591016044.0 is another example output from my data; the value changes with every second.
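One caveat: time.mktime() interprets the time tuple in the local timezone of the machine running the code, so the same data can produce different floats on different machines. If that matters, a possible alternative is to treat the naive timestamps as UTC. This is only a sketch under that assumption, and `stamp` is a hypothetical value:

```python
import calendar
from datetime import datetime, timezone

stamp = datetime(2020, 6, 1, 12, 54, 1)  # hypothetical naive timestamp

# Option 1: calendar.timegm reads the time tuple as UTC (returns an int).
seconds_a = calendar.timegm(stamp.timetuple())

# Option 2: attach UTC explicitly, then ask for the POSIX timestamp (a float).
seconds_b = stamp.replace(tzinfo=timezone.utc).timestamp()

print(seconds_a, seconds_b)  # 1591016041 1591016041.0
```

Both options give the same number regardless of the machine's timezone setting. For a linear fit the constant offset between timezones does not change the slope, but it does matter if models or features are shared between machines.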
