I need to create a Python function that forecasts time-series data using a linear regression model and returns confidence bands:
The function needs to take an a
scikit-learn is a great module for this in Python.
The X and Y values must be split into two arrays of equal length, so that the value at a given index in one array pairs with the value at the same index in the other, just as they would if you plotted them as (X, Y) points.
So I guess in your time series data you would separate the X and Y values as follows:
null = None
time_series = [{"target": "average", "datapoints": [[null, 1435688679], [34.870499801635745, 1435688694], [null, 1435688709], [null, 1435688724], [null, 1435688739], [null, 1435688754], [null, 1435688769], [null, 1435688784], [null, 1435688799], [null, 1435688814], [null, 1435688829], [null, 1435688844], [null, 1435688859], [null, 1435688874], [null, 1435688889], [null, 1435688904], [null, 1435688919], [null, 1435688934], [null, 1435688949], [null, 1435688964], [null, 1435688979], [38.180000209808348, 1435688994], [null, 1435689009], [null, 1435689024], [null, 1435689039], [null, 1435689054], [null, 1435689069], [null, 1435689084], [null, 1435689099], [null, 1435689114], [null, 1435689129], [null, 1435689144], [null, 1435689159], [null, 1435689174], [null, 1435689189], [null, 1435689204], [null, 1435689219], [null, 1435689234], [null, 1435689249], [null, 1435689264], [null, 1435689279], [30.79849989414215, 1435689294], [null, 1435689309], [null, 1435689324], [null, 1435689339], [null, 1435689354], [null, 1435689369], [null, 1435689384], [null, 1435689399], [null, 1435689414], [null, 1435689429], [null, 1435689444], [null, 1435689459], [null, 1435689474], [null, 1435689489], [null, 1435689504], [null, 1435689519], [null, 1435689534], [null, 1435689549], [null, 1435689564]]}]
# assuming the timestamp is the X-axis value
input_X_vars = []
input_Y_vars = []
for pair in time_series[0]["datapoints"]:
    # skip points where either the value or the timestamp is missing
    if pair[0] is not None and pair[1] is not None:
        input_X_vars.append(pair[1])
        input_Y_vars.append(pair[0])
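import numpy as np
from sklearn import linear_model

# scikit-learn expects X as a 2-D array of shape (n_samples, n_features)
X = np.array(input_X_vars).reshape(-1, 1)
y = np.array(input_Y_vars)
regr = linear_model.LinearRegression()
regr.fit(X, y)

# timestamps to forecast, reshaped the same way as the training X
test_X_vars = [1435688681, 1435688685]
results = regr.predict(np.array(test_X_vars).reshape(-1, 1))

# store the forecast in the same [value, timestamp] layout as the input
forecast_append = {"target": "Lower", "datapoints": [[float(v), t] for v, t in zip(results, test_X_vars)]}
time_series.append(forecast_append)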
I set null = None because Python has no null keyword: a bare null in the pasted data is parsed as an ordinary variable name, so it has to be defined before the list is evaluated.
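The question also asks for confidence bands, which LinearRegression does not return on its own. Below is a minimal sketch of one way to get them, assuming a single X variable and the textbook prediction-interval formula for simple least squares. The function name forecast_with_bands and the t_crit parameter are my own, not part of scikit-learn; t_crit is the Student-t critical value for n - 2 degrees of freedom, which you would look up yourself (for example from a t-table).

```python
import math


def forecast_with_bands(xs, ys, new_xs, t_crit):
    """Fit y = intercept + slope * x by ordinary least squares and return a
    list of (lower, predicted, upper) tuples, one per value in new_xs.

    t_crit is the Student-t critical value for n - 2 degrees of freedom,
    supplied by the caller (hypothetical parameter, not a library API).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 3:
        raise ValueError("need at least 3 points to estimate the residual spread")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx

    def predict(x):
        # Centered form avoids precision loss with large timestamps.
        return y_mean + slope * (x - x_mean)

    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))

    bands = []
    for x0 in new_xs:
        y0 = predict(x0)
        # Prediction interval for a single new observation at x0.
        half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_mean) ** 2 / sxx)
        bands.append((y0 - half_width, y0, y0 + half_width))
    return bands
```

With the lists built above you could call forecast_with_bands(input_X_vars, input_Y_vars, test_X_vars, t_crit=12.706), where 12.706 is the two-sided 95% value for 1 degree of freedom (the sample has only 3 non-null points). With so few points the bands will be very wide, and the lower and upper values can each be appended to time_series as their own "target" entries.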