time-series

How to convert 'Sat Feb 02 12:50:00 IST 2019' to a regular datetime in Python?

与世无争的帅哥 posted on 2020-06-29 03:41:07
Question: I am trying to convert a column of my dataframe that holds values like 'Sat Feb 02 12:50:00 IST 2019' to a regular datetime format (i.e. 2019-05-02 12:00:00) in Python. How do I convert all the rows to this format?
Answer 1: Assuming you don't need your Python datetime object to be timezone-aware, you can just use strptime as follows:
    from datetime import datetime
    dt = "Sat Feb 02 12:50:00 IST 2019"
    out = datetime.strptime(dt, "%a %b %d %H:%M:%S IST %Y")
    print(out)
This prints: 2019-02-02 12:50:00
Source: https://stackoverflow.com/questions/62370051/how
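The answer handles a single string. A minimal sketch of applying the same format to every value, with an optional timezone-aware variant, might look like the following. The helper name `parse_ist`, the sample values, and the reading of "IST" as India Standard Time (UTC+05:30) are assumptions, not part of the original answer.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "IST" means India Standard Time (UTC+05:30). The abbreviation is
# ambiguous (it is also used for Irish and Israel time), so adjust if needed.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")
FORMAT = "%a %b %d %H:%M:%S IST %Y"  # "IST" is matched as literal text


def parse_ist(text, tz_aware=False):
    """Parse one string such as 'Sat Feb 02 12:50:00 IST 2019'."""
    parsed = datetime.strptime(text, FORMAT)
    return parsed.replace(tzinfo=IST) if tz_aware else parsed


# Hypothetical values standing in for the dataframe column.
raw_values = ["Sat Feb 02 12:50:00 IST 2019", "Sun Feb 03 08:05:30 IST 2019"]
converted = [parse_ist(v) for v in raw_values]
formatted = [d.strftime("%Y-%m-%d %H:%M:%S") for d in converted]
print(formatted)
```

For a pandas column, passing the same format string to `pd.to_datetime(..., format=FORMAT)` should convert all rows in one call.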

Neural network in R for predictions 1 day, 2 days and 3 days ahead

ε祈祈猫儿з posted on 2020-06-29 03:39:16
Question: I am using ann in R. I have daily time-series data with 6400 rows, 3 input variables (A, B, C) and 2 output variables (D, E). I can predict D and E from the inputs A, B and C. This is what I have tried:
    data <- data.frame(A, B, C, D, E)
    index <- 1:5844
    datatrain = data[index, ]
    datatest = data[-index, ]
    max = apply(data, 2, max)
    min = apply(data, 2, min)
    scaled = as.data.frame(scale(data, center = min, scale = max - min))
    train = scaled[index, ]
    test = scaled[-index, ]
    NN
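The excerpt is cut off before the network is defined, so the asker's approach to the 1-, 2- and 3-day horizons is unknown. One common way to set the problem up is to train a separate model per horizon, pairing the inputs on day t with the outputs on day t + h. The Python sketch below only builds those pairs; the function name and toy data are hypothetical.

```python
def make_horizon_pairs(inputs, outputs, horizon):
    """Pair inputs observed on day t with outputs observed on day t + horizon."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # Dropping the last `horizon` inputs and the first `horizon` outputs
    # aligns day t with day t + horizon.
    return list(zip(inputs[:-horizon], outputs[horizon:]))


# Hypothetical toy data: one (A, B, C) tuple and one (D, E) tuple per day.
inputs = [(day, day * 2, day * 3) for day in range(6)]
outputs = [(day * 10, day * 100) for day in range(6)]

pairs_by_horizon = {h: make_horizon_pairs(inputs, outputs, h) for h in (1, 2, 3)}
for h, pairs in pairs_by_horizon.items():
    print(h, len(pairs), pairs[0])
```

Each horizon loses its last h days of inputs, so the training sets differ slightly in length.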

Rolling window function for irregular time series that can handle duplicates

耗尽温柔 posted on 2020-06-27 14:11:10
Question: I have the following data.frame:
        grp  nr   yr
     1:   A 1.0 2009
     2:   A 2.0 2009
     3:   A 1.5 2009
     4:   A 1.0 2010
     5:   B 3.0 2009
     6:   B 2.0 2010
     7:   B  NA 2011
     8:   C 3.0 2014
     9:   C 3.0 2019
    10:   C 3.0 2020
    11:   C 4.0 2021
Desired output:
       grp  nr   yr nr_roll_period_3
    1    A 1.0 2009               NA
    2    A 2.0 2009               NA
    3    A 1.5 2009               NA
    4    A 1.0 2010               NA
    5    B 3.0 2009               NA
    6    B 2.0 2010               NA
    7    B  NA 2011               NA
    8    C 3.0 2014               NA
    9    C 3.0 2019               NA
    10   C 3.0 2020               NA
    11   C 4.0 2021         3.333333
The logic: I want to calculate a rolling mean for the period of length k
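The statement of the logic is cut off, so the rule below is inferred from the desired output only: for each row, take all rows of the same group whose year falls in the k-year window ending at that row's year, and return their mean only if every year of the window is present and no value is missing. This Python sketch (the function name is hypothetical) reproduces the table above under that assumption.

```python
def rolling_period_mean(rows, k):
    """rows: list of (grp, nr, yr) tuples; nr is None for a missing value."""
    results = []
    for grp, _, yr in rows:
        window_years = set(range(yr - k + 1, yr + 1))
        in_window = [(v, y) for g, v, y in rows if g == grp and y in window_years]
        years_present = {y for _, y in in_window}
        values = [v for v, _ in in_window]
        # Assumed rule: an incomplete window or any missing value gives NA.
        if years_present != window_years or any(v is None for v in values):
            results.append(None)
        else:
            results.append(sum(values) / len(values))
    return results


rows = [
    ("A", 1.0, 2009), ("A", 2.0, 2009), ("A", 1.5, 2009), ("A", 1.0, 2010),
    ("B", 3.0, 2009), ("B", 2.0, 2010), ("B", None, 2011),
    ("C", 3.0, 2014), ("C", 3.0, 2019), ("C", 3.0, 2020), ("C", 4.0, 2021),
]
nr_roll_period_3 = rolling_period_mean(rows, k=3)
print(nr_roll_period_3)
```

The scan is quadratic in the number of rows, which is fine for a sketch but would need grouping or sorting first on large data.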

Select every nth row as a Pandas DataFrame without reading the entire file

微笑、不失礼 posted on 2020-06-27 08:52:21
Question: I am reading a large file that contains ~9.5 million rows x 16 columns. I want a representative sample, and since the data is organized by time, I want to take every 500th element. I am able to load the data and then select every 500th row. My question: can I read every 500th element directly (using pd.read_csv() or some other method), without having to read everything first and then filter? Question 2: How would you approach this problem if the date
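One possible approach is the `skiprows` parameter of `pd.read_csv`, which accepts a callable that is evaluated on each file line index and skips the line when it returns True. The sketch below assumes the file has a single header line; the helper name and the in-memory CSV are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd


def read_every_nth_row(source, n):
    """Keep the header plus data rows 0, n, 2n, ... of a CSV source."""
    # File line 0 is the header; data row i sits on file line i + 1.
    return pd.read_csv(
        source,
        skiprows=lambda line: line > 0 and (line - 1) % n != 0,
    )


# Small in-memory stand-in for the large file.
csv_text = "t,value\n" + "".join(f"{i},{i * 10}\n" for i in range(20))
sample = read_every_nth_row(io.StringIO(csv_text), n=5)
print(sample)
```

The file is still scanned line by line, but the skipped rows are never parsed into the DataFrame, so memory use stays close to the size of the sample.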

Using Linear Regression for Yearly distributed Time Series Data to get predictions after -N- years

家住魔仙堡 posted on 2020-06-26 14:56:15
Question: I am stuck on an unusual problem. I have time-series data covering the years 2009 to 2018, and I have to answer a rather odd question with it. The data sheet contains the energy generation statistics of each Australian state/territory in GWh (gigawatt hours) for the years 2009 to 2018. It has the following fields:
    State: Names of the different Australian states.
    Fuel_Type: The type of fuel which is consumed.
    Category: Determines whether a fuel is
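The question itself is cut off, but the title asks for predictions N years past the data using linear regression. A minimal sketch of that idea fits a least-squares line to one yearly series and evaluates it at a future year. The function names and the generation figures below are made up for illustration and are not from the data sheet.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, year):
    return slope * year + intercept


years = list(range(2009, 2019))
# Made-up GWh totals for one state and fuel type, not real data.
generation_gwh = [100.0 + 5.0 * (y - 2009) for y in years]

slope, intercept = fit_line(years, generation_gwh)
n_years_ahead = 5
forecast = predict(slope, intercept, years[-1] + n_years_ahead)
print(round(forecast, 3))
```

With only ten yearly points, an extrapolation like this is sensitive to any single year, so it is best read as a trend line rather than a forecast with known accuracy.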

Temporal train-test split for forecasting

断了今生、忘了曾经 posted on 2020-06-17 14:48:08
Question: I know this may be a basic question, but I want to know whether I am using the train/test split correctly. Say I have data that ends in 2019, and I want to predict values for the next 5 years. The graph I produced is provided below: My training data covers 1996-2014 and my test data covers 2014-2019. The test data perfectly fits the training data. I then used this test data to make predictions for 2019-2024. Is this the correct way to do it, or should my predictions also be from 2014
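For reference, a chronological split keeps every training observation strictly earlier than every test observation. The sketch below assumes yearly records and puts each year on exactly one side of the cutoff, which differs from the question, where 2014 appears in both ranges. The function name and values are hypothetical.

```python
def temporal_split(records, first_test_year):
    """Split (year, value) records so that all training years precede the test years."""
    train = [(y, v) for y, v in records if y < first_test_year]
    test = [(y, v) for y, v in records if y >= first_test_year]
    return train, test


# Hypothetical yearly series covering 1996-2019.
records = [(year, float(year - 1996)) for year in range(1996, 2020)]
train, test = temporal_split(records, first_test_year=2015)

print(train[0][0], train[-1][0])  # first and last training year
print(test[0][0], test[-1][0])    # first and last test year
```

A common workflow is to judge the model on the held-out years and then refit it on the full 1996-2019 history before forecasting 2020-2024, so that the most recent observations are not left out of the final model.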

How to get a distance matrix using dynamic time warping?

隐身守侯 posted on 2020-06-17 09:32:08
Question: I have 6 time series as follows.
    import numpy as np
    series = np.array([
        [0., 0, 1, 2, 1, 0, 1, 0, 0],
        [0., 1, 2, 0, 0, 0, 0, 0, 0],
        [1., 2, 0, 0, 0, 0, 0, 1, 1],
        [0., 0, 1, 2, 1, 0, 1, 0, 0],
        [0., 1, 2, 0, 0, 0, 0, 0, 0],
        [1., 2, 0, 0, 0, 0, 0, 1, 1]])
Suppose I want the dynamic time warping distance matrix in order to perform clustering. I used the dtaidistance library for that as follows.
    from dtaidistance import dtw
    ds = dtw.distance_matrix_fast(series)
The output I got was as
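To see what such a matrix contains, here is a plain-Python sketch of textbook DTW with a squared-difference step cost and a square root at the end, filled in for every pair of series. It is written independently of dtaidistance, so its values are not guaranteed to match that library's output or defaults; the function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW: squared-difference step cost, square root of the total."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def dtw_distance_matrix(series):
    """Full symmetric matrix with zeros on the diagonal."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


series = [
    [0., 0, 1, 2, 1, 0, 1, 0, 0],
    [0., 1, 2, 0, 0, 0, 0, 0, 0],
    [1., 2, 0, 0, 0, 0, 0, 1, 1],
    [0., 0, 1, 2, 1, 0, 1, 0, 0],
    [0., 1, 2, 0, 0, 0, 0, 0, 0],
    [1., 2, 0, 0, 0, 0, 0, 1, 1],
]
dist = dtw_distance_matrix(series)
for row in dist:
    print([round(d, 3) for d in row])
```

Series 0 and 3, 1 and 4, and 2 and 5 are identical in the question's data, so those pairs have distance zero in this matrix.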

Convert an irregular time series of a data table with factors into a regular time series in R

风格不统一 posted on 2020-06-17 09:07:12
Question: I am trying to convert an irregular time series of a data table into a regular time series. My data looks like this:
    library(data.table)
    dtRes <- data.table(time = c(0.1, 0.8, 1, 2.3, 2.4, 4.8, 4.9),
                        abst = c(1, 1, 1, 0, 0, 3, 3),
                        farbe = as.factor(c("keine", "keine", "keine", "keine", "keine", "rot", "blau")),
                        gier = c(2.5, 2.5, 2.5, 0, 0, 3, 3),
                        goff = as.factor(c("haus", "maus", "toll", "maus", NA, "maus", "maus")),
                        huft = as.factor(c(NA, NA, NA, "wolle", "wolle", "holz", "holz")),
                        mode = c
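The excerpt ends before the target grid or the fill rule is stated. Because factor columns cannot be interpolated, one plausible rule is to carry the last observation forward onto a regular grid. The Python sketch below illustrates that rule on the `time` and `farbe` columns; the grid from 0 to 5 in steps of 1 and the function name are assumptions.

```python
from bisect import bisect_right


def regularize_locf(times, values, start, stop, step):
    """Sample (times, values) on a regular grid, carrying the last observation forward."""
    grid, sampled = [], []
    n_steps = int(round((stop - start) / step))
    for i in range(n_steps + 1):
        t = start + i * step
        # Index of the last observation at or before grid time t.
        idx = bisect_right(times, t) - 1
        grid.append(t)
        sampled.append(values[idx] if idx >= 0 else None)
    return grid, sampled


times = [0.1, 0.8, 1, 2.3, 2.4, 4.8, 4.9]
farbe = ["keine", "keine", "keine", "keine", "keine", "rot", "blau"]

# Assumed grid: 0, 1, ..., 5 (the question's desired grid is not shown).
grid, farbe_regular = regularize_locf(times, farbe, start=0, stop=5, step=1)
print(list(zip(grid, farbe_regular)))
```

Note that the "rot" observation at 4.8 never appears on this grid because "blau" at 4.9 replaces it before the next grid point; a finer step or a different aggregation rule would be needed to keep it.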