time-series

Drop dates based on condition in python

↘锁芯ラ submitted on 2020-06-01 06:57:05
Question: I'm trying to implement a condition where, if the count of incorrect values for a date is greater than 2 (2019-05-17 & 2019-05-20 in the example below), the complete date (all of its time blocks) is removed.

Input:

                     t_value  C/IC
2019-05-17 00:00:00  0        incorrect
2019-05-17 01:00:00  0        incorrect
2019-05-17 02:00:00  0        incorrect
2019-05-17 03:00:00  4        correct
2019-05-17 04:00:00  5        correct
2019-05-18 01:00:00  0        incorrect
2019-05-18 02:00:00  6        correct
2019-05-18 03:00:00  7        correct
2019-05-19 04:00:00  0        incorrect
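One way to express the condition above is sketched below in plain Python (standard library only, not pandas). The function name `drop_bad_dates` and the tuple layout of `rows` are assumptions made for illustration; the sample rows are the ones visible in the excerpt.

```python
from collections import Counter
from datetime import datetime

# Sample rows copied from the question: (timestamp, t_value, label).
rows = [
    ("2019-05-17 00:00:00", 0, "incorrect"),
    ("2019-05-17 01:00:00", 0, "incorrect"),
    ("2019-05-17 02:00:00", 0, "incorrect"),
    ("2019-05-17 03:00:00", 4, "correct"),
    ("2019-05-17 04:00:00", 5, "correct"),
    ("2019-05-18 01:00:00", 0, "incorrect"),
    ("2019-05-18 02:00:00", 6, "correct"),
    ("2019-05-18 03:00:00", 7, "correct"),
    ("2019-05-19 04:00:00", 0, "incorrect"),
]


def drop_bad_dates(rows, max_incorrect=2):
    """Drop every row of any date with more than max_incorrect 'incorrect' labels."""

    def day(ts):
        # Reduce a timestamp string to its calendar date.
        return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()

    # Count the 'incorrect' rows per calendar date.
    incorrect_per_day = Counter(
        day(ts) for ts, _, label in rows if label == "incorrect"
    )
    # Keep a row only if its whole date stays within the limit.
    return [r for r in rows if incorrect_per_day[day(r[0])] <= max_incorrect]


kept = drop_bad_dates(rows)
```

With the sample rows, 2019-05-17 has three incorrect values, so all five of its rows are dropped. In pandas the same idea would typically be a group-by on the calendar date followed by a per-group count of incorrect rows.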

R, ggplot: Change linetype within a series

女生的网名这么多〃 submitted on 2020-05-31 07:05:19
Question: I am using ggplot geom_smooth to plot turnover data of a customer group from the previous year against the current year (based on calendar weeks). As the last week is not complete, I would like to use a dashed linetype for the last week. However, I can't figure out how to do that. I can change the linetype for the entire plot or for an entire series, but not within a series (depending on the value of x). To keep it simple, let's just use the following example: set.seed(42) frame <- data.frame

R combining duplicate rows in a time series with different column types in a datatable

本秂侑毒 submitted on 2020-05-29 10:51:22
Question: This question builds on another question, "R combining duplicate rows by ID with different column types in a dataframe". I have a datatable with a column time and some other columns of different types (factors and numerics). Here is an example: dt <- data.table(time = c(1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4), abst = c(0, NA, 2, NA,

I want to simulate moving average process of order one MA(1) with varying sample size n, varying SD values and varying theta values

人盡茶涼 submitted on 2020-05-28 06:27:47
Question: I want to simulate some time series data with mean = 0 but with varying sample size n, SD values and theta values. Mathematically, a moving average process of order one, MA(1), is written as

$$x_t=\mu+\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}$$

$x_t$ is the MA(1) process
$\mu$ is the mean, which can be zero in my case (just like the intercept in a regression equation)
$\varepsilon_{t}$ is the error term
$\theta_{1}$ is a constant which needs to be specified (in my case, a varying number between -1 and +1).

Example: in simple regression equation of $x
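The MA(1) definition above can be simulated directly. The following is a minimal Python sketch, assuming Gaussian errors with standard deviation `sd` (the excerpt does not state the error distribution). The function name `simulate_ma1` and the parameter grids are hypothetical values chosen for illustration.

```python
import random


def simulate_ma1(n, theta, sd, mu=0.0, seed=None):
    """Simulate x_t = mu + eps_t + theta * eps_{t-1} for t = 1..n."""
    rng = random.Random(seed)
    # Draw n + 1 errors so that the first observation also has a lagged error.
    # Assumption: errors are i.i.d. normal with mean 0 and standard deviation sd.
    eps = [rng.gauss(0.0, sd) for _ in range(n + 1)]
    return [mu + eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]


# Hypothetical grids of sample sizes, SD values and theta values.
sample_sizes = [10, 50]
sds = [1.0, 3.0]
thetas = [-0.8, 0.5]

# One simulated series per (n, sd, theta) combination.
series = {
    (n, sd, theta): simulate_ma1(n, theta, sd, seed=42)
    for n in sample_sizes
    for sd in sds
    for theta in thetas
}
```

Each dictionary key identifies one parameter combination, so a single series can be looked up as, for example, `series[(10, 1.0, 0.5)]`.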

How to draw a frame on a matplotlib figure

前提是你 submitted on 2020-05-25 07:55:54
Question: I want to show the frame in this figure. I tried running the code below, but it didn't work:

    ax = self.canvas.figure.add_subplot(111)
    ax.spines['top'].set_visible(True)
    ax.spines['right'].set_visible(True)
    ax.spines['bottom'].set_visible(True)
    ax.spines['left'].set_visible(True)
    ax.plot(diff)

I also tried plt.tight_layout(), but it generates this error:

    > File "runme.py", line 54, in autocorr_function
    > plt.tight_layout() File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py",

pandas.DatetimeIndex frequency is None and can't be set

会有一股神秘感。 submitted on 2020-05-22 15:04:49
Question: I created a DatetimeIndex from a "date" column: sales.index = pd.DatetimeIndex(sales["date"]). Now the index looks as follows: DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-04', '2003-01-06', '2003-01-07', '2003-01-08', '2003-01-09', '2003-01-10', '2003-01-11', '2003-01-13', ... '2016-07-22', '2016-07-23', '2016-07-24', '2016-07-25', '2016-07-26', '2016-07-27', '2016-07-28', '2016-07-29', '2016-07-30', '2016-07-31'], dtype='datetime64[ns]', name='date', length=4393, freq=None) As you see

Writing a function to get the sums of columns C/D the last time columns A/B are a specific value?

戏子无情 submitted on 2020-05-17 08:45:28
Question: I have a datasheet with sports results. The rows look something like this, where column A is the home team, column B is the away team, column C is the home score, column D is the away score, and column E is the final result. There is also a date column, which I've left off here.

PIT  PHI  4  5  Away
PIT  BOS  3  5  Away
BOS  SJS  3  2  Home
SJS  PHI  1  1  Draw
PIT  SJS  3  2  Home
PHI  BOS  4  3  Home

What I would like to do is add two columns to this dataframe. The first should

Create And Maintain System State Based On Action

孤者浪人 submitted on 2020-05-17 06:45:27
Question: I have a data set like this. The field set is meant to represent the users currently in the system:

time  user  action  set
----------------------------------
1:00  A     walk    NaN
2:00  B     run     NaN
3:00  C     sit     NaN
4:00  D     enter   NaN
5:00  E     jump    NaN
6:00  X     leave   NaN
...

In order to achieve this, I need to:
1) Figure out the starting state
2) Any time a user has action == "enter", add them to the state
3) Any time a user has action == "leave", remove them from the state **(a user can enter and leave
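The three steps above can be sketched in plain Python as follows. The rule used for the starting state is an assumption, since the excerpt does not say how it should be derived: any user whose first recorded action is not "enter" is taken to have been in the system from the beginning. The function name `track_state` is hypothetical.

```python
# Sample events copied from the question: (time, user, action).
events = [
    ("1:00", "A", "walk"),
    ("2:00", "B", "run"),
    ("3:00", "C", "sit"),
    ("4:00", "D", "enter"),
    ("5:00", "E", "jump"),
    ("6:00", "X", "leave"),
]


def track_state(events):
    """Return (time, users_in_system) after each event, in event order."""
    # Step 1 (assumed rule): a user whose first event is not "enter"
    # must already have been present at the start.
    seen = set()
    state = set()
    for _, user, action in events:
        if user not in seen:
            seen.add(user)
            if action != "enter":
                state.add(user)

    # Steps 2 and 3: replay the events and update the state as users
    # enter and leave; any other action leaves the state unchanged.
    history = []
    for time, user, action in events:
        if action == "enter":
            state.add(user)
        elif action == "leave":
            state.discard(user)
        history.append((time, sorted(state)))
    return history


history = track_state(events)
```

Because the state is a set, repeated enter and leave events for the same user are handled without extra bookkeeping.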

Create a series of timestamps along milliseconds in R

荒凉一梦 submitted on 2020-05-16 20:33:05
Question: I have a start date of "2017-11-14 10:11:01 CET" and an end date of "2017-11-15 01:15:59 CET". I want to create timestamps 500 milliseconds apart between these start and end timestamps. E.g. I need a dataframe containing the following output:

timestamp
2017-11-14 10:11:01.000
2017-11-14 10:11:01.500
2017-11-14 10:11:02.000
2017-11-14 10:11:02.500
. . .
2017-11-15 01:15:59.000

Somehow, I managed to get the start and end date in milliseconds using the following code: start.date <- strptime("2017
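The desired sequence is sketched here in Python rather than R, purely to illustrate the logic. The function name `timestamp_range` is hypothetical, and the datetimes are treated as naive local times, so the CET time zone is not modelled.

```python
from datetime import datetime, timedelta


def timestamp_range(start, end, step):
    """Return all timestamps start, start + step, ... that do not exceed end."""
    # Floor division of two timedeltas gives the number of whole steps.
    n_steps = (end - start) // step
    return [start + i * step for i in range(n_steps + 1)]


start = datetime(2017, 11, 14, 10, 11, 1)
end = datetime(2017, 11, 15, 1, 15, 59)
stamps = timestamp_range(start, end, timedelta(milliseconds=500))

# Format with millisecond precision by trimming microseconds to three digits.
formatted = [ts.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3] for ts in stamps]
```

The list `formatted` starts with `2017-11-14 10:11:01.000` and `2017-11-14 10:11:01.500`, matching the layout requested in the question.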

Python & Pandas - Group by day and count for each day

岁酱吖の submitted on 2020-05-10 07:58:11
Question: I am new to pandas and for now I don't get how to arrange my time series. Take a look at it:

date & time of connection
19/06/2017 12:39
19/06/2017 12:40
19/06/2017 13:11
20/06/2017 12:02
20/06/2017 12:04
21/06/2017 09:32
21/06/2017 18:23
21/06/2017 18:51
21/06/2017 19:08
21/06/2017 19:50
22/06/2017 13:22
22/06/2017 13:41
22/06/2017 18:01
23/06/2017 16:18
23/06/2017 17:00
23/06/2017 19:25
23/06/2017 20:58
23/06/2017 21:03
23/06/2017 21:05

This is a sample of a dataset of 130k rows. I tried:
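Counting connections per day, as the title asks, can be sketched with the standard library alone. This is a plain-Python illustration rather than a pandas solution; the variable names are hypothetical and the timestamps are the sample shown in the question, assumed to be in day/month/year order.

```python
from collections import Counter
from datetime import datetime

# Sample connection timestamps copied from the question.
connections = [
    "19/06/2017 12:39", "19/06/2017 12:40", "19/06/2017 13:11",
    "20/06/2017 12:02", "20/06/2017 12:04",
    "21/06/2017 09:32", "21/06/2017 18:23", "21/06/2017 18:51",
    "21/06/2017 19:08", "21/06/2017 19:50",
    "22/06/2017 13:22", "22/06/2017 13:41", "22/06/2017 18:01",
    "23/06/2017 16:18", "23/06/2017 17:00", "23/06/2017 19:25",
    "23/06/2017 20:58", "23/06/2017 21:03", "23/06/2017 21:05",
]

# Parse each string (day first), keep only the calendar date, and count.
per_day = Counter(
    datetime.strptime(s, "%d/%m/%Y %H:%M").date() for s in connections
)

# Print the counts in chronological order.
for day in sorted(per_day):
    print(day.isoformat(), per_day[day])
```

In pandas, the equivalent would usually be to parse the column with `pd.to_datetime(..., dayfirst=True)` and then group by the date part and take the group sizes.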