zoo

Rolling window over irregular time series

冷眼眸甩不掉的悲伤 submitted on 2019-11-28 05:33:19
I have an irregular time series of events (posts) using xts, and I want to calculate the number of events that occur over a rolling weekly window (or biweekly, or 3-day, etc.). The data looks like this:

                        postid
    2010-08-04 22:28:07    867
    2010-08-04 23:31:12    891
    2010-08-04 23:58:05    901
    2010-08-05 08:35:50    991
    2010-08-05 13:28:02   1085
    2010-08-05 14:14:47   1114
    2010-08-05 14:21:46   1117
    2010-08-05 15:46:24   1151
    2010-08-05 16:25:29   1174
    2010-08-05 23:19:29   1268
    2010-08-06 12:15:42   1384
    2010-08-06 15:22:06   1403
    2010-08-07 10:25:49   1550
    2010-08-07 18:58:16   1596
    2010-08-07 21:15:44   1608

which should
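One possible way to get such counts, sketched in base R and not taken from the original thread: for each event, count the events whose timestamps fall inside the window that ends at that event. The UTC time zone, the half-open window (t - width, t], and the names `times` and `rolling_count` are assumptions made for illustration.

```r
# Timestamps from the excerpt above; the UTC time zone is an assumption.
times <- as.POSIXct(c(
  "2010-08-04 22:28:07", "2010-08-04 23:31:12", "2010-08-04 23:58:05",
  "2010-08-05 08:35:50", "2010-08-05 13:28:02", "2010-08-05 14:14:47",
  "2010-08-05 14:21:46", "2010-08-05 15:46:24", "2010-08-05 16:25:29",
  "2010-08-05 23:19:29", "2010-08-06 12:15:42", "2010-08-06 15:22:06",
  "2010-08-07 10:25:49", "2010-08-07 18:58:16", "2010-08-07 21:15:44"
), tz = "UTC")

# Hypothetical helper: number of events in the window (t - width, t] for each t.
# Assumes `times` is sorted in increasing order.
rolling_count <- function(times, window_secs) {
  secs <- as.numeric(times)
  # findInterval(x, v) returns how many elements of the sorted vector v are <= x
  findInterval(secs, secs) - findInterval(secs - window_secs, secs)
}

weekly <- rolling_count(times, 7 * 24 * 3600)  # 7-day window
daily  <- rolling_count(times, 24 * 3600)      # 1-day window
```

Changing `window_secs` gives the biweekly or 3-day variants mentioned in the question. Because the sample spans less than a week, the weekly count here is simply the running number of events.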

R convert between zoo object and data frame, results inconsistent for different numbers of columns?

送分小仙女 submitted on 2019-11-27 19:11:30
I have difficulty switching between data frames and zoo objects, particularly keeping meaningful column names, and with inconsistencies between the univariate and multivariate cases:

    library(zoo)
    # sample data, two species counts over time
    t = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04"))
    n1 = c(4, 5, 9, 7)   # counts of Lepisma saccharina
    n2 = c(2, 6, 0, 11)  # counts of Thermobia domestica
    df = data.frame(t, n1, n2)
    colnames(df) <- c("Date", "Lepisma saccharina", "Thermobia domestica")
    # converting to zoo loses column names in univariate case...
    > z1 <- read.zoo(df[,1:2]) #time series

R: adding 1 month to a date

血红的双手。 submitted on 2019-11-27 17:52:40
Question: I want to get the date sequence between a startDate and an endDate by adding 1 month to the startDate, i.e., if startDate is 2013-01-31 and endDate is 2013-07-31, I would prefer to see dates like this:

    "2013-01-31" "2013-02-28" "2013-03-31" "2013-04-30" "2013-05-31" "2013-06-30" "2013-07-31"

I have tried seq.Date(as.Date("2013-01-31"),by="month",length.out=7), but the output of this code is:

    > seq.Date(as.Date("2013-01-31"),by="month",length.out=7)
    [1] "2013-01-31" "2013-03-03" "2013-03
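A minimal base R sketch of one way to obtain month-end dates, not taken from the original thread: build a sequence of first-of-month dates, which `seq` handles without overflow, and step back one day. It assumes the start date is itself the last day of its month; the names `start_date`, `first_of_month`, and `month_ends` are illustrative.

```r
start_date <- as.Date("2013-01-31")  # assumed to be a month end
n_months   <- 7

# First day of the start month, e.g. 2013-01-01
first_of_month <- as.Date(format(start_date, "%Y-%m-01"))

# First days of the following n_months months, minus one day each,
# which yields the last day of each month from the start month onward
month_ends <- seq(first_of_month, by = "month", length.out = n_months + 1)[-1] - 1

print(month_ends)
```

Stepping from the first of each month avoids the rollover seen with `seq.Date` on the 31st, because every month has a first day.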

Add missing xts/zoo data with linear interpolation in R

半城伤御伤魂 submitted on 2019-11-27 16:38:29
Question: I have problems with missing data, but I do not have NAs; otherwise it would be easier to handle. My data looks like this:

    time, value
    2012-11-30 10:28:00, 12.9
    2012-11-30 10:29:00, 5.5
    2012-11-30 10:30:00, 5.5
    2012-11-30 10:31:00, 5.5
    2012-11-30 10:32:00, 9
    2012-11-30 10:35:00, 9
    2012-11-30 10:36:00, 14.4
    2012-11-30 10:38:00, 12.6

As you can see, some minute values are missing. It is xts/zoo, so I use as.POSIXct... to set the date as an index. How to add the missing timesteps to
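A sketch of one common zoo idiom for this situation, offered as an illustration and not as the thread's accepted answer: merge the series with an empty zoo object on a regular one-minute grid so the gaps become explicit NAs, then interpolate linearly with `na.approx`. The UTC time zone and the names `z`, `grid`, and `filled` are assumptions.

```r
library(zoo)

# Data from the excerpt above; the UTC time zone is an assumption.
times <- as.POSIXct(c(
  "2012-11-30 10:28:00", "2012-11-30 10:29:00", "2012-11-30 10:30:00",
  "2012-11-30 10:31:00", "2012-11-30 10:32:00", "2012-11-30 10:35:00",
  "2012-11-30 10:36:00", "2012-11-30 10:38:00"
), tz = "UTC")
z <- zoo(c(12.9, 5.5, 5.5, 5.5, 9, 9, 14.4, 12.6), times)

# Regular one-minute grid covering the whole series
grid <- seq(start(z), end(z), by = "1 min")

# Merging with a zero-width zoo object inserts NA at the missing minutes;
# na.approx then fills them by linear interpolation over the time index
filled <- na.approx(merge(z, zoo(, grid)))

print(filled)
```

The same pattern should carry over to xts objects, since xts builds on zoo, though that is not shown here.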

Mahout from getting started to giving up: Installation (1)

試著忘記壹切 submitted on 2019-11-27 16:32:14
1. A muddled download. My cluster runs Hadoop 2.7.3. I wanted to find the matching Mahout version but could not find one. To be on the safe side, since the latest Mahout version is 0.14.0, I went back one version and used 0.13.0. Mahout address

2. After installation. After a round of steps (extracting to D:\Zoo and configuring the environment variables), running it reports an error!!!

    D:\Zoo\apache-mahout-distribution-0.13.0\bin>mahout
    "===============DEPRECATION WARNING==============="
    "This script is no longer supported for new drivers as of Mahout 0.10.0"
    "Mahout's bash script is supported and if someone wants to contribute a fix for this"
    "it would be appreciated."
    "Mahout home set D:\Zoo\mahout-0.14.0"
    "ERROR: Could not find mahout-examples-*.job in D:\Zoo\mahout-0.14.0 or D:\Zoo\mahout-0.14.0/examples/target, please run 'mvn install

Compute rolling sum by id variables, with missing timepoints

半城伤御伤魂 submitted on 2019-11-27 12:38:49
Question: I'm trying to learn R, and there are a few things I've done for 10+ years in SAS that I cannot quite figure out the best way to do in R. Take this data:

    id class t          count desired
    -- ----- ---------- ----- -------
    1  A     2010-01-15 1     1
    1  A     2010-02-15 2     3
    1  B     2010-04-15 3     3
    1  B     2010-09-15 4     4
    2  A     2010-01-15 5     5
    2  B     2010-06-15 6     6
    2  B     2010-08-15 7     13
    2  B     2010-09-15 8     21

I want to calculate the column desired as a rolling sum by id and class, within a 4-month rolling window. Notice that not all
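A straightforward base R sketch that reproduces the `desired` column for this sample, given as an illustration and not as the thread's answer: for each row, sum `count` over rows with the same id and class whose date lies in the window ending at that row's date. The window boundary (strictly after the date 4 months earlier, up to and including the current date) is an assumption, since the excerpt is cut off before it states the rule; `months_back` and `rolling` are hypothetical names.

```r
df <- data.frame(
  id    = c(1, 1, 1, 1, 2, 2, 2, 2),
  class = c("A", "A", "B", "B", "A", "B", "B", "B"),
  t     = as.Date(c("2010-01-15", "2010-02-15", "2010-04-15", "2010-09-15",
                    "2010-01-15", "2010-06-15", "2010-08-15", "2010-09-15")),
  count = 1:8
)

# Hypothetical helper: the date n calendar months before d
months_back <- function(d, n) {
  seq(d, by = paste0("-", n, " months"), length.out = 2)[2]
}

# For each row, sum counts in the same (id, class) group within (t - 4 months, t]
df$rolling <- sapply(seq_len(nrow(df)), function(i) {
  lower <- months_back(df$t[i], 4)
  in_window <- df$id == df$id[i] & df$class == df$class[i] &
    df$t > lower & df$t <= df$t[i]
  sum(df$count[in_window])
})

print(df)
```

This compares every row against every other row, which is fine for small data; larger tables would likely call for a join-based or sorted-window approach.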

Date format for plotting x axis ticks of time series data

岁酱吖の submitted on 2019-11-27 11:49:25
Question: The data files have dates in the format 1975M1, 1975M2, ... 2011M12 for time series data. When plotting this data using R, I want the x-axis to display the months on the tick marks. For the dates to be read properly, I have tried replacing the M with - to get the %Y-%m format, but that does not seem to work for drawTimeAxis from the hydroTSM package, which perhaps requires the %Y-%m-%d format. It gives an error about an incorrect number of dimensions for the ticks. Another method of parsing and formatting the data

Fill NA in a time series only to a limited number

梦想与她 submitted on 2019-11-27 08:56:21
Is there a way to fill NAs forward in a zoo or xts object, but only up to a limited number of consecutive NAs? In other words, fill up to 3 consecutive NAs, and then keep the NAs from the 4th value on until the next valid number. Something like this:

    library(zoo)
    x <- zoo(1:20, Sys.Date() + 1:20)
    x[c(2:4, 6:10, 13:18)] <- NA
    x

    2014-09-20 2014-09-21 2014-09-22 2014-09-23 2014-09-24 2014-09-25 2014-09-26
             1         NA         NA         NA          5         NA         NA
    2014-09-27 2014-09-28 2014-09-29 2014-09-30 2014-10-01 2014-10-02 2014-10-03
            NA         NA         NA         11         12         NA         NA
    2014-10-04 2014-10-05 2014-10-06 2014-10-07 2014-10-08 2014-10-09
            NA         NA         NA         NA         19         20

Desired
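A sketch of one way to do this, based on the prose description above (the desired output itself is cut off in the excerpt) and not taken from the thread: carry values forward with `na.locf`, then reset to NA every position that lies more than 3 steps after the last observed value. Note that the `maxgap` argument of `na.locf` behaves differently, leaving runs longer than `maxgap` entirely unfilled. The fixed start date replaces `Sys.Date()` so the result is reproducible; `run_id`, `pos`, and `max_fill` are illustrative names.

```r
library(zoo)

# Same construction as in the question, with a fixed date instead of Sys.Date()
x <- zoo(1:20, as.Date("2014-09-19") + 1:20)
x[c(2:4, 6:10, 13:18)] <- NA

max_fill <- 3  # fill at most this many consecutive NAs

# Carry the last observation forward everywhere first
filled <- na.locf(x, na.rm = FALSE)

# run_id increases at every observed value, so each observed value and the
# NAs that follow it share one id; pos is the distance from that observed value
run_id <- cumsum(!is.na(coredata(x)))
pos <- ave(seq_along(run_id), run_id, FUN = seq_along) - 1

# Undo the fill beyond max_fill steps
coredata(filled)[pos > max_fill] <- NA

print(filled)
```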

Add months of zero demand to zoo time series

為{幸葍}努か submitted on 2019-11-27 08:09:18
Question: I have some intermittent demand data that only includes lines where demand is present. I bring it in via read.csv, and my 2 columns are Date (as date) and Quantity (as integer). Then I convert it to a zoo series and combine the daily demand into monthly demand. My final output is a zoo series with the date being the first day of the month and the summed demand for that month. My problem is that this zoo series is missing the in-between months that have zero demand, and I need these to forecast
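A minimal sketch of one way to add the missing months, assuming the monthly series is indexed by first-of-month `Date` values as described: merge it with an empty zoo object on a complete monthly grid and fill the new entries with 0. The sample values and the names `monthly`, `grid`, and `full` are hypothetical, since the excerpt shows no data.

```r
library(zoo)

# Hypothetical monthly demand with gaps (February, April, May are absent)
monthly <- zoo(c(10, 4, 7),
               as.Date(c("2013-01-01", "2013-03-01", "2013-06-01")))

# Every first-of-month date between the first and last observation
grid <- seq(start(monthly), end(monthly), by = "month")

# Merging with a zero-width zoo object adds the missing months;
# fill = 0 sets their demand to zero instead of NA
full <- merge(monthly, zoo(, grid), fill = 0)

print(full)
```

If the forecast needs months before the first or after the last observed demand, the grid endpoints can be set explicitly instead of using `start()` and `end()`.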

optimized rolling functions on irregular time series with time-based window

别来无恙 submitted on 2019-11-27 07:50:14
Is there some way to use rollapply (from the zoo package or something similar) or the optimized functions (rollmean, rollmedian, etc.) to compute rolling functions with a time-based window, instead of one based on a number of observations? What I want is simple: for each element in an irregular time series, I want to compute a rolling function with an N-day window. That is, the window should include all the observations up to N days before the current observation. The time series may also contain duplicates. Here follows an example. Given the following time series:

    date       value
    1/11/2011  5
    1/11/2011  4
    1/11