zoo | 易学教程

Convert from annual to quarterly data, constrained to annual average

Question: I have several variables at annual frequency in R that I would like to include in a regression analysis with other variables available at quarterly frequency. Additionally, I would like to be able to convert the quarterly data back to annual frequency in a way that reproduces the original annual data. My current approach when converting from low-frequency to high-frequency time series data is to use the na.spline function in the zoo package. However, I don’t see how to constrain the quarterly
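One way to picture the constraint described in the question is sketched below. It is only an illustration, not the asker's method or an accepted answer: the annual values are invented, placing each annual value in Q2 before calling na.spline is an arbitrary choice, and the additive shift per year is a crude stand-in for dedicated temporal disaggregation methods such as Denton-type procedures.

```r
library(zoo)

# Hypothetical annual series (values invented for illustration).
annual <- zoo(c(100, 104, 110, 108), 2001:2004)

# Quarterly grid: place each annual value in Q2 and spline the gaps.
qidx <- as.yearqtr(seq(2001, 2004.75, by = 0.25))
vals <- rep(NA_real_, length(qidx))
vals[seq(2, length(qidx), by = 4)] <- coredata(annual)
smooth <- na.spline(zoo(vals, qidx))

# Shift each year's quarters so that their mean equals the annual value.
yr <- floor(as.numeric(time(smooth)))
target <- coredata(annual)[match(yr, time(annual))]
q <- as.numeric(coredata(smooth))
constrained <- zoo(q - ave(q, yr) + target, time(smooth))

# Round trip: averaging the quarters by year recovers the annual data.
back <- aggregate(constrained, yr, mean)
```

The shift keeps the within-year shape of the spline but can introduce jumps between Q4 and the following Q1, which is why smoother constrained methods are usually preferred in practice.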

ZooKeeper: An Expert Programmer Teaches You ZooKeeper, and It Is Not Actually That Hard

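Preface: ZooKeeper is a distributed, open-source coordination service for distributed applications. It is an open-source implementation of Google's Chubby and an important component of Hadoop and HBase. It provides consistency services for distributed applications, including configuration maintenance, naming services, distributed synchronization, and group services. ZooKeeper's goal is to encapsulate complex, error-prone key services and to give users a simple, easy-to-use interface on top of an efficient and stable system.

Introduction to ZooKeeper: ZooKeeper is an open-source coordination service for distributed applications. It exposes a simple set of primitives on which distributed applications can build synchronization, configuration maintenance, naming services, and so on.

ZooKeeper design goals:
1. Eventual consistency: whichever server a client connects to, it is shown the same view. This is ZooKeeper's most important property.
2. Reliability: the service is simple, robust, and performs well. If a message m is accepted by one server, it will be accepted by all servers.
3. Timeliness: ZooKeeper guarantees that a client will receive server updates, or notice of a server failure, within a bounded time interval. Because of network delays and similar factors, however, ZooKeeper cannot guarantee that two clients see freshly updated data at the same moment. If the latest data is required, call the sync() interface before reading.
4. Wait-free: slow or failed clients must not interfere with the requests of fast clients, so that every client can wait effectively.
5. Atomicity: an update can only succeed or fail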

Aggregate daily level data to weekly level in R

Question: I have a huge dataset similar to the following reproducible sample data.

          Interval  value
     1  2012-06-10    552
     2  2012-06-11   4850
     3  2012-06-12   4642
     4  2012-06-13   4132
     5  2012-06-14   4190
     6  2012-06-15   4186
     7  2012-06-16   1139
     8  2012-06-17    490
     9  2012-06-18   5156
    10  2012-06-19   4430
    11  2012-06-20   4447
    12  2012-06-21   4256
    13  2012-06-22   3856
    14  2012-06-23   1163
    15  2012-06-24    564
    16  2012-06-25   4866
    17  2012-06-26   4421
    18  2012-06-27   4206
    19  2012-06-28   4272
    20  2012-06-29   3993
    21  2012-06-30   1211
    22  2012-07-01    698
    23
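A minimal base R sketch of a daily-to-weekly roll-up, using the first eight rows of the sample above. The excerpt is cut off before it says which summary is wanted, so the weekly sum and the Monday-start weeks (the default of cut() for dates) are assumptions, as are the names daily and weekly.

```r
# First eight rows of the sample data shown in the question.
daily <- data.frame(
  Interval = as.Date("2012-06-10") + 0:7,
  value = c(552, 4850, 4642, 4132, 4190, 4186, 1139, 490)
)

# Label each day with the Monday that starts its week.
daily$week <- cut(daily$Interval, "week")

# Assumed summary: total value per week.
weekly <- aggregate(value ~ week, data = daily, FUN = sum)
```

Swapping sum for mean, or passing start.on.monday = FALSE to cut(), changes the summary or the week boundary without changing the structure.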

Efficient way to perform running total in the last 365 day window

Question: This is what my data frame looks like:

    library(data.table)
    df <- fread('
    Name  EventType  Date      SalesAmount  RunningTotal  Runningtotal(prior365Days)
    John  Email      1/1/2014  0            0             0
    John  Sale       2/1/2014  10           10            10
    John  Sale       7/1/2014  20           30            30
    John  Sale       4/1/2015  30           60            50
    John  Webinar    5/1/2015  0            60            50
    Tom   Email      1/1/2014  0            0             0
    Tom   Sale       2/1/2014  15           15            15
    Tom   Sale       7/1/2014  10           25            25
    Tom   Sale       4/1/2015  25           50            35
    Tom   Webinar    5/1/2015  0            50            35
    ')
    df[,Date:= as.Date(Date, format="%m/%d/%Y")]

The last column was my
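The trailing-365-day total in the last column of that table can be reproduced with the base R sketch below. It is meant to pin down the definition rather than to be efficient: it compares every row with every other row, and the window boundary (strictly after Date - 365, up to and including Date) is an assumption, since no sample row falls exactly on the boundary. The names sales and Prior365 are illustrative.

```r
# Sample data from the question (Name, Date, SalesAmount only).
sales <- data.frame(
  Name = rep(c("John", "Tom"), each = 5),
  Date = rep(as.Date(c("1/1/2014", "2/1/2014", "7/1/2014", "4/1/2015", "5/1/2015"),
                     format = "%m/%d/%Y"), times = 2),
  SalesAmount = c(0, 10, 20, 30, 0, 0, 15, 10, 25, 0)
)

# For each row, sum the same person's sales dated within the prior 365 days.
sales$Prior365 <- sapply(seq_len(nrow(sales)), function(i) {
  in_window <- sales$Name == sales$Name[i] &
    sales$Date > sales$Date[i] - 365 &
    sales$Date <= sales$Date[i]
  sum(sales$SalesAmount[in_window])
})
```

On large data this quadratic scan is the part to replace, for example with a non-equi join or a cumulative-sum difference, but the result should match this reference.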

Efficiently removing missing values from the start and end of multiple time series in 1 data frame

Question: Using R, I'm trying to trim NA values from the start and end of a data frame that contains multiple time series. I have achieved my goal using a for loop and the zoo package, but as expected it is extremely inefficient on large data frames. My data frame looks like this and contains 3 columns, with each time series identified by its unique id. In this case AAA, B and CCC.

    id   date        value
    AAA  2010/01/01  NA
    AAA  2010/02/01  34
    AAA  2010/03/01  35
    AAA  2010/04/01  30
    AAA  2010/05/01  NA
    AAA  2010/06/01  28
    B
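A loop-free way to trim leading and trailing NA values per id might look like the sketch below. The AAA rows are taken from the sample above; the B rows are invented because the excerpt is cut off, and the helper name inside is hypothetical. Interior NA values are kept on purpose, which is an assumption about what the asker wants.

```r
# AAA rows come from the question; the B rows are invented for illustration.
d <- data.frame(
  id = c(rep("AAA", 6), rep("B", 3)),
  date = as.Date(c("2010/01/01", "2010/02/01", "2010/03/01", "2010/04/01",
                   "2010/05/01", "2010/06/01",
                   "2010/01/01", "2010/02/01", "2010/03/01"),
                 format = "%Y/%m/%d"),
  value = c(NA, 34, 35, 30, NA, 28, NA, 12, NA)
)

# TRUE from the first non-NA value through the last non-NA value of a series.
inside <- function(ok) cumsum(ok) > 0 & rev(cumsum(rev(ok))) > 0

# Apply the rule within each id (rows are assumed to be sorted by date).
keep <- ave(!is.na(d$value), d$id, FUN = inside)
trimmed <- d[keep, ]
```

Because ave() works group by group on a single vector, the whole data frame is filtered in one pass instead of one loop iteration per series.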

rollapply with “growing” window

Question: Guys, normally when you do something like:

    tmp = zoo(rnorm(100), 1:100)
    rollapply(tmp, 10, function(x) quantile(x, 0.05), align="right")

Quite rightly, rollapply starts calculating values from the moment 10 elements are available. Unfortunately, I need something that uses as much data as possible for the first 10 observations: essentially a growing window of data until there is enough data to use a sliding window, e.g. 1, 1:2, 1:3, 1:4, etc. till we have at least 10 elements and then slide
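The growing-then-sliding window described here matches what rollapply's partial argument does, so one possible sketch is the following. The seed and the name grown are illustrative additions; rollapplyr is zoo's right-aligned wrapper around rollapply.

```r
library(zoo)

set.seed(1)  # illustrative seed so the run is repeatable
tmp <- zoo(rnorm(100), 1:100)

# partial = TRUE lets the window shrink at the start of the series:
# it covers 1, 1:2, 1:3, ... until 10 points exist, then slides.
grown <- rollapplyr(tmp, 10, function(x) quantile(x, 0.05), partial = TRUE)
```

With partial = TRUE the result has one value per observation instead of starting at the tenth element. A numeric value for partial sets the minimum number of points a window must contain.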

R/zoo: index entries in ‘order.by’ are not unique

Question: I have a .csv file containing 4 columns of data against a column of dates/times at one-minute intervals. Some timestamps are missing, so I'm trying to generate the missing dates/times and assign them NA values in the Y columns. I have previously done this with other .csv files with exactly the same formatting, with no issues. The code is:

    # read the csv file
    har10 = read.csv(fpath, header=TRUE);
    # set date
    har10$HAR.TS<-as.POSIXct(har10$HAR.TS,format="%y/%m/%d %H:%M")
    # convert to zoo
    df1.zoo
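The message in the title means that some timestamps repeat in the index. The excerpt is cut off before the zoo call, so the cause in the asker's file is unknown; the sketch below only shows, on invented data, how repeated timestamps can be detected and dropped before building the series and filling the one-minute grid. Keeping the first of each repeated timestamp is an assumption, and all names here are illustrative.

```r
library(zoo)

# Invented one-minute data with a repeated 00:01 and a missing 00:02.
stamps <- as.POSIXct(c("2015-03-01 00:00", "2015-03-01 00:01",
                       "2015-03-01 00:01", "2015-03-01 00:03"), tz = "UTC")
vals <- c(1, 2, 3, 4)

# Flag repeated timestamps and keep the first occurrence of each.
dup <- duplicated(stamps)
z <- zoo(vals[!dup], stamps[!dup])

# Merge with an empty series on the full grid so gaps become NA.
grid <- seq(start(z), end(z), by = "min")
filled <- merge(z, zoo(, grid))
```

Inspecting stamps[dup] before dropping anything is worthwhile, since repeats can also come from a parsing format that does not match the file.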

Does rollapply() allow an array of results from call to function?

Question:

    # Loading packages
    require(forecast)
    require(quantmod)
    # Loading OHLC xts object
    getSymbols('SPY', from = '1950-01-01')
    # Selecting weekly Close prices
    x <- Cl(to.weekly(SPY))
    # ARIMA(p,d,q) estimation and forecasting function
    a.ari.fun <- function(x) {
      a.ari <- auto.arima(x = x, d = 1, max.p = 50, max.q = 50, max.P = 50, max.Q = 50, ic = 'aic', approximation = TRUE)
      fore <- forecast.Arima(object = a.ari, h = 4, level = c(.9))
      supp <- tail(fore$lower, 1)
      rest <- tail(fore$upper, 1)
      return(c

Modifying Plot in ggplot2 using as.yearmon from zoo

Question: I have created a graph in ggplot2 using zoo to create month bins. However, I want to be able to modify the graph so it looks like a standard ggplot graph. This means that the bins that aren't used are dropped and the bins that are used populate the entire bin space. Here is my code:

    library(data.table)
    library(ggplot2)
    library(scales)
    library(zoo)
    testset <- data.table(Date=as.Date(c("2013-07-02","2013-08-03","2013-09-04","2013-10-05","2013-11-06","2013-07-03","2013-08-04","2013-09-05","2013-10-06