zoo

rolling computations in xts by month

时间秒杀一切 submitted on 2019-12-01 08:52:32
I am familiar with the zoo function rollapply, which allows you to do rolling computations on zoo or xts objects, and you can specify the rolling increment via the by parameter. I am specifically interested in applying a function every month but using all of the past daily data in the computation. For example, say my data set looks like this:

    dte, val
    1/01/2001, 10
    1/02/2001, 11
    ...
    1/31/2001, 2
    2/01/2001, 54
    2/02/2001, 34
    ...
    2/30/2001, 29

I would like to select the end of each month and apply a function that uses all the daily data. This doesn't seem like it would work with rollapply since the
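One way to approach the question above, as a minimal sketch rather than a definitive answer: find the last row of each month with xts::endpoints() and apply the function to everything from the first row up to that point. The daily series and the use of mean() are assumptions made for illustration; they are not from the original post.

```r
library(xts)

# Hypothetical daily data standing in for the question's dte/val columns
dates <- seq(as.Date("2001-01-01"), as.Date("2001-03-31"), by = "day")
x <- xts(seq_along(dates), order.by = dates)

# endpoints() returns the row number of the last observation in each month,
# preceded by a leading 0 that is dropped here
ep <- endpoints(x, on = "months")[-1]

# Apply the function (mean() is only a placeholder) to all data from the
# first row up to each month end
expanding <- xts(
  sapply(ep, function(i) mean(coredata(x)[1:i])),
  order.by = index(x)[ep]
)
```

Any function of the history so far can replace mean(); the result has one row per month end.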

Convert from annual to quarterly data, constrained to annual average

孤人 submitted on 2019-12-01 06:50:21
I have several variables at annual frequency in R that I would like to include in a regression analysis with other variables available at quarterly frequency. Additionally, I would like to be able to convert the quarterly data back to annual frequency in a way that reproduces the original annual data. My current approach when converting from low frequency to high frequency time series data is to use the na.spline function in the zoo package. However, I don’t see how to constrain the quarterly data to match the corresponding annual average. As a result, when I convert the data back from
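A rough sketch of one way to meet the constraint described above, under stated assumptions: interpolate a smooth quarterly path, then shift each year's four quarters by a constant so that their mean equals the annual value. The annual figures are hypothetical, stats::spline() stands in for the interpolation step, and the additive shift is a deliberately simple adjustment, not a formal temporal disaggregation method.

```r
library(zoo)

# Hypothetical annual averages
years  <- 2001:2004
annual <- c(100, 110, 125, 120)

# Step 1: treat each annual value as sitting at mid-year and spline
# through those points to get a raw quarterly path
t_annual <- years + 0.375
t_qtr    <- rep(years, each = 4) + (0:3) / 4
q_raw    <- spline(t_annual, annual, xout = t_qtr)$y

# Step 2: shift each year's quarters so their mean equals the annual value
yr    <- rep(years, each = 4)
q_adj <- q_raw + (annual[match(yr, years)] - ave(q_raw, yr))
quarterly <- zoo(q_adj, as.yearqtr(t_qtr))

# Check: averaging the quarters by year reproduces the annual data
back <- aggregate(quarterly, floor(as.numeric(index(quarterly))), mean)
```

The constant shift can leave small jumps at year boundaries. If that matters, dedicated temporal disaggregation methods (for example in the tempdisagg package) are worth a look.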

Feeding an hourly zoo time-series into function stl()

痴心易碎 submitted on 2019-12-01 06:08:17
Question: Before you ask, yes I need to show this much data. stl() requires two periods of data. In this case, one period is 24 values, so stl() wants at least 48 values. Also, from the stl() help: "....This should be an object of class "ts" with a frequency greater than one...." I'm upgrading some old calcs so that my data is in the zoo format. So far, I've upgraded monthly and daily data without any noticeable problems, although there have been stl() speed issues. I'm now down to hourly data. When you
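A minimal sketch of one way to hand hourly zoo data to stl(), assuming the series is regular with no gaps and the period of interest is one day: drop the index, build a ts with frequency = 24, and put the result back on the original timestamps. The simulated series is hypothetical.

```r
library(zoo)

# Hypothetical hourly series: 4 days of data with a daily cycle
n   <- 24 * 4
idx <- as.POSIXct("2019-01-01 00:00", tz = "UTC") + 3600 * (0:(n - 1))
set.seed(1)
z   <- zoo(sin(2 * pi * (0:(n - 1)) / 24) + rnorm(n, sd = 0.1), idx)

# stl() needs a "ts" with frequency > 1; 24 observations make one period.
# The POSIXct index is discarded, so this assumes there are no missing hours.
y   <- ts(coredata(z), frequency = 24)
fit <- stl(y, s.window = "periodic")

# Re-attach the seasonal component to the original timestamps
seasonal <- zoo(as.numeric(fit$time.series[, "seasonal"]), index(z))
```

If the hourly data has holes, it would need to be put on a regular grid and filled before this step.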

zoo/xts - can't do math on 1-cell subsets? R hangs

喜夏-厌秋 submitted on 2019-12-01 03:55:30
Question: I'm using the latest versions of R/xts/zoo on Windows: R 2.15, xts 0.8-6, zoo 1.7-7. I'm seeing the following bizarre behavior, which was not the case with prior versions:

    library(xts)
    data(sample_matrix)
    sample.xts <- as.xts(sample_matrix)
    sample.xts[1, 2] - sample.xts[2,2] # results in numeric(0)?!?!?!
    (sample.xts[ 1, 2] - sample.xts[2,2])/sample.xts[3,1] # if I run this twice R locks up

Here I have subset an XTS object to a single cell. Subtraction no longer works. Also, division causes R to
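A short sketch of what is likely going on with the empty result, and one workaround. Arithmetic between xts objects aligns them on their time index first, and two single cells from different rows share no timestamps, so nothing is left to subtract. Stripping the index with as.numeric() gives ordinary numeric arithmetic. This only addresses the empty result; it does not explain the reported hang.

```r
library(xts)
data(sample_matrix)
sample.xts <- as.xts(sample_matrix)

# xts arithmetic merges on the time index; rows 1 and 2 have different
# timestamps, so the aligned result is empty
aligned <- sample.xts[1, 2] - sample.xts[2, 2]

# Dropping the index gives plain numeric arithmetic
diff12 <- as.numeric(sample.xts[1, 2]) - as.numeric(sample.xts[2, 2])
ratio  <- diff12 / as.numeric(sample.xts[3, 1])
```

coredata() works the same way as as.numeric() here if the matrix shape should be kept.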

na.locf but don't do trailing NAs

这一生的挚爱 submitted on 2019-12-01 03:51:36
I have the following time series:

    > y <- xts(1:10, Sys.Date()+1:10)
    > y[c(1,2,5,9,10)] <- NA
    > y
               [,1]
    2011-09-04   NA
    2011-09-05   NA
    2011-09-06    3
    2011-09-07    4
    2011-09-08   NA
    2011-09-09    6
    2011-09-10    7
    2011-09-11    8
    2011-09-12   NA
    2011-09-13   NA

A straight na.locf gives me this:

    > na.locf(y)
               [,1]
    2011-09-04   NA
    2011-09-05   NA
    2011-09-06    3
    2011-09-07    4
    2011-09-08    4
    2011-09-09    6
    2011-09-10    7
    2011-09-11    8
    2011-09-12    8
    2011-09-13    8

How do I get to this?

               [,1]
    2011-09-04   NA
    2011-09-05   NA
    2011-09-06    3
    2011-09-07    4
    2011-09-08    4
    2011-09-09    6
    2011-09-10    7
    2011-09-11    8
    2011-09-12   NA
    2011-09-13   NA

I don't want last
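One possible way to get the desired output above, sketched with fixed dates instead of Sys.Date() so that it is reproducible: run na.locf() as usual, then set everything after the last observed value back to NA. The variable names filled and last_obs are introduced here for illustration.

```r
library(xts)

# Same series as in the question, with fixed dates
y <- xts(1:10, as.Date("2011-09-03") + 1:10)
y[c(1, 2, 5, 9, 10)] <- NA

# Carry observations forward, keeping leading NAs
filled <- na.locf(y, na.rm = FALSE)

# Position of the last non-missing value in the original series
last_obs <- max(which(!is.na(coredata(y))))

# Restore the trailing NAs that na.locf() filled in
if (last_obs < NROW(y)) {
  filled[(last_obs + 1):NROW(y)] <- NA
}
```

Interior gaps are still filled; only the run of NAs after the final observation is left untouched.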

Length of Trend - Panel Data

社会主义新天地 submitted on 2019-12-01 00:36:07
I have a well-balanced panel data set which contains NA observations. I will be using LOCF, and would like to know how many consecutive NAs are in each panel before carrying observations forward. LOCF is a procedure whereby missing values can be "filled in" using the "last observation carried forward". This can make sense in some time-series applications; perhaps we have weather data in 5 minute increments: a good guess at the value of a missing observation might be an observation made 5 minutes earlier. Obviously, it makes more sense to carry an observation forward one hour within one
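A minimal sketch of counting consecutive NAs per panel unit before applying LOCF, using run-length encoding from base R. The data frame, its column names (id, t, x), and the helper na_runs() are hypothetical and only serve the illustration.

```r
# Hypothetical panel: id identifies the unit, x contains missing values
panel <- data.frame(
  id = rep(c("A", "B"), each = 6),
  t  = rep(1:6, times = 2),
  x  = c(1, NA, NA, 4, NA, 6,   NA, 2, 3, NA, NA, NA)
)

# Lengths of each run of consecutive NAs within one unit
na_runs <- function(v) {
  r <- rle(is.na(v))
  r$lengths[r$values]
}

# One vector of run lengths per unit
runs_by_id <- lapply(split(panel$x, panel$id), na_runs)

# Longest run per unit (0 if the unit has no NAs)
longest <- vapply(runs_by_id, function(r) max(0L, r), integer(1))
```

The longest run per unit can then be compared against whatever gap length is still considered reasonable to carry forward.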

R: Efficiently subsetting dataframe based on time of day

邮差的信 submitted on 2019-11-30 22:31:24
I have a large (150,000x7) dataframe that I intend to use for back-testing and real-time analysis of a financial market. The data represents the condition of an investment vehicle at 5 minute intervals (although holes do exist). It looks like this (but much longer):

           pTime     Time  Price       M1       M2       M3        M4
    1 1212108300 20:45:00 1.5518 12.21849 -0.37125  4.50549 -31.00559
    2 1212108900 20:55:00 1.5516 11.75350 -0.81792 -1.53846 -32.12291
    3 1212109200 21:00:00 1.5512 10.75070 -1.47438 -8.24176 -34.35754
    4 1212109500 21:05:00 1.5514 10.23529 -1.06044 -8.46154 -33.24022
    5 1212109800 21:10:00 1.5514  9.74790 -1
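A small sketch of one vectorised way to select rows by time of day, assuming the Time column holds zero-padded "HH:MM:SS" strings as in the sample above. It uses the first four complete rows and only three of the columns; the 21:00 to 22:00 window is an arbitrary choice for illustration.

```r
# First four rows of the sample data, limited to the columns needed here
df <- data.frame(
  pTime = c(1212108300, 1212108900, 1212109200, 1212109500),
  Time  = c("20:45:00", "20:55:00", "21:00:00", "21:05:00"),
  Price = c(1.5518, 1.5516, 1.5512, 1.5514),
  stringsAsFactors = FALSE
)

# Convert "HH:MM:SS" to seconds since midnight in one vectorised pass
hms  <- do.call(rbind, strsplit(df$Time, ":", fixed = TRUE))
secs <- as.numeric(hms[, 1]) * 3600 +
        as.numeric(hms[, 2]) * 60 +
        as.numeric(hms[, 3])

# Keep rows from 21:00:00 (inclusive) to 22:00:00 (exclusive)
in_window <- secs >= 21 * 3600 & secs < 22 * 3600
evening   <- df[in_window, ]
```

Computing secs once and storing it as a column avoids repeating the string parsing for every new window.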

Replacing all NAs with smoothing spline

元气小坏坏 submitted on 2019-11-30 21:32:00
Below is the sample data (out of approximately 8000 rows of data). How can I replace all NAs with values from a smoothing spline fit to the rest of the data?

    Date       Max   Min   Rain  RHM  RHE
    4/24/1981  35.9  24.7  0.0   71   37
    4/25/1981  36.8  22.8  0.0   62   40
    4/26/1981  36.0  22.6  0.0   47   37
    4/27/1981  35.1  24.2  0.0   51   39
    4/28/1981  35.4  23.8  0.0   61   47
    4/29/1981  35.4  25.1  0.0   67   43
    4/30/1981  37.4  24.8  0.0   72   34
    5/1/1981   NA    NA    NA    NA   NA
    5/2/1981   39.0  25.3  NA    NA   55
    5/3/1981   35.9  23.0  0.0   68   66
    5/4/1981   28.4  22.4  0.7   70   30
    5/5/1981   35.5  24.6  0.0   47   31
    5/6/1981   37.4  25.5  0.0   51   31

I'm using some simplified data for the
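A minimal sketch of one way to do this for a single column, using the Max values from the sample above: fit stats::smooth.spline() to the non-missing points and predict at the missing dates. The helper name fill_smooth() is introduced here for illustration, and the default smoothing parameter selection is an assumption, not something the question specifies.

```r
# Date and Max columns from the sample data
dat <- data.frame(
  Date = as.Date("1981-04-24") + 0:12,
  Max  = c(35.9, 36.8, 36.0, 35.1, 35.4, 35.4, 37.4, NA,
           39.0, 35.9, 28.4, 35.5, 37.4)
)

# Replace NAs in v with predictions from a smoothing spline fitted to the
# observed values; observed values are left unchanged
fill_smooth <- function(v, t) {
  ok  <- !is.na(v)
  tt  <- as.numeric(t)
  fit <- smooth.spline(tt[ok], v[ok])
  v[!ok] <- predict(fit, tt[!ok])$y
  v
}

dat$Max_filled <- fill_smooth(dat$Max, dat$Date)
```

The same helper could be applied to the other columns, for example with lapply() over the column names. Note that this differs from zoo::na.spline(), which interpolates through the observed points instead of smoothing them.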

Length of Trend - Panel Data

允我心安 submitted on 2019-11-30 19:38:36
Question: I have a well-balanced panel data set which contains NA observations. I will be using LOCF, and would like to know how many consecutive NAs are in each panel before carrying observations forward. LOCF is a procedure whereby missing values can be "filled in" using the "last observation carried forward". This can make sense in some time-series applications; perhaps we have weather data in 5 minute increments: a good guess at the value of a missing observation might be an observation made 5

R: Efficiently subsetting dataframe based on time of day

随声附和 submitted on 2019-11-30 18:04:05
Question: I have a large (150,000x7) dataframe that I intend to use for back-testing and real-time analysis of a financial market. The data represents the condition of an investment vehicle at 5 minute intervals (although holes do exist). It looks like this (but much longer):

           pTime     Time  Price       M1       M2       M3        M4
    1 1212108300 20:45:00 1.5518 12.21849 -0.37125  4.50549 -31.00559
    2 1212108900 20:55:00 1.5516 11.75350 -0.81792 -1.53846 -32.12291
    3 1212109200 21:00:00 1.5512 10.75070 -1.47438 -8.24176 -34.35754
    4