Convert from annual to quarterly data, constrained to annual average

走了就别回头了 2021-01-06 08:16

I have several variables at annual frequency in R that I would like to include in a regression analysis with other variables available at quarterly frequency. Additionally,

3 answers
  •  梦毁少年i
    2021-01-06 08:46

    We can adjust the output of na.spline so that each year's four quarterly values average to the annual value, either by shifting all four quarters or by shifting only the last three. In the first case, subtract the mean of the four quarters from each quarter and then add the annual value to each quarter. In the second case, leave the first quarter unchanged, subtract the mean of the last three quarters from each of those three quarters, and add the annual value. In the code, x[1] (the first-quarter value of z_q) serves as the annual value, because the spline passes through the annual observations.

    In each case averaging the z_q_adj values over the four quarters of a year will recover the original annual value.

    Here are the two approaches mentioned:

    # 1
    yr <- format(time(c), "%Y")
    c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) x - mean(x) + x[1])
    

    giving:

    > c
               z_a      z_q   z_q_adj
    2000-01-01 100 100.0000  95.36604
    2000-04-01  NA 103.4434  98.80946
    2000-07-01  NA 106.4080 101.77405
    2000-10-01  NA 108.6844 104.05046
    2001-01-01 110 110.0000 109.39295
    2001-04-01  NA 110.5723 109.96527
    2001-07-01  NA 110.8719 110.26484
    2001-10-01  NA 110.9840 110.37694
    2002-01-01 111 111.0000 110.86116
    2002-04-01  NA 111.0150 110.87615
    2002-07-01  NA 111.1219 110.98311
    2002-10-01  NA 111.4184 111.27958
    
    
    # 2
    c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) c(x[1], x[-1] - mean(x[-1]) +x[1]))
    

    giving:

    > c
               z_a      z_q  z_q_adj
    2000-01-01 100 100.0000 100.0000
    2000-04-01  NA 103.4434  97.2648
    2000-07-01  NA 106.4080 100.2294
    2000-10-01  NA 108.6844 102.5058
    2001-01-01 110 110.0000 110.0000
    2001-04-01  NA 110.5723 109.7629
    2001-07-01  NA 110.8719 110.0625
    2001-10-01  NA 110.9840 110.1746
    2002-01-01 111 111.0000 111.0000
    2002-04-01  NA 111.0150 110.8299
    2002-07-01  NA 111.1219 110.9368
    2002-10-01  NA 111.4184 111.2333
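
    As a quick check of the claim that each adjustment recovers the annual values, here is a minimal sketch in base R. It assumes plain numeric vectors rather than the zoo object c, copies the z_q values from the tables above, and uses illustrative names (adj1, adj2, means1, means2) that are not part of the original answer.

    ```r
    # Quarterly spline values copied from the z_q column above
    # (three years, four quarters each).
    z_q <- c(100.0000, 103.4434, 106.4080, 108.6844,
             110.0000, 110.5723, 110.8719, 110.9840,
             111.0000, 111.0150, 111.1219, 111.4184)
    yr  <- rep(2000:2002, each = 4)  # year label for each quarter
    z_a <- c(100, 110, 111)          # original annual values

    # Approach 1: shift all four quarters so their mean equals the annual value.
    adj1 <- ave(z_q, yr, FUN = function(x) x - mean(x) + x[1])

    # Approach 2: keep the first quarter and shift the last three so that
    # they average to the annual value.
    adj2 <- ave(z_q, yr, FUN = function(x) c(x[1], x[-1] - mean(x[-1]) + x[1]))

    # Annual means of each adjusted series, in year order.
    means1 <- as.numeric(tapply(adj1, yr, mean))
    means2 <- as.numeric(tapply(adj2, yr, mean))

    print(all.equal(means1, z_a))
    print(all.equal(means2, z_a))
    ```

    Both comparisons use all.equal rather than == so that floating-point rounding does not cause a spurious mismatch.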
    

    ADDED: If you want to know whether a series was interpolated, some approaches are:

    • add a comment to the series, e.g. comment(c) <- "Originally annual", or

    • use a naming convention, e.g. add _a to the series name if it was originally annual: c_a <- c, or

    • if it's OK to retain both the z_q and z_q_adj columns, then for series that originated from quarterly data the two columns should be the same, and otherwise not, or

    • keep a column for both the original data and the quarterly data
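
    The first option in the list above can be sketched as follows. This is only an illustration on a plain numeric vector with hypothetical names (z_q, origin, was_annual); comment() is base R, but whether the note survives later processing depends on the operations applied, since subsetting a plain vector drops most attributes.

    ```r
    # Hypothetical vector standing in for an interpolated quarterly series.
    z_q <- c(100.0000, 103.4434, 106.4080, 108.6844)

    # Attach a note recording where the series came from.
    comment(z_q) <- "Originally annual"

    # print() does not show the comment, so query it explicitly.
    origin <- comment(z_q)
    was_annual <- identical(origin, "Originally annual")
    print(was_annual)
    ```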
