Moving window method to aggregate data

Posted by 流过昼夜 on 2020-01-11 13:42:12

Question


I have the matrix below:

 mat<- matrix(c(1,0,0,0,0,0,1,0,0,0,0,0,0,0,2,0,
       2,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,
       0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,
       0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
       0,0,0,0,1,0,0,1,0,1,1,0,0,1,0,1,
       1,1,0,0,0,0,0,0,1,0,1,2,1,0,0,0), nrow=16, ncol=6)
 dimnames(mat)<- list(c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"), 
              c("1", "2", "3", "4", "5", "6"))

I need to aggregate columns using a moving window. First, the window size is 2, so each window spans 2 adjacent columns, and row sums are taken across those columns. The window then shifts by one column and row sums are taken again. For the example matrix, the first window combines columns 1 and 2, the second combines columns 2 and 3, then 3 and 4, 4 and 5, and 5 and 6.

These row sums are collected into a result matrix. Its rows match the rows of the original matrix, and each column holds the result for one window.
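For a window of two columns, the sliding row sums reduce to adding the matrix without its last column to the matrix without its first column. A minimal sketch on a small made-up matrix (toy is a hypothetical stand-in, not the mat defined above):

```r
# Hypothetical 3 x 4 toy matrix, used only for illustration
toy <- matrix(c(1, 0, 2,
                0, 1, 0,
                3, 0, 1,
                0, 2, 0),
              nrow = 3,
              dimnames = list(c("a", "b", "c"), c("1", "2", "3", "4")))

n <- ncol(toy)

# Window of width 2: columns 1..(n-1) plus columns 2..n, element by element
win2 <- toy[, -n, drop = FALSE] + toy[, -1, drop = FALSE]

# Label each result column by the two source columns it combines
colnames(win2) <- paste(colnames(toy)[-n], colnames(toy)[-1], sep = "_")
win2
```

This shortcut only covers width 2; wider windows need either more shifted terms or a general loop over window positions.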

Next, the window size increases to 3, so that 3 adjacent columns are summed. As before, the window shifts by one column at a time. For the example matrix, the first window combines columns 1-2-3, the second combines columns 2-3-4, then 3-4-5 and 4-5-6. These results go into a separate matrix.

The window size keeps increasing until it equals the number of columns. In this example, the largest window combines all 6 columns.
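The number of result matrices and their sizes follow from the column count: a window of width w sliding over n columns has n - w + 1 positions. A quick check of that arithmetic for the 6-column example (the variable names are illustrative only):

```r
n <- 6                          # number of columns in the example matrix
widths <- 2:n                   # window sizes 2, 3, ..., n
n_positions <- n - widths + 1   # number of columns in each result matrix

data.frame(width = widths, result_columns = n_positions)
length(widths)                  # number of result matrices: n - 1
```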

Below are the result matrices for window sizes 2 and 3, given the example matrix mat above. Columns are named after the columns that were added together.

# Window length = 2
mat1 <- matrix(c(3,0,0,0,1,0,1,0,0,0,0,0,0,0,3,0,
         2,0,1,1,2,0,0,0,0,0,0,0,0,0,1,0,
         0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,
         0,1,0,0,1,1,0,1,0,1,1,0,0,1,0,1,
         1,1,0,0,1,0,0,1,1,1,2,2,1,1,0,1), nrow=16)
dimnames(mat1) <- list(c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"), 
              c("1_2", "2_3", "3_4", "4_5", "5_6"))

 # Window length = 3
 mat8<- matrix( c(3,0,1,1,2,0,1,0,0,0,0,0,0,0,3,0,
         2,1,1,1,2,1,0,0,0,0,0,0,0,0,1,0,
         0,1,1,1,2,1,0,1,0,1,1,0,0,1,0,1,
         1,2,0,0,1,1,0,1,1,1,2,2,1,1,0,1), nrow=16)
 dimnames(mat8)<- list(c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q", "s", "t", "u", "v","x", "z"), 
              c("1_2_3", "2_3_4", "3_4_5", "4_5_6"))

My example has 6 columns, so there would be 5 result matrices in total. If I had 600 columns of data, I think a loop would be the most efficient way to iterate over such a large dataset.
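With many columns, recomputing row sums for every window repeats a lot of additions. One possible alternative, which is not taken from this thread, is to compute cumulative sums along each row once and obtain every window sum as the difference of two cumulative sums. The sketch below assumes a numeric matrix with at least two columns and no missing values; window_sums is a hypothetical helper name.

```r
# Hypothetical helper: row sums over every window of `width` adjacent columns,
# derived from a single pass of cumulative sums per row.
window_sums <- function(m, width) {
  n <- ncol(m)
  stopifnot(is.matrix(m), n >= 2, width >= 1, width <= n)
  # cs[, k + 1] holds the sum of columns 1..k; cs[, 1] is the empty sum 0
  cs <- cbind(0, t(apply(m, 1, cumsum)))
  starts <- seq_len(n - width + 1)
  # Sum of columns i..(i + width - 1) = cs[, i + width] - cs[, i]
  out <- cs[, starts + width, drop = FALSE] - cs[, starts, drop = FALSE]
  dimnames(out) <- list(rownames(m), NULL)
  out
}

# Small random example (made-up data) checked against direct rowSums calls
set.seed(1)
m <- matrix(sample(0:2, 5 * 8, replace = TRUE), nrow = 5)
direct <- sapply(1:6, function(i) rowSums(m[, i:(i + 2)]))
all.equal(window_sums(m, 3), direct)

# One result matrix per window width, from 2 up to ncol(m)
res <- lapply(2:ncol(m), function(w) window_sums(m, w))
```

Whether this is actually faster than repeated rowSums calls for a given dataset would need to be measured; the sketch only shows that both routes give the same sums.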


Answer 1:


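Here is one way in base R. The outer lapply loops over j, where j + 1 is the window width; the inner lapply loops over the starting column i of each window: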

lapply(seq_len(ncol(mat) - 1), function(j) do.call(cbind, 
   lapply(seq_len(ncol(mat) - j), function(i) rowSums(mat[, i:(i + j)]))))


#[[1]]
#  [,1] [,2] [,3] [,4] [,5]
#a    3    2    0    0    1
#c    0    0    1    1    1
#f    0    1    1    0    0
#h    0    1    1    0    0
#i    1    2    1    1    1
#j    0    0    1    1    0
#l    1    0    0    0    0
#m    0    0    0    1    1
#p    0    0    0    0    1
#q    0    0    0    1    1
#s    0    0    0    1    2
#t    0    0    0    0    2
#u    0    0    0    0    1
#v    0    0    0    1    1
#x    3    1    0    0    0
#z    0    0    0    1    1

#[[2]]
#  [,1] [,2] [,3] [,4]
#a    3    2    0    1
#c    0    1    1    2
#f    1    1    1    0
#h    1    1    1    0
#i    2    2    2    1
#j    0    1    1    1
#l    1    0    0    0
#m    0    0    1    1
#p    0    0    0    1
#q    0    0    1    1
#s    0    0    1    2
#t    0    0    0    2
#u    0    0    0    1
#v    0    0    1    1
#x    3    1    0    0
#z    0    0    1    1
#....
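The list above has unnamed columns, whereas the question labels each column by the source columns it combines. A possible adaptation of the same base R approach that adds those labels is sketched below. Here rolling_rowsums is a hypothetical helper name, and naming the list elements by window width is an assumption about what is convenient, not something the question asks for.

```r
mat <- matrix(c(1,0,0,0,0,0,1,0,0,0,0,0,0,0,2,0,
                2,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,
                0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,
                0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
                0,0,0,0,1,0,0,1,0,1,1,0,0,1,0,1,
                1,1,0,0,0,0,0,0,1,0,1,2,1,0,0,0), nrow = 16, ncol = 6)
dimnames(mat) <- list(c("a", "c", "f", "h", "i", "j", "l", "m", "p", "q",
                        "s", "t", "u", "v", "x", "z"),
                      c("1", "2", "3", "4", "5", "6"))

# Hypothetical helper: one matrix of windowed row sums per window width
rolling_rowsums <- function(m) {
  n <- ncol(m)
  widths <- seq_len(n)[-1]                  # 2, 3, ..., n
  res <- lapply(widths, function(w) {
    starts <- seq_len(n - w + 1)            # first column of each window
    out <- do.call(cbind, lapply(starts, function(i)
      rowSums(m[, i:(i + w - 1), drop = FALSE])))
    # Name each column after the source columns it sums, e.g. "1_2_3"
    colnames(out) <- vapply(starts, function(i)
      paste(colnames(m)[i:(i + w - 1)], collapse = "_"), character(1))
    out
  })
  names(res) <- widths                      # list names "2", "3", ...
  res
}

res <- rolling_rowsums(mat)
colnames(res[["3"]])   # "1_2_3" "2_3_4" "3_4_5" "4_5_6"
res[["2"]]["a", ]      # row "a" for window width 2
```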

As this is a rolling operation, we can also use rollapply from the zoo package with a variable window width:

lapply(2:ncol(mat), function(j)
    t(zoo::rollapply(seq_len(ncol(mat)), j, function(x) rowSums(mat[,x]))))


Source: https://stackoverflow.com/questions/58021216/moving-window-method-to-aggregate-data
