Excel SUMIFS equivalent in R

Submitted by ﹥>﹥吖頭↗ on 2019-11-27 22:29:48

Question


I'm very new to R and am looking for ways to recreate an Excel VBA macro and Excel worksheet functions such as SUMIFS. SUMIFS sums a column over the rows whose other columns match multiple conditions.

I have the data frame below and want to compute a new column. The new column is the sum of Sample over all rows whose date range overlaps the current row's StartDate to EndDate range. For example, on line 1 it would be 697 (the sum of the first 3 lines). The specific criterion for row i: include Sample if EndDate >= StartDate[i] & StartDate <= EndDate[i]

 StartDate   EndDate    Sample  *SUMIFS example*
 10/01/14   24/01/14    139         *697*
 12/01/14   26/01/14    136 
 19/01/14   02/02/14    422 
 25/01/14   08/02/14    762 
 29/01/14   12/02/14    899 
 05/02/14   19/02/14    850 
 07/02/14   21/02/14    602 
 09/02/14   23/02/14    180 
 18/02/14   04/03/14    866 

Any comments or pointers would be greatly appreciated.


Answer 1:


You could do this with a loop or with a Cartesian merge. I don't know of any built-in function that does exactly this.

library(dplyr)

x = structure(list(StartDate = structure(c(1389312000, 1389484800, 
1390089600, 1390608000, 1390953600, 1391558400, 1391731200, 1391904000, 
1392681600), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
    EndDate = structure(c(1390521600, 1390694400, 1391299200, 
    1391817600, 1392163200, 1392768000, 1392940800, 1393113600, 
    1393891200), tzone = "UTC", class = c("POSIXct", "POSIXt"
    )), Sample = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 
    180L, 866L)), .Names = c("StartDate", "EndDate", "Sample"
), row.names = c(NA, -9L), class = "data.frame")

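# Copy x with renamed columns so every row can be paired with every other row
x2 = x
names(x2) = c('StartDate2', 'EndDate2', 'Sample2')
# by = NULL asks merge() for the Cartesian product of the two data frames
x3 = merge(x, x2, by = NULL)
# For each original row, sum Sample2 over the paired rows whose range overlaps
x4 = summarise(group_by(x3, StartDate, EndDate),
    sumifs = sum(Sample2[EndDate2 >= StartDate & StartDate2 <= EndDate]))
# Attach the sums back onto the original rows
x_sumifs = merge(x, x4, by = c('StartDate', 'EndDate'))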

This is what the output looks like.

> x_sumifs
   StartDate    EndDate Sample sumifs
1 2014-01-10 2014-01-24    139    697
2 2014-01-12 2014-01-26    136   1459
3 2014-01-19 2014-02-02    422   2358
4 2014-01-25 2014-02-08    762   3671
5 2014-01-29 2014-02-12    899   3715
6 2014-02-05 2014-02-19    850   4159
7 2014-02-07 2014-02-21    602   4159
8 2014-02-09 2014-02-23    180   3397
9 2014-02-18 2014-03-04    866   2498
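
The same all-pairs comparison can also be sketched in base R as a logical matrix instead of a merged data frame. This is only an illustration under stated assumptions: the data frame df is rebuilt here from the question's table (dates read as day/month/two-digit year), and the names s, e and overlap are made up for the example.

```r
# Rebuild the question's table; dates are assumed to be day/month/two-digit year.
df <- data.frame(
  StartDate = as.Date(c("10/01/14", "12/01/14", "19/01/14", "25/01/14", "29/01/14",
                        "05/02/14", "07/02/14", "09/02/14", "18/02/14"),
                      format = "%d/%m/%y"),
  EndDate = as.Date(c("24/01/14", "26/01/14", "02/02/14", "08/02/14", "12/02/14",
                      "19/02/14", "21/02/14", "23/02/14", "04/03/14"),
                    format = "%d/%m/%y"),
  Sample = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)
)

# Work on plain day counts so outer() only sees numeric vectors.
s <- as.numeric(df$StartDate)
e <- as.numeric(df$EndDate)

# overlap[i, j] is TRUE when EndDate[j] >= StartDate[i] and StartDate[j] <= EndDate[i].
overlap <- outer(s, e, "<=") & outer(e, s, ">=")

# Row i of the matrix product adds up Sample[j] over the overlapping rows j.
# Adding 0 turns the logical matrix into a numeric one.
df$sumifs <- as.vector((overlap + 0) %*% df$Sample)

print(df)
```

The matrix has one cell per pair of rows, so like the Cartesian merge it grows with the square of the row count.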



Answer 2:


You could use lapply/sapply from base R to do this. x is taken from @cameron.bracken's post.

x$sumifs <- sapply(seq_len(nrow(x)), function(i) with(x, 
             sum(Sample[EndDate >= StartDate[i] & StartDate <= EndDate[i]])))

x
#   StartDate    EndDate Sample sumifs
#1 2014-01-10 2014-01-24    139    697
#2 2014-01-12 2014-01-26    136   1459
#3 2014-01-19 2014-02-02    422   2358
#4 2014-01-25 2014-02-08    762   3671
#5 2014-01-29 2014-02-12    899   3715
#6 2014-02-05 2014-02-19    850   4159
#7 2014-02-07 2014-02-21    602   4159
#8 2014-02-09 2014-02-23    180   3397
#9 2014-02-18 2014-03-04    866   2498
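
If the same calculation is needed more than once, the sapply call above could be wrapped in a small function. The sketch below is one possible way to do that, not part of the original answer: overlap_sums is a hypothetical helper name, df is rebuilt from the question's table, and vapply is used in place of sapply so the result is always a numeric vector.

```r
# Rebuild the question's table; dates are assumed to be day/month/two-digit year.
df <- data.frame(
  StartDate = as.Date(c("10/01/14", "12/01/14", "19/01/14", "25/01/14", "29/01/14",
                        "05/02/14", "07/02/14", "09/02/14", "18/02/14"),
                      format = "%d/%m/%y"),
  EndDate = as.Date(c("24/01/14", "26/01/14", "02/02/14", "08/02/14", "12/02/14",
                      "19/02/14", "21/02/14", "23/02/14", "04/03/14"),
                    format = "%d/%m/%y"),
  Sample = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)
)

# Hypothetical helper: for each row i, sum `value` over all rows whose
# [start, end] range overlaps row i's range.
overlap_sums <- function(start, end, value) {
  vapply(seq_along(start), function(i) {
    as.numeric(sum(value[end >= start[i] & start <= end[i]]))
  }, numeric(1))
}

df$sumifs <- overlap_sums(df$StartDate, df$EndDate, df$Sample)
print(df)
```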



Answer 3:


Assuming you have the above data in a data frame called df, the sum for a single row i is:

sum(df$Sample[df$EndDate >= df$StartDate[i] & df$StartDate <= df$EndDate[i]])

That is:

  • df$Sample[...] selects the Sample column, keeping only the elements for which the logical condition in [...] is TRUE
  • df$EndDate >= df$StartDate[i] and df$StartDate <= df$EndDate[i] are the conditions from your example, converted to R, with & in between to require both to be true at the same time. Each comparison is vectorised: the whole column is compared against row i's single value, so no loop over the other rows is needed. The [i] index has to stay, though. Without it, each row is compared only with its own dates, and the expression returns one grand total instead of one sum per row. To fill a whole column, evaluate the expression for every i, for example with sapply as in Answer 2
  • sum is the built-in function that adds up the selected values
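
A small sketch may help show what the logical index inside [...] actually selects. It assumes df is rebuilt from the question's table, and the variable names total, row_sum and i are chosen only for this illustration. It compares the expression without a row index against the same expression with one.

```r
# Rebuild the question's table; dates are assumed to be day/month/two-digit year.
df <- data.frame(
  StartDate = as.Date(c("10/01/14", "12/01/14", "19/01/14", "25/01/14", "29/01/14",
                        "05/02/14", "07/02/14", "09/02/14", "18/02/14"),
                      format = "%d/%m/%y"),
  EndDate = as.Date(c("24/01/14", "26/01/14", "02/02/14", "08/02/14", "12/02/14",
                      "19/02/14", "21/02/14", "23/02/14", "04/03/14"),
                    format = "%d/%m/%y"),
  Sample = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)
)

# Without a row index, each row is compared with its own dates. The condition
# is TRUE for every row here, so the result is a single grand total.
total <- sum(df$Sample[df$EndDate >= df$StartDate & df$StartDate <= df$EndDate])

# With a row index, the whole columns are compared against row i's dates,
# which is the SUMIFS-style criterion from the question.
i <- 1
row_sum <- sum(df$Sample[df$EndDate >= df$StartDate[i] & df$StartDate <= df$EndDate[i]])

print(total)    # 4856
print(row_sum)  # 697
```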



Answer 4:


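You can use the 'by' function to get the value. In 'by', a data frame is split by row into data frames subsetted by the values of one or more factors, and a function is applied to each subset in turn. Its arguments are by(data, INDICES, FUN), so to get one sum per row, split x by row number and compare the full columns against each one-row subset r:

x$sumifs <- as.numeric(by(x, seq_len(nrow(x)), function(r)
             sum(x$Sample[x$EndDate >= r$StartDate & x$StartDate <= r$EndDate])))

More details about the function can be found in its help page (?by).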
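
To illustrate how 'by' splits a data frame on a factor and applies a function to each piece, here is a minimal sketch that totals Sample per start month. It is only an illustration: df is rebuilt from the question's table, and the grouping by month and the names month and monthly are assumptions made for this example, not something the question asks for.

```r
# Rebuild the question's table; dates are assumed to be day/month/two-digit year.
df <- data.frame(
  StartDate = as.Date(c("10/01/14", "12/01/14", "19/01/14", "25/01/14", "29/01/14",
                        "05/02/14", "07/02/14", "09/02/14", "18/02/14"),
                      format = "%d/%m/%y"),
  EndDate = as.Date(c("24/01/14", "26/01/14", "02/02/14", "08/02/14", "12/02/14",
                      "19/02/14", "21/02/14", "23/02/14", "04/03/14"),
                    format = "%d/%m/%y"),
  Sample = c(139L, 136L, 422L, 762L, 899L, 850L, 602L, 180L, 866L)
)

# Grouping key: the month of StartDate as a two-digit string ("01", "02").
month <- format(df$StartDate, "%m")

# by(data, INDICES, FUN): split df by month, then sum Sample within each piece.
monthly <- by(df, month, function(d) sum(d$Sample))

print(monthly)
```

Each piece d passed to the function is itself a data frame holding the rows of one group.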



Source: https://stackoverflow.com/questions/26720897/excel-sumifs-equivalent-in-r
