How to count observations with certain value in a group conditionally?

Posted by 独自空忆成欢 on 2021-02-10 14:33:26

Question


I am working with the following data frame:

Year  Month      Day   X      Y      Color
2018  January    1     4.5    6       Red
2018  January    4     3.2    8.1     Red
2018  January    11    1.1    2.3     Blue
2018  February   7     5.4    2.2     Blue
2018  February   15    1.5    4.4     Red
2019  January    3     8.6    2.3     Red
2019  January    22    1.1    2.5     Blue
2019  January    23    5.5    7.8     Red
2019  February   5     6.9    1.1     Red
2019  February   10    1.8    1.3     Red

I am looking to create a new column that gives, for each year and month, the number of observations where X is greater than Y and Color is 'Red'.

Year  Month      Day   X      Y       Color   XGreaterThanYCount
2018  January    1     4.5    6        Red       0
2018  January    4     3.2    8.1      Red       0
2018  January    11    1.1    2.3      Blue      0
2018  February   7     5.4    2.2      Blue      0
2018  February   15    1.5    4.4      Red       0
2019  January    3     8.6    2.3      Red       1
2019  January    22    1.1    2.5      Blue      1
2019  January    23    5.5    7.8      Red       1
2019  February   5     6.9    1.1      Red       2
2019  February   10    1.8    1.3      Red       2

I posted a similar question a little while ago; I'm re-posting because I had to tweak the question a bit.


Answer 1:


We can group by Year and Month, build the logical expression X > Y & Color == "Red", and take its sum within each group:

library(dplyr)
df1 %>% 
   group_by(Year, Month) %>% 
   mutate(XGreaterThanYCount = sum(X > Y & Color == 'Red')) %>%
   ungroup()

-output

# A tibble: 10 x 7
#    Year Month      Day     X     Y Color XGreaterThanYCount
#   <int> <chr>    <int> <dbl> <dbl> <chr>              <int>
# 1  2018 January      1   4.5   6   Red                    0
# 2  2018 January      4   3.2   8.1 Red                    0
# 3  2018 January     11   1.1   2.3 Blue                   0
# 4  2018 February     7   5.4   2.2 Blue                   0
# 5  2018 February    15   1.5   4.4 Red                    0
# 6  2019 January      3   8.6   2.3 Red                    1
# 7  2019 January     22   1.1   2.5 Blue                   1
# 8  2019 January     23   5.5   7.8 Red                    1
# 9  2019 February     5   6.9   1.1 Red                    2
#10  2019 February    10   1.8   1.3 Red                    2
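
The count works because sum() treats TRUE as 1 and FALSE as 0. A minimal sketch of that step, using only the two 2019 February rows from the sample data (the vector names x, y, color, and hit are illustrative, not part of the original answer):

```r
# Two rows for 2019 February, copied from the sample data
x <- c(6.9, 1.8)
y <- c(1.1, 1.3)
color <- c("Red", "Red")

# Element-wise test: TRUE where X > Y and the color is "Red"
hit <- x > y & color == "Red"
hit
# [1] TRUE TRUE

# sum() coerces the logical vector to integers, so this counts matching rows
sum(hit)
# [1] 2
```

Inside mutate() the same sum is computed once per group and recycled to every row of that group, which is why both 2019 February rows show 2.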

Or, using base R with ave:

df1$XGreaterThanYCount <-  with(df1, ave(X > Y & Color == "Red", 
             Year, Month, FUN = sum))
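
ave() returns a vector the same length as its first argument, with each element replaced by the result of FUN for that element's group, so the result can be assigned directly as a column. A small sketch with made-up values (flag, grp, and counts are hypothetical names used only for illustration):

```r
# A logical flag per row and the group each row belongs to
flag <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
grp  <- c("a", "a", "a", "b", "b")

# Sum the flags within each group and spread the result back over the rows
counts <- ave(flag, grp, FUN = sum)
counts
# [1] 2 2 2 1 1
```

Passing several grouping vectors, as with Year and Month above, groups by their combination.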

data

df1 <- structure(list(Year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2019L, 
2019L, 2019L, 2019L, 2019L), Month = c("January", "January", 
"January", "February", "February", "January", "January", "January", 
"February", "February"), Day = c(1L, 4L, 11L, 7L, 15L, 3L, 22L, 
23L, 5L, 10L), X = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5, 
6.9, 1.8), Y = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3
), Color = c("Red", "Red", "Blue", "Blue", "Red", "Red", "Blue", 
"Red", "Red", "Red")), class = "data.frame", row.names = c(NA, 
-10L))


Source: https://stackoverflow.com/questions/65383809/how-to-count-observations-with-certain-value-in-a-group-conditionally
