Question
I am working with the following data frame:
Year  Month     Day  X    Y    Color
2018  January   1    4.5  6    Red
2018  January   4    3.2  8.1  Red
2018  January   11   1.1  2.3  Blue
2018  February  7    5.4  2.2  Blue
2018  February  15   1.5  4.4  Red
2019  January   3    8.6  2.3  Red
2019  January   22   1.1  2.5  Blue
2019  January   23   5.5  7.8  Red
2019  February  5    6.9  1.1  Red
2019  February  10   1.8  1.3  Red
I would like to create a new column that gives, for each month of each year, the number of observations where X is greater than Y and Color is 'Red':
Year  Month     Day  X    Y    Color  XGreaterThanYCount
2018  January   1    4.5  6    Red    0
2018  January   4    3.2  8.1  Red    0
2018  January   11   1.1  2.3  Blue   0
2018  February  7    5.4  2.2  Blue   0
2018  February  15   1.5  4.4  Red    0
2019  January   3    8.6  2.3  Red    1
2019  January   22   1.1  2.5  Blue   1
2019  January   23   5.5  7.8  Red    1
2019  February  5    6.9  1.1  Red    2
2019  February  10   1.8  1.3  Red    2
I posted a similar question a little while ago; I'm re-posting because I had to tweak the question a bit.
Answer 1:
We can build a logical expression, X > Y & Color == "Red", within each group and take its sum; each TRUE counts as 1, so the sum is the number of matching rows:
library(dplyr)

df1 %>%
  # one group per Year/Month combination
  group_by(Year, Month) %>%
  # count rows in the group where both conditions hold
  mutate(XGreaterThanYCount = sum(X > Y & Color == 'Red')) %>%
  ungroup()
-output
# A tibble: 10 x 7
# Year Month Day X Y Color XGreaterThanYCount
# <int> <chr> <int> <dbl> <dbl> <chr> <int>
# 1 2018 January 1 4.5 6 Red 0
# 2 2018 January 4 3.2 8.1 Red 0
# 3 2018 January 11 1.1 2.3 Blue 0
# 4 2018 February 7 5.4 2.2 Blue 0
# 5 2018 February 15 1.5 4.4 Red 0
# 6 2019 January 3 8.6 2.3 Red 1
# 7 2019 January 22 1.1 2.5 Blue 1
# 8 2019 January 23 5.5 7.8 Red 1
# 9 2019 February 5 6.9 1.1 Red 2
#10 2019 February 10 1.8 1.3 Red 2
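As a quick illustration of why the sum works, here is a minimal base R sketch using the two 2019 February rows from the data. The variable names `hit` and `n_hit` are made up for this example and are not part of the answer's code.

```r
# Values copied from the 2019 February rows of df1
X <- c(6.9, 1.8)
Y <- c(1.1, 1.3)
Color <- c("Red", "Red")

# Element-wise logical vector: TRUE where both conditions hold
hit <- X > Y & Color == "Red"

# sum() treats TRUE as 1 and FALSE as 0, so it counts the matching rows
n_hit <- sum(hit)
print(n_hit)  # prints [1] 2
```

This matches the value 2 shown for the 2019 February rows in the output above.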
Or using base R with ave:
df1$XGreaterThanYCount <- with(df1, ave(X > Y & Color == "Red",
                                        Year, Month, FUN = sum))
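The data in the question has no missing values, but if X, Y or Color could contain NA, the comparison yields NA and a plain sum would return NA for the whole group. The sketch below shows one possible way to handle that with `na.rm = TRUE`. The `toy` data frame and the choice to ignore missing rows are assumptions made for illustration, not part of the original post.

```r
# Hypothetical toy data with one missing X value (not from the original post)
toy <- data.frame(
  Year  = c(2019L, 2019L, 2019L),
  Month = c("January", "January", "January"),
  X     = c(8.6, NA, 5.5),
  Y     = c(2.3, 2.5, 7.8),
  Color = c("Red", "Red", "Red")
)

# Logical condition per row: TRUE, NA, FALSE
cond <- with(toy, X > Y & Color == "Red")

# A plain sum propagates the NA
plain_total <- sum(cond)

# Assumption: rows with a missing comparison should simply be ignored
toy$XGreaterThanYCount <- ave(cond, toy$Year, toy$Month,
                              FUN = function(z) sum(z, na.rm = TRUE))
print(toy)
```

Whether ignoring missing rows is appropriate depends on the data; leaving the NA in place may be preferable if a missing value should flag the group as incomplete.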
data
df1 <- structure(list(Year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2019L,
2019L, 2019L, 2019L, 2019L), Month = c("January", "January",
"January", "February", "February", "January", "January", "January",
"February", "February"), Day = c(1L, 4L, 11L, 7L, 15L, 3L, 22L,
23L, 5L, 10L), X = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5,
6.9, 1.8), Y = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3
), Color = c("Red", "Red", "Blue", "Blue", "Red", "Red", "Blue",
"Red", "Red", "Red")), class = "data.frame", row.names = c(NA,
-10L))
Source: https://stackoverflow.com/questions/65383809/how-to-count-observations-with-certain-value-in-a-group-conditionally