Question
I asked a similar question here, but this one is slightly different.
I would like to return values from one column that match conditions in another column, based on cut score criteria. If a cut score is not present in the variable, I would like to take the closest larger value for the first and second cuts, and the closest smaller value for the third cut. Here is a snapshot of the dataset:
ids <- c(1,2,3,4,5,6,7,8,9,10)
scores.a <- c(512,531,541,555,562,565,570,572,573,588)
scores.b <- c(12,13,14,15,16,17,18,19,20,21)
data <- data.frame(ids, scores.a, scores.b)
> data
   ids scores.a scores.b
1    1      512       12
2    2      531       13
3    3      541       14
4    4      555       15
5    5      562       16
6    6      565       17
7    7      570       18
8    8      572       19
9    9      573       20
10  10      588       21
cuts <- c(531, 560, 571)
For the first cut score (531), I would like to get the corresponding scores.b value, which is 13. The second cut score (560) is not in scores.a, so I would like to use the closest larger scores.a value (562), whose corresponding scores.b value is 16. Lastly, for the third cut score (571), I would like to get 18, which corresponds to the closest smaller scores.a value (570).
Here is what I would like to get.
      scores.b
cut.1       13
cut.2       16
cut.3       18
Any thoughts? Thanks
Answer 1:
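library(dplyr)  # provides %>%, mutate(), group_by(), summarise(), slice()

data %>%
  # bin scores.a into four intervals split at the cut scores
  mutate(cts = Hmisc::cut2(scores.a, cuts = cuts)) %>%
  group_by(cts) %>%
  summarise(mn = min(scores.b),
            mx = max(scores.b)) %>%
  # keep the two middle bins, flatten, then pick by position:
  # min of bin 2, min of bin 3, max of bin 3
  slice(-c(1, 4)) %>% unlist() %>% .[c(3, 4, 6)] %>%
  data.frame() %>%
  magrittr::set_colnames("scores.b") %>%
  magrittr::set_rownames(c("cut.1", "cut.2", "cut.3"))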
      scores.b
cut.1       13
cut.2       16
cut.3       18
Answer 2:
Using tidyverse:
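library(dplyr)
data %>%
  # left-closed bins; the third break is 570 (not 571) so that bin starts at the closest smaller score
  mutate(cuts_new = cut(scores.a, breaks = c(531, 560, 570, 1000), right = FALSE)) %>%
  group_by(cuts_new) %>%
  summarise(first_sb = first(scores.b)) %>%
  ungroup()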
results in:
# A tibble: 4 x 2
  cuts_new    first_sb
  <fct>          <dbl>
1 [531,560)         13
2 [560,570)         16
3 [570,1e+03)       18
4 NA                12
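Both answers above depend on hard-coded row positions or hand-adjusted breaks. As a rough alternative, here is a minimal base R sketch that applies the rule from the question directly: for each cut, look up the closest scores.a value at or above the cut ("up") or at or below it ("down"). The helper name match_cut and the directions vector are hypothetical, introduced only for illustration; they are not part of either answer.

```r
# Toy data from the question
ids <- c(1,2,3,4,5,6,7,8,9,10)
scores.a <- c(512,531,541,555,562,565,570,572,573,588)
scores.b <- c(12,13,14,15,16,17,18,19,20,21)
data <- data.frame(ids, scores.a, scores.b)
cuts <- c(531, 560, 571)

# Hypothetical helper: return the b value paired with the closest a value
# that is >= cut (direction "up") or <= cut (direction "down").
# An exact match counts as closest in either direction.
match_cut <- function(cut, direction, a, b) {
  candidates <- if (direction == "up") a[a >= cut] else a[a <= cut]
  if (length(candidates) == 0) return(NA_real_)  # no value on that side
  target <- if (direction == "up") min(candidates) else max(candidates)
  b[match(target, a)]
}

# Assumption: the search direction for each cut is supplied by the user
directions <- c("up", "up", "down")

result <- data.frame(
  scores.b = mapply(match_cut, cuts, directions,
                    MoreArgs = list(a = data$scores.a, b = data$scores.b)),
  row.names = paste0("cut.", seq_along(cuts))
)
result
```

On the sample data this should give 13, 16 and 18 for cut.1 to cut.3. If scores.a contains ties, match() picks the first matching row, which may or may not be what you want.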
Source: https://stackoverflow.com/questions/59882916/subset-values-with-matching-criteria-in-r