Create group names for consecutive values

Submitted by 元气小坏坏 on 2019-11-26 03:45:05

Question


This looks like an easy task, but I can't figure out a simpler way. I have the vector x below and need to create a group name for each run of consecutive identical values. My attempt uses rle; are there better ideas?

# data
x <- c(1,1,1,2,2,2,3,2,2,1,1)

# make groups
rep(paste0("Group_", 1:length(rle(x)$lengths)), rle(x)$lengths)
# [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4"
# [9] "Group_4" "Group_5" "Group_5"

Answer 1:


Using diff and cumsum:

paste0("Group_", cumsum(c(1, diff(x) != 0)))
#[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"

(If your values are floating point, an exact != comparison can split runs that should count as equal, so you might have to compare the differences against a tolerance instead.)
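The tolerance idea mentioned above could look roughly like the following sketch. The vector x_num and the tolerance tol are illustrative assumptions, not part of the original answer; pick a tolerance that matches the scale of your own data.

```r
# Floating-point values: 0.1 + 0.2 is not exactly equal to 0.3 in R
x_num <- c(0.1 + 0.2, 0.3, 0.3, 1.5, 1.5)

# Hypothetical tolerance (an assumption for this sketch)
tol <- 1e-8

# A new group starts wherever the jump from the previous value exceeds tol
groups_tol <- paste0("Group_", cumsum(c(1, abs(diff(x_num)) > tol)))
groups_tol
# [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2"

# For comparison, the exact test splits the first two values apart
groups_exact <- paste0("Group_", cumsum(c(1, diff(x_num) != 0)))
groups_exact
# [1] "Group_1" "Group_2" "Group_2" "Group_3" "Group_3"
```

Using abs() keeps the check symmetric, so both upward and downward jumps larger than tol start a new group.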




Answer 2:


Using rleid from data.table:

library(data.table)

paste0('Group_', rleid(x))
#[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"



Answer 3:


Using cumsum but not relying on the data being numeric:

paste0("Group_", 1 + c(0, cumsum(x[-length(x)] != x[-1])))


#[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"
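To illustrate the "not relying on the data being numeric" point, here is a small sketch of the same neighbour comparison applied to a character vector. The vector x_chr and the helper name is_start are made up for this example.

```r
# Hypothetical character data; diff() would fail on this, but != works
x_chr <- c("a", "a", "b", "b", "a", "c", "c")

# TRUE at every position where a new run starts (the first element always does)
is_start <- c(TRUE, x_chr[-length(x_chr)] != x_chr[-1])

# The running count of run starts gives the group number
groups_chr <- paste0("Group_", cumsum(is_start))
groups_chr
# [1] "Group_1" "Group_1" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4"
```

Note that the second run of "a" gets its own group (Group_3), because groups are defined by consecutive position rather than by value.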



Answer 4:


group() from groupdata2 can create groups from a list of group starting points, using the l_starts method. With n = "auto", it finds the group starts automatically:

x <- c(1,1,1,2,2,2,3,2,2,1,1)
groupdata2::group(x, n = "auto", method = "l_starts")

## # A tibble: 11 x 2
## # Groups:   .groups [5]
##     data .groups
##    <dbl> <fct>  
##  1     1 1      
##  2     1 1      
##  3     1 1      
##  4     2 2      
##  5     2 2      
##  6     2 2      
##  7     3 3      
##  8     2 4      
##  9     2 4      
## 10     1 5      
## 11     1 5     

There's also the differs_from_previous() function which finds values, or indices of values, that differ from the previous value by some threshold(s).

# differs_from_previous() is exported by groupdata2
library(groupdata2)

# The values to start groups at
differs_from_previous(x, threshold = 1,
                      direction = "both")
## [1] 2 3 2 1

# The indices to start groups at
differs_from_previous(x, threshold = 1,
                      direction = "both",
                      return_index = TRUE)
## [1] 4 7 8 10
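If you only have the start indices, one way to turn them back into group names is base R's findInterval(). This is a sketch rather than part of the original answer; the starts vector is simply copied from the output above, and the variable names are illustrative.

```r
x <- c(1, 1, 1, 2, 2, 2, 3, 2, 2, 1, 1)

# Start indices of every group after the first (copied from the output above)
starts <- c(4, 7, 8, 10)

# findInterval() counts how many start indices are <= each position;
# adding 1 makes the first run Group_1
groups_idx <- paste0("Group_", findInterval(seq_along(x), starts) + 1)
groups_idx
# [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"
```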


Source: https://stackoverflow.com/questions/37809094/create-group-names-for-consecutive-values
