Add ordered ID for each group by date

Submitted by 早过忘川 on 2021-01-29 07:56:47

Question


I want to add an ordered ID (by date) within each group of a data frame. I can do this using dplyr (see "R - add column that counts sequentially within groups but repeats for duplicates"):

# Example data
date <- rep(c("2016-10-06 11:56:00","2016-10-05 11:56:00","2016-10-05 11:56:00","2016-10-07 11:56:00"),2)
date <- as.POSIXct(date)
group <- c(rep("A",4), rep("B",4))    
df <- data.frame(group, date)

# dplyr - dense_rank
library(dplyr)
df2 <- df %>% group_by(group) %>% 
       mutate(m.test=dense_rank(date))

   group                date m.test
  <fctr>              <dttm>  <int>
1      A 2016-10-06 11:56:00      2
2      A 2016-10-05 11:56:00      1
3      A 2016-10-05 11:56:00      1
4      A 2016-10-07 11:56:00      3
5      B 2016-10-06 11:56:00      2
6      B 2016-10-05 11:56:00      1
7      B 2016-10-05 11:56:00      1
8      B 2016-10-07 11:56:00      3

So my new column m.test ranks the dates within each group. If I use rleid from data.table instead, it doesn't seem to work (2016-10-05 is ranked after 2016-10-06):

library(data.table)
df3 <- setDT(df)[, m.test := rleid(date), by = group]

   group                date m.test
1:     A 2016-10-06 11:56:00      1
2:     A 2016-10-05 11:56:00      2
3:     A 2016-10-05 11:56:00      2
4:     A 2016-10-07 11:56:00      3
5:     B 2016-10-06 11:56:00      1
6:     B 2016-10-05 11:56:00      2
7:     B 2016-10-05 11:56:00      2
8:     B 2016-10-07 11:56:00      3

Am I getting the syntax wrong?


Answer 1:


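The syntax is fine, but rleid() is the wrong tool: it numbers runs of consecutive identical values in row order and does not rank by value. Thanks to @docendo discimus, the correct way to do this with data.table is frank(..., ties.method = "dense"):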

df4 <- setDT(df)[, m.test := frank(date, ties.method = "dense"), by = group]
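
To see the difference side by side, here is a minimal sketch that rebuilds the question's example data and computes both columns. The column names run_id and dense_id and the tz = "UTC" setting are illustrative choices, not part of the original post.

```r
library(data.table)

# Same example data as in the question; tz = "UTC" is an assumption
# made only so the timestamps parse the same way on every machine.
date <- as.POSIXct(rep(c("2016-10-06 11:56:00", "2016-10-05 11:56:00",
                         "2016-10-05 11:56:00", "2016-10-07 11:56:00"), 2),
                   tz = "UTC")
dt <- data.table(group = rep(c("A", "B"), each = 4), date = date)

# rleid() numbers runs of consecutive identical values in row order,
# so the first row of each group always gets 1, whatever its date.
dt[, run_id := rleid(date), by = group]

# frank(ties.method = "dense") ranks by value within each group:
# tied dates share a rank and no ranks are skipped.
dt[, dense_id := frank(date, ties.method = "dense"), by = group]

print(dt)
```

Within each group, run_id comes out as 1, 2, 2, 3 (row order), while dense_id comes out as 2, 1, 1, 3, matching the dplyr dense_rank() result shown in the question.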


Source: https://stackoverflow.com/questions/40588901/add-ordered-id-for-each-group-by-date
