Determine when columns of a data.frame change value and return indices of the change

半城伤御伤魂 提交于 2019-12-03 02:45:53

In data.table version 1.8.10 (stable version in CRAN), there's a(n) (unexported) function called duplist that does exactly this. And it's also written in C and is therefore terribly fast.

require(data.table) # 1.8.10
data.table:::duplist(x[, 3:5]) 
# [1] 1 4 5

If you're using the development version of data.table (1.8.11), then there's a more efficient version (in terms of memory) renamed as uniqlist, that does exactly the same job. Probably this should be exported for next release. Seems to have come up on SO more than once. Let's see.

require(data.table) # 1.8.11
data.table:::uniqlist(x[, 3:5])
# [1] 1 4 5

Totally unreadable, but:

c(1,which(rowSums(sapply(x[,grep('val',names(x))],diff))!=0)+1)
# [1] 1 4 5

Basically, run diff on each row, to find all the changes. If a change occurs in any column, then a change has occurred in the row.

Also, without the sapply:

c(1,which(rowSums(diff(as.matrix(x[,grep('val',names(x))])))!=0)+1)
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!