How do I select a subset of rows after group by a specific column in R Data table [duplicate]

别等时光非礼了梦想. 提交于 2020-01-11 10:28:48

问题


I want to select a subset of rows based upon a condition after grouping by a specific column in R data table . Take Mtcars data for an example .

dt_mtcars <- as.data.table(mtcars)


dt_mtcars[,.N,by=.(hp)]

    hp  N
 1: 110 3
 2:  93 1
 3: 175 3
 4: 105 1
 5: 245 2
 6:  62 1
 7:  95 1
 8: 123 2
 9: 180 3
10: 205 1
11: 215 1
12: 230 1
13:  66 2
14:  52 1
15:  65 1
16:  97 1
17: 150 2
18:  91 1
19: 113 1
20: 264 1
21: 335 1
22: 109 1
     hp N

Now what I am trying to achieve is like selecting a sub set of data based on the count for e.g in here I want to have all the data in dt_mtcars where N >1 if I try something like this (which is conceptually what I am trying to achieve) it shows all the rows rather than getting the rows where N is greater than 1 .

dt_mtcars[.N>1,.SD,by=.(hp)]

Please assist .


回答1:


We can do this using if

dt_mtcars[, if(.N>1) .SD ,hp]



回答2:


You can do it in two steps:

dt_mtcars[,Count:=.N, by=.(hp)][Count>1, .SD, .SDcols = rev(rev(1:length(names(dt_mtcars)))[-1])]

EDIT: realized mtcars does not have just those two rows, updated SDcols

EDIT: Fixed to remove count



来源:https://stackoverflow.com/questions/34233417/how-do-i-select-a-subset-of-rows-after-group-by-a-specific-column-in-r-data-tabl

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!