grouping table by multiple factors and spreading it from long format to wide - the data.table way in R

Posted by 感情迁移 on 2020-03-25 13:41:36

Question


As an example, I will be using the mtcars data set available in R:

library(data.table)
data(mtcars)
setDT(mtcars)  # convert to a data.table by reference

Let's say I want to group the data by three variables, namely carb, cyl, and gear. I have done this as follows. However, I am sure there is a better way, as this is quite repetitive.

newDTcars <- mtcars [, mtcars[, mtcars[, .N , by = carb], by = cyl], by= gear]

Secondly, I would like to have the data in wide format, with a separate column for every gear level. For illustration purposes I have done this using tidyr, but I would like to do it the "data.table" way.

library(magrittr)  # provides %>%
newDTcars %>% tidyr::spread(gear, N)

The emphasis of this question is on keeping the solution within the data.table world, as I would like to learn more about data.table.


Answer 1:


In data.table, we can group by several columns at once by passing them to by as .(carb, cyl, gear), and we can reshape the result from long to wide with dcast.

library(data.table)
dcast(mtcars[, .N, .(carb, cyl, gear)], carb+cyl~gear, value.var = "N")

#   carb cyl  3  4  5
#1:    1   4  1  4 NA
#2:    1   6  2 NA NA
#3:    2   4 NA  4  2
#4:    2   8  4 NA NA
#5:    3   8  3 NA NA
#6:    4   6 NA  4 NA
#7:    4   8  5 NA  1
#8:    6   6 NA NA  1
#9:    8   8 NA NA  1
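The one-liner above can also be read as two separate steps, which may make the intermediate long-format table easier to inspect. This is only a sketch: the names dt, counts_long and counts_wide are illustrative and do not come from the original answer, and as.data.table() is used here so the built-in mtcars is left untouched. Note that in the nested call from the question the inner mtcars refers to the whole table rather than to the current group, so a single by = .(carb, cyl, gear) is needed to count per combination.

```r
library(data.table)

# Work on a copy so the built-in data set is not modified (illustrative name)
dt <- as.data.table(mtcars)

# Step 1: one row per observed (carb, cyl, gear) combination, with its count N
counts_long <- dt[, .N, by = .(carb, cyl, gear)]

# Step 2: spread gear into columns; carb and cyl stay as row identifiers
counts_wide <- dcast(counts_long, carb + cyl ~ gear, value.var = "N")

print(counts_wide)
```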

You can use the fill argument of dcast to replace the NAs with 0 or any other value.
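As a minimal sketch of the fill argument, the call below fills combinations that never occur with 0 instead of NA. The names dt, counts_long and counts_wide0 are illustrative, and 0L is used as an assumption that keeping the columns integer, like N, is preferable.

```r
library(data.table)

dt <- as.data.table(mtcars)
counts_long <- dt[, .N, by = .(carb, cyl, gear)]

# fill = 0L puts 0 in cells for (carb, cyl, gear) combinations absent from the data
counts_wide0 <- dcast(counts_long, carb + cyl ~ gear, value.var = "N", fill = 0L)

print(counts_wide0)
```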



Source: https://stackoverflow.com/questions/60772372/grouping-table-by-multiple-factors-and-spreading-it-from-long-format-to-wide-t
