Question
I'd like to use data.table to do some wrangling and would like my resulting data table to not include the grouping variable.
Here's an MWE:
library("data.table")
DT <- data.table(x = 1:10, grp = rep(1:2,5))
DT[, .(mmm = mean(x)), by = grp]
This produces:
   grp mmm
1:   1   5
2:   2   6
which is all fine. However, I'd prefer grp not to be here. This can be fixed by chaining the data.table calls and setting grp := NULL, or just throwing the variable away, but can I prevent it in the first call so I only return mmm?
Answer 1:
It isn't clear why you don't want to use chaining. Using
DT[, .(mmm = mean(x)), by = grp][, grp := NULL][]
would be my first choice.
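As a minimal sketch of that chained form, using the DT from the question: the first call aggregates by grp, the second drops grp by reference, and the trailing [] is there so the result prints at the console, since := returns its value invisibly. The name res is introduced here only for illustration.

```r
library(data.table)

# Same example data as in the question
DT <- data.table(x = 1:10, grp = rep(1:2, 5))

# Aggregate by group, then drop the grouping column by reference.
# The trailing [] ensures the result prints when run at the console,
# because := returns invisibly.
res <- DT[, .(mmm = mean(x)), by = grp][, grp := NULL][]

names(res)  # column names of the result
res$mmm     # the group means as a plain vector
```

Note that grp is removed from the aggregated result only; the original DT still has its grp column.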
Although I won't advise it, you can also use:
DT[, .(mmm = DT[, .(mmm = mean(x)), by = grp]$mmm)]
which will give you the desired result as well:
   mmm
1:   5
2:   6
Although you will get the same result, it is better not to use this method. The major drawback is that it makes your code unnecessarily complicated when you want to summarise more than one value column. You would then get something like:
DT[, .(mx = DT[, .(mx = mean(x)), by = grp]$mx, my = DT[, .(my = mean(y)), by = grp]$my)]
while using the normal data.table-way would be:
DT[, .(mx = mean(x), my = mean(y)), by = grp][, grp := NULL][]
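The two-column comparison above refers to a column y that the question's DT does not have. A runnable sketch, assuming a hypothetical y column added purely for illustration (the names nested and chained are also made up here), suggests that both forms return the same values, with the nested one needing one extra grouped call per summarised column:

```r
library(data.table)

# y is a hypothetical second value column, not part of the original MWE
DT <- data.table(x = 1:10, y = 11:20, grp = rep(1:2, 5))

# Nested form: one inner grouped call per summarised column
nested <- DT[, .(mx = DT[, .(mx = mean(x)), by = grp]$mx,
                 my = DT[, .(my = mean(y)), by = grp]$my)]

# Chained form: a single grouped call, then drop grp by reference
chained <- DT[, .(mx = mean(x), my = mean(y)), by = grp][, grp := NULL][]

# Compare the two results column by column
all.equal(nested$mx, chained$mx)
all.equal(nested$my, chained$my)
```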
To conclude:
Using the DT[, .(mmm = mean(x)), by = grp][, grp := NULL][]
method would thus be your best choice.
Source: https://stackoverflow.com/questions/47497386/remove-grouping-variable-for-data-table