Question
I'm working with a data.table that has columns X and Y, and I want to create a new column Z holding the number of records that share the same (X, Y) value.
I know how to do this with a data.frame using plyr:
ddply(df,.(X,Y),nrow)
I tried several variants I found on this forum, but none of them worked:
dt[, Z := lapply(.SD,nrow), by="X,Y"] # or
dt[, `:=`(Z = lapply(.SD,nrow)), by="X,Y"]
To be precise, X and Y are numeric.
Answer 1:
Starting from
library(data.table)
dt <- data.table(X = c(1, 1, 2), Y = c(1, 1, 2))
The appropriate syntax is
dt[, Z := .N, by = c("X","Y")]
or
dt[, Z := .N, by = .(X,Y)]
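For illustration, here is a small sketch that runs the grouped assignment on the example table above and checks the result. `.N` is data.table's built-in symbol for the number of rows in the current group, so no `lapply` over `.SD` is needed (and `nrow` applied to a plain column vector returns `NULL`, which is one reason the attempts in the question cannot work). The `counts` table is a hypothetical name added here; it shows the one-row-per-group form, which is closer to what the `ddply` call returns.

```r
library(data.table)

dt <- data.table(X = c(1, 1, 2), Y = c(1, 1, 2))

# .N is the row count of the current (X, Y) group; := adds the column by reference
dt[, Z := .N, by = .(X, Y)]
print(dt)
# Z is 2 for both (1, 1) rows and 1 for the single (2, 2) row

# To get one row per group instead of a new column, aggregate without :=
counts <- dt[, .(Z = .N), by = .(X, Y)]
print(counts)
```

Because `:=` modifies `dt` in place, there is no need to assign the result back to `dt`.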
Source: https://stackoverflow.com/questions/46134936/create-a-new-column-in-a-data-table-from-group-by-multiple-columns