data.table | 易学教程

How to sort a data.table using vector of multiple columns

Question: I am pretty new to R and am trying to build a function to compare two data sets. To do that, I need to sort a data.table on multiple columns. I am sure there is help somewhere, but I am not sure how to search for it. This is my approach so far:

    DT = data.table(x=rep(c("b","a","c"),each=3), y=c(1,3,6), v=1:9)
    # column vector
    keycol <- c("x","y")
    DT[order(keycol)]
       x y v
    1: b 1 1
    2: b 3 2

Somehow it displays just 2 rows and drops the other records. But if I do this:

    > DT[order(x,y)]
       x y v
    ...
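
The two-row result above has a simple cause: `order(keycol)` orders the two strings `"x"` and `"y"` themselves and returns `c(1, 2)`, so `DT[order(keycol)]` selects rows 1 and 2. A minimal sketch of one way around this, assuming the goal is an ascending sort on every column named in `keycol`, is to look the columns up by name and pass them to `order()` through `do.call()`. The names `sort_idx` and `DT_sorted` are introduced here only for illustration.

```r
library(data.table)

DT <- data.table(x = rep(c("b", "a", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
keycol <- c("x", "y")

# order() applied to the character vector sorts the strings "x" and "y",
# not the columns they name, so it returns c(1, 2).
order(keycol)

# Extract each named column from DT and pass them all to order().
sort_idx <- do.call(order, lapply(keycol, function(col) DT[[col]]))
DT_sorted <- DT[sort_idx]
DT_sorted
```

This returns a sorted copy and leaves `DT` itself unchanged.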

Order data.table by a character vector of column names

Question: I'd like to order a data.table by a variable holding the name of a column. I've tried every combination of eval, get and c without success. I have

    colVar = "someColumnName"

and I'd like to apply this to:

    DT[order(colVar)]

Answer 1: You can use double brackets for data tables:

    library(data.table)
    dtbl <- data.table(x = 1:5, y = 5:1)
    colVar = "y"
    dtbl_sorted <- dtbl[order(dtbl[[colVar]])]
    dtbl_sorted

Answer 2: data.table has special functions for that matter which will modify your data set by reference ...
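
The second answer is cut off before it names the functions it has in mind. Assuming it refers to `setorderv()`, which data.table provides for reordering by a character vector of column names, a minimal sketch on the same example table might look like this. The variable `desc` is introduced here only for illustration.

```r
library(data.table)

dtbl <- data.table(x = 1:5, y = 5:1)
colVar <- "y"

# setorderv() takes column names as a character vector and reorders
# dtbl in place, so no assignment is needed.
setorderv(dtbl, colVar)
dtbl

# order = -1L sorts in descending order; copy() keeps dtbl untouched.
desc <- setorderv(copy(dtbl), colVar, order = -1L)
desc
```

Because the reordering happens by reference, any other variable pointing at the same table sees the new row order as well; `copy()` avoids that when it is not wanted.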

data.table | faster row-wise recursive update within group

Question: I have to do the following recursive row-by-row operation to obtain z:

    myfun = function (xb, a, b) {
      z = NULL
      for (t in 1:length(xb)) {
        if (t >= 2) { a[t] = b[t-1] + xb[t] }
        z[t] = rnorm(1, mean = a[t])
        b[t] = a[t] + z[t]
      }
      return(z)
    }

    set.seed(1)
    n_smpl = 1e6
    ni = 5
    id = rep(1:n_smpl, each = ni)
    smpl = data.table(id)
    smpl[, time := 1:.N, by = id]
    a_init = 1; b_init = 1
    smpl[, ':=' (a = a_init, b = b_init)]
    smpl[, xb := (1:.N)*id, by = id]
    smpl[, z := myfun(xb, a, b), by = id]

I would like ...
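
The excerpt ends before the asker states the goal, so the following is only a guess at what "faster" could mean here. One possible sketch loops over the few time steps and updates all groups at once, instead of calling a function once per group. It assumes every id has the same number of rows, that rows are sorted by id and then time, and that pre-drawing the standard-normal shocks (`eps`, a name introduced here) is acceptable. Since `a + eps` is distributed as N(a, 1), the model is the same, but the draws are consumed in a different order than in `myfun`, so results for a given seed will not match it.

```r
library(data.table)

set.seed(1)
n_smpl <- 1000L  # far smaller than the question's 1e6, for illustration only
ni <- 5L
smpl <- data.table(id = rep(seq_len(n_smpl), each = ni))
smpl[, time := seq_len(.N), by = id]
smpl[, xb := seq_len(.N) * id, by = id]

a_init <- 1
eps <- rnorm(nrow(smpl))      # one shock per row; z = a + eps ~ N(a, 1)

z_vec <- numeric(nrow(smpl))
a <- rep(a_init, n_smpl)      # current a[t], one entry per id
b <- NULL                     # previous b[t-1], set at the end of step 1
for (t in seq_len(ni)) {
  idx <- which(smpl$time == t)          # one row per id, in id order
  if (t >= 2L) a <- b + smpl$xb[idx]    # a[t] = b[t-1] + xb[t]
  z_vec[idx] <- a + eps[idx]            # z[t] ~ N(a[t], 1)
  b <- a + z_vec[idx]                   # b[t] = a[t] + z[t]
}
smpl[, z := z_vec]
```

The loop runs `ni` times regardless of how many ids there are, which is where any speed-up would come from; whether it helps in practice depends on the real data.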

Use a character vector in the `by` argument

Question: Within the data.table package in R, is there a way to use a character vector in the by argument of a calculation? Here is an example of the desired output, using mtcars:

    mtcars <- data.table(mtcars)
    ColSelect <- 'cyl' # One Column Option
    mtcars[,.( AveMpg = mean(mpg)), by = .(ColSelect)] # Doesn't work

    # Desired Output
       cyl   AveMpg
    1:   6 19.74286
    2:   4 26.66364
    3:   8 15.10000

I know that this is possible to use assigning column names in j by ...
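
Wrapping the variable in `.()` does not look up the column whose name it holds. As a minimal sketch, passing the character vector directly to `by` lets data.table treat its elements as column names. The names `mt` and `res` are introduced here to avoid overwriting the built-in `mtcars`.

```r
library(data.table)

mt <- as.data.table(mtcars)
ColSelect <- "cyl"

# A character vector given directly to by is read as column names.
res <- mt[, .(AveMpg = mean(mpg)), by = ColSelect]
res

# The same form works with several grouping columns.
mt[, .(AveMpg = mean(mpg)), by = c("cyl", "gear")]
```

Groups appear in order of first occurrence in the data, which matches the 6, 4, 8 ordering in the desired output.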

Shorten nested ifelse

Question: Given the following data table, suppose we want to compare x1 consecutively with x2 to x5. The following can be used:

    set.seed(1)
    library(data.table)
    TDT <- data.table(x1 = round(rnorm(100,0.75,0.3),2),
                      x2 = round(rnorm(100,0.75,0.3),2),
                      x3 = round(rnorm(100,0.75,0.3),2),
                      x4 = round(rnorm(100,0.75,0.3),2),
                      x5 = round(rnorm(100,0.75,0.3),2))
    TDT[,compare := ifelse(x1 < x2,1,ifelse(x1 < x3,2,ifelse(x1 < x4,3,ifelse(x1 < x5,4,5))))]

So if x1 < x2, then compare == 1, etc. Now in my ...
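
The excerpt stops before describing the asker's real case, so this is only one possible way to flatten the nesting. A minimal sketch using `fcase()`, which is available in recent data.table releases, lists the conditions in order and returns the value paired with the first one that is true. The column name `compare_fcase` is introduced here so the two results can be checked against each other.

```r
library(data.table)

set.seed(1)
TDT <- data.table(x1 = round(rnorm(100, 0.75, 0.3), 2),
                  x2 = round(rnorm(100, 0.75, 0.3), 2),
                  x3 = round(rnorm(100, 0.75, 0.3), 2),
                  x4 = round(rnorm(100, 0.75, 0.3), 2),
                  x5 = round(rnorm(100, 0.75, 0.3), 2))

# Original nested form, kept for comparison.
TDT[, compare := ifelse(x1 < x2, 1,
                 ifelse(x1 < x3, 2,
                 ifelse(x1 < x4, 3,
                 ifelse(x1 < x5, 4, 5))))]

# Flat form: conditions are tested top to bottom, first TRUE wins.
TDT[, compare_fcase := fcase(
  x1 < x2, 1,
  x1 < x3, 2,
  x1 < x4, 3,
  x1 < x5, 4,
  default = 5
)]

# Both columns should agree on this data, which has no missing values.
TDT[, all(compare == compare_fcase)]
```

All return values, including `default`, must share one type, which is why every branch here is a plain numeric.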

R data.table filtering on group size

Question: I am trying to find all the records in my data.table for which there is more than one row with value v in field f. For instance, we can use this data:

    dt <- data.table(f1=c(1,2,3,4,5), f2=c(1,1,2,3,3))

If looking for that property in field f2, we'd get (note the absence of the (3,2) tuple):

       f1 f2
    1:  1  1
    2:  2  1
    3:  4  3
    4:  5  3

My first guess was dt[.N>2,list(.N),by=f2], but that actually keeps entries with .N==1.

    dt[.N>2,list(.N),by=f2]
       f2 N
    1:  1 2
    2:  2 1
    3:  3 2

The other easy guess, dt ...
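
The first guess fails because `.N` in the row-filter position is the total row count of the table (5 here), so the condition is a single `TRUE` and every row is kept before grouping. A minimal sketch of one alternative, assuming "more than one row" means a group size greater than 1, evaluates `.N` per group and collects the matching row numbers with `.I`. The names `keep` and `res` are introduced here only for illustration.

```r
library(data.table)

dt <- data.table(f1 = c(1, 2, 3, 4, 5), f2 = c(1, 1, 2, 3, 3))

# Within each f2 group, .I holds the row numbers in dt; they are kept
# only when the group has more than one row.
keep <- dt[, .I[.N > 1], by = f2]$V1
res <- dt[keep]
res

# Same idea in one step; here the grouping column comes first in the result.
dt[, if (.N > 1) .SD, by = f2]
```

The `.I` form preserves the original column order, which matches the layout of the expected output in the question.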