data.table | 易学教程

Should I use mget(), .. or with=FALSE to select columns of a data.table?

Question: There are multiple ways to select columns of a data.table by using a variable holding the desired column names (with=FALSE, .., mget, ...). Is there a consensus on which to use, and when? Is one more data.table-y than the others? I could come up with the following arguments: with=FALSE and .. are almost equally fast, while mget is slower; .. can't select concatenated column names "on the fly" (EDIT: current CRAN version 1.12.8 definitely can, I was using an old version, which could not, so this
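As a minimal sketch of the three approaches named in the question, the snippet below selects the same two columns each way. The table dt and the vector cols are made-up illustrations, not data from the original post.

```r
library(data.table)

# Hypothetical example data, not from the original question
dt <- data.table(a = 1:3, b = 4:6, c = 7:9)
cols <- c("a", "b")

by_with <- dt[, cols, with = FALSE]  # treat cols as a character vector of column names
by_dots <- dt[, ..cols]              # the .. prefix looks cols up in the calling scope
by_mget <- dt[, mget(cols)]          # mget() returns a named list of the columns
```

All three results should be data.tables holding columns a and b; which form reads best is largely a matter of taste and data.table version.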

remove duplicates and collapse near duplicates based on time difference

Question: I have a data frame as shown below: DF = structure(list(Age_visit = c(48, 48, 48, 49, 49, 77), Date_1 = c("8/6/2169 9:40", "8/6/2169 9:40", "8/6/2169 9:41", "8/6/2169 9:42", "24/7/2169 8:31", "12/9/2169 10:30", "19/6/2237 12:15"), Date_2 = c("NA-NA-NA NA:NA:NA", "NA-NA-NA NA:NA:NA", "NA-NA-NA NA:NA:NA", "NA-NA-NA NA:NA:NA", "NA-NA-NA NA:NA:NA", "NA-NA-NA NA:NA:NA", "NA-NA-NA NA:NA:NA"), person_id = c("21", "21", "21", "21", "21", "21", "31" ), enc_id = c("A21BC","A21BC", "A22BC", "A23BC",

Create new data.table columns based on other columns

Question: I have a data.table containing some state name abbreviations and county names. I want to get approximate coordinates from ggplot2::map_data('county') for each row. I can do this sequentially with multiple lines of code using :=, but I would like to make only one function call. Below is what I've tried. Data: library(data.table) library(ggplot2) > dput(dt[1:20, .(state, county, prime_mover)]) structure(list(state = c("AZ", "AZ", "CA", "CA", "CA", "CT", "FL", "IN", "MA", "MA", "MA", "MN", "NJ", "NJ"

nested loops through a structured list in R

Question: I have an example dataset, garden, as shown below. The real thing is thousands of rows. I also have an example list, productFruit. I want to know the calories of every fruit, considering the usage reported in garden. I basically want to loop through all the rows in my table, check if the usage is recorded in the productFruit list, and then return either the calories or one of the following error messages: "usage out of scope" if no usage has been found in the productFruit list "fruit out of

Assigning/Referencing a column name in data.table dynamically (in i, j and by)

Question: A) Instead of this (where cars <- data.table(cars)): cars[ , .(`Totals:`=.N), by=speed] I need this: strColumnName <- "Totals:" cars[ , strColumnName = .N, by=speed] How can I do it? B) Similarly (the more general case), instead of this: cars[ dist > 50, .(`Totals:`=.N, x=dist*100), by=speed] I need this: strFactor <- "dist" cars[ strFactor > 50, .(`Totals:`=.N, x=strFactor*100), by=speed] This question is about a GENERAL way of assigning/referencing column name variables in data.table, i.e. in 'j'
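One possible way to get the behaviour asked for in A) and B), offered as a sketch rather than the accepted answer: build a named list in j with setNames() for the output name, and use get() to refer to an input column by its name. The names cars_dt, res_a and res_b are introduced here for illustration only.

```r
library(data.table)

cars_dt <- as.data.table(cars)  # cars ships with R in the datasets package
strColumnName <- "Totals:"
strFactor <- "dist"

# A) name the output column from a variable by returning a named list in j
res_a <- cars_dt[, setNames(list(.N), strColumnName), by = speed]

# B) refer to an input column by name with get(), in both i and j
res_b <- cars_dt[get(strFactor) > 50,
                 setNames(list(.N, get(strFactor) * 100), c(strColumnName, "x")),
                 by = speed]
```

A simpler alternative for A) is to compute with a fixed name and rename afterwards with setnames(); the get() calls keep things readable but may disable some of data.table's internal optimizations.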

How to replace column with strings with look-up codes in R

Question: Imagine that I have a data frame or data.table with a string column where one row looks like this: a1; b: b1, b2, b3; c: c1, c2, c3; d: d1, d2, d3, d4 and a look-up table with codes for mapping each of these strings. For example: string code a1 10 b1 20 b2 30 b3 40 c1 50 c2 60 ... I would like to have a mapping function that maps this string to codes: 10; b: 20, 30, 40; c: 50, 60, 70; d: 80, 90, 100 I have a column of these strings in a data.table/data.frame (more than 100k) so any quick solution
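A rough base-R sketch of such a mapping function, assuming the look-up table can be held as a named character vector and that every token to replace is a run of letters and digits. The function name map_codes and the sample values are hypothetical; only the codes actually listed in the question are used.

```r
# Replace each alphanumeric token found in the look-up with its code;
# tokens that are not in the look-up (such as the labels "b" and "c") are left untouched.
map_codes <- function(x, lookup) {
  m <- gregexpr("[[:alnum:]]+", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(tok) {
    hit <- tok %in% names(lookup)
    tok[hit] <- unname(lookup[tok[hit]])
    tok
  })
  x
}

# Hypothetical look-up built from the codes shown in the question
lookup <- c(a1 = "10", b1 = "20", b2 = "30", b3 = "40", c1 = "50", c2 = "60")
mapped <- map_codes("a1; b: b1, b2, b3; c: c1, c2", lookup)
# mapped is "10; b: 20, 30, 40; c: 50, 60"
```

The function is vectorized over x, so it can be applied to a whole column at once; whether it is fast enough for 100k+ rows would need to be measured.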