data.table | 易学教程

Efficiently merging two data frames on a non-trivial criteria

Question: While answering this question last night, I spent a good hour trying to find a solution that didn't grow a data.frame in a for loop, without success, so I'm curious whether there is a better way to approach this problem. The general case boils down to this: Merge two data.frames. Entries in either data.frame can have 0 or more matching entries in the other. We only care about entries that have 1 or more matches across both. The match function is complex, involving multiple columns in

Subsetting data.table set by date range in R

Question: I have a large dataset in a data.table that I'd like to subset by a date range. My dataset looks like this:

    testset <- data.table(date=as.Date(c("2013-07-02","2013-08-03","2013-09-04",
                                         "2013-10-05","2013-11-06")),
                          yr = c(2013,2013,2013,2013,2013),
                          mo = c(07,08,09,10,11),
                          da = c(02,03,04,05,06),
                          plant = LETTERS[1:5],
                          product = as.factor(letters[26:22]),
                          rating = runif(25))

I'd like to be able to choose a date range directly from the date column (created with as.Date) without using the yr, mo, or da columns.
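One possible approach, offered only as a minimal sketch and not as the accepted answer to this question: a Date column can be compared directly in the `i` argument. The reduced table and the bounds `from` and `to` below are assumptions made for illustration.

```r
library(data.table)

# Hypothetical, smaller stand-in for the question's testset
testset <- data.table(
  date  = as.Date(c("2013-07-02", "2013-08-03", "2013-09-04",
                    "2013-10-05", "2013-11-06")),
  plant = LETTERS[1:5]
)

# Assumed range bounds, chosen only for illustration
from <- as.Date("2013-08-02")
to   <- as.Date("2013-11-01")

# Filter rows in i with an ordinary logical condition on the Date column
in_range <- testset[date >= from & date <= to]
print(in_range)
```

Both bounds are inclusive here; swap `>=` or `<=` for a strict comparison if an open interval is wanted.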

Any way to force fread() of data.table not to stop on empty lines?

Question: (This question is no longer relevant as of the data.table version released on 25-NOV-2016; see the accepted answer below.) I have a table with some empty lines in the middle. When I try to open it with fread, it stops, saying "Stopped reading at empty line 10006, but text exists afterwards (discarded)". Is there any way to avoid this without changing the data file?

Answer 1: Version 1.9.8 of data.table, released 25-NOV-2016, has a new blank.lines.skip option to skip blank lines.

    text <- "1,a\n\n2,b\n3,c
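As a rough illustration of the `blank.lines.skip` option named in the answer, the sketch below reads a small inline CSV that contains an empty line. The input string and the names `csv_text`, `id`, and `label` are assumptions made up for this example; they are not the question's file or the answer's full code.

```r
library(data.table)

# Illustrative input with a blank line in the middle (not the question's file)
csv_text <- "id,label\n1,a\n\n2,b\n3,c\n"

# blank.lines.skip = TRUE asks fread to skip the empty line instead of stopping
dt <- fread(csv_text, blank.lines.skip = TRUE)
print(dt)
```

This requires data.table 1.9.8 or later, the version the answer cites for the option.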

Using data.table i and j arguments in functions

Question: I am trying to write some wrapper functions to reduce code duplication with data.table. Here is an example using mtcars. First, set up some data:

    library(data.table)
    data(mtcars)
    mtcars$car <- factor(gsub("(.*?) .*", "\\1", rownames(mtcars)), ordered=TRUE)
    mtcars <- data.table(mtcars)

Now, here is what I would usually write to get a summary of counts by group. In this case I am grouping by car:

    mtcars[, list(Total=length(mpg)), by="car"][order(car)]
         car Total
         AMC     1
    Cadillac     1
      Camaro     1
    ...

Extract row corresponding to minimum value of a variable by group

Question: I wish to (1) group data by one variable (State), (2) within each group find the row with the minimum value of another variable (Employees), and (3) extract the entire row. Steps (1) and (2) are easy one-liners, and I feel like (3) should be too, but I can't get it. Here is a sample data set:

    > data
      State Company Employees
    1    AK       A        82
    2    AK       B       104
    3    AK       C        37
    4    AK       D        24
    5    RI       E        19
    6    RI       F       118
    7    RI       G        88
    8    RI       H        42

    data <- structure(list(State = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("AK", "RI")
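One common data.table idiom for step (3), shown here as a sketch and not necessarily the answer given to this question, is to collect the row index of each group's minimum with `.I` and then subset by those indices. The table below re-enters the sample values printed above, with State as character instead of factor for brevity; the names `dt`, `idx`, and `min_rows` are assumptions.

```r
library(data.table)

# Sample values re-entered from the printed table (State kept as character)
dt <- data.table(
  State     = c("AK", "AK", "AK", "AK", "RI", "RI", "RI", "RI"),
  Company   = LETTERS[1:8],
  Employees = c(82, 104, 37, 24, 19, 118, 88, 42)
)

# .I holds row numbers in dt; which.min picks the first minimum within each State
idx <- dt[, .I[which.min(Employees)], by = State]$V1

# Subset the full rows at those positions
min_rows <- dt[idx]
print(min_rows)
```

Because `which.min` returns only the first minimum, ties within a group yield a single row; `.I[Employees == min(Employees)]` would keep all tied rows instead.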

left join in data.table [duplicate]

Question: This question already has answers here: Left join using data.table (2 answers). Closed 8 months ago.

I am trying to do a left join in data.table. I want to join panelFull and panel on OutletID, so that the CellID column from panel is added to panelFull:

    > panel[1:15,]
       Period CellID OutletID      ACV
    1:    215   1268   M44600  9563317
    2:    215   1268   M44800  8966339
    3:    215   1268   M45100  7043924
    4:    215   1268   M45200  9013918
    5:    215   1268   M45300 10009468
    6:    215   1268   M46900 22148703
    7:    215   1268   M48400
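A sketch of one way to add CellID to panelFull, assuming each OutletID maps to a single CellID in panel: an update join with `on =` and `:=`. The contents of `panelFull` are not shown in the excerpt, so the table below, including the unmatched outlet "M99999", is hypothetical.

```r
library(data.table)

# Two rows taken from the printed panel output
panel <- data.table(
  OutletID = c("M44600", "M44800"),
  CellID   = c(1268L, 1268L)
)

# Hypothetical panelFull; "M99999" is an invented outlet with no match in panel
panelFull <- data.table(OutletID = c("M44600", "M44800", "M99999"))

# Update join: for rows of panelFull matching panel on OutletID,
# assign panel's CellID (referred to as i.CellID); unmatched rows get NA
panelFull[panel, on = "OutletID", CellID := i.CellID]
print(panelFull)
```

This keeps every row of panelFull and modifies it by reference, which is the left-join behaviour the question describes. If panel holds several CellID values for one OutletID, only one of them survives the assignment.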
