data.table | 易学教程

Fit model by group using Data.Table package

Question: How can I fit multiple models by group using data.table syntax? I want my output to be a data.frame with one column for each "by group" variable and one column for each model fit. I can currently do this with the dplyr package, but not with data.table.

    # example data frame
    df <- data.table(
      id = sample(c("id01", "id02", "id03"), N, TRUE),
      v1 = sample(5, N, TRUE),
      v2 = sample(round(runif(100, max = 100), 4), N, TRUE)
    )

    # equivalent code in dplyr
    group_by(df, id) %>% do( model1= lm(v1 ~v2,
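One way the grouped fit might be written in data.table is sketched below. This is an illustration rather than the accepted answer: the value of N, the seed, and the second model (model2) are assumptions, since the excerpt is cut off before the dplyr call is complete. Each fit is wrapped in list() so that j returns one list-column cell per group.

```r
library(data.table)

set.seed(1)
N <- 100  # assumed value; the excerpt does not show how N is defined

df <- data.table(
  id = sample(c("id01", "id02", "id03"), N, TRUE),
  v1 = sample(5, N, TRUE),
  v2 = sample(round(runif(100, max = 100), 4), N, TRUE)
)

# One row per id; each model column is a list-column holding an lm object.
models <- df[, .(
  model1 = list(lm(v1 ~ v2, data = .SD)),
  model2 = list(lm(v1 ~ 1, data = .SD))  # hypothetical second model
), by = id]

# Individual fits can be pulled out of the list-column, for example:
coef(models$model1[[1]])
```

The result is itself a data.table (and therefore also a data.frame), with the grouping column first and one column per model.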

Conditional (inequality) join in data.table

Question: I'm trying to figure out how to do a conditional join on two data.tables. I've written a sqldf conditional join that gives me the circuits whose start or finish times fall within the other's start/finish times.

    sqldf("select dt2.start, dt2.finish, dt2.counts, dt1.id, dt1.circuit
           from dt2 left join dt1 on (
             (dt2.start >= dt1.start and dt2.start < dt1.finish) or
             (dt2.finish >= dt1.start and dt2.finish < dt1.finish)
           )")

This gives me the correct result, but it's too slow for my large-ish data

R data.table Multiple Conditions Join

Question: I've devised a solution that looks up values from multiple columns of two separate data tables and adds a new column based on calculations of their values (multiple conditional comparisons). Code below. It uses a data.table join while calculating values from both tables. However, the tables aren't joined on the columns I'm comparing, so I suspect I may not be getting the speed advantages of data.table that I've read so much about and am excited to tap into.

How to cut a vector or column into intervals in R [duplicate]

Question: I have the following column in a data frame, where consecutive rows differ by 0.012 s:

    Time: 0, 0.012, 0.024, 0.036, 0.048, 0.060, 0.072, 0.084, 0.096, 0.108

I want to create intervals starting from the beginning and increasing by 0.030, that is, a time window every 0.03 s, to be used later in a group by.

Answer 1: You can try findInterval like
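The answer text is cut off here, so the following is only a minimal sketch of how base R's findInterval could be applied to the Time values from the question. The names times, breaks, and window are illustrative, and the upper break of 0.12 is an assumption chosen to cover the sample data.

```r
# Time values copied from the question
times <- c(0, 0.012, 0.024, 0.036, 0.048, 0.060, 0.072, 0.084, 0.096, 0.108)

# Window boundaries every 0.03 s; the upper limit 0.12 is an assumption
breaks <- seq(0, 0.12, by = 0.03)

# findInterval returns, for each time, the index of the window it falls in
# (windows are closed on the left: breaks[i] <= x < breaks[i + 1])
window <- findInterval(times, breaks)

# The window index can then serve as a grouping variable
df <- data.frame(Time = times, window = window)
aggregate(Time ~ window, data = df, FUN = length)
```

Because the boundaries are floating-point numbers, values that sit exactly on a boundary may be worth checking by hand, or the times can be converted to integer milliseconds first.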

R data.table - sample by group with different sampling proportion

Question: I would like to efficiently draw a random sample by group from a data.table, but it should be possible to sample a different proportion for each group. If I wanted to sample fraction sampling_fraction from each group, I could take inspiration from this question and its related answer and do something like:

    DT = data.table(a = sample(1:2), b = sample(1:1000,20))

    group_sampler <- function(data, group_col, sample_fraction){
      # this function samples sample_fraction <0,1> from each group in the data.table
      #

Apply a rolling function by group in r (zoo, data.table)

Question: I am having trouble doing something fairly simple: applying a rolling function (standard deviation) by group in a data.table. My problem is that when I use rollapply on a data.table grouped by some column, data.table recycles the observations, as noted in the warning message below. I would like to get NAs for the observations that fall outside the window instead of recycled standard deviations. This is my approach so far, using iris and a rolling window of size 2, aligned to the right: library
