data.table

Comparing values of a certain row with a certain number of previous rows in data.table

陌路散爱 submitted on 2020-08-05 10:11:46
Question: This is an extension of a question asked earlier. In a database containing firm and category values, I want to calculate the following: if a firm enters a category that it has not been engaged in during the three (3) previous years (not including the same year), that entry is labeled "NEW"; otherwise it is labeled "OLD". In the following dataset: df <- data.table(year=c(1979,1979,1980,1980,1981,1981,1982,1983,1983,1984,1984), category = c("A","A","B","C","A","D","F","F","C","A"
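The three-year rule described above can be sketched as follows. This is a minimal illustration, not the original poster's solution: the toy data, the `status` column name, and the choice to group by `category` only are assumptions (the excerpt's dataset is truncated and shows no firm column; with one, it would be added to `by`).

```r
library(data.table)

# Hypothetical toy data; the dataset in the excerpt is truncated.
df <- data.table(
  year     = c(1979, 1980, 1981, 1984, 1985, 1985, 1989),
  category = c("A",  "B",  "A",  "A",  "B",  "A",  "A")
)

# Within each category, an entry is "OLD" if the same category appears
# in any of the three preceding years (year - 3 to year - 1), else "NEW".
df[, status := {
  yrs <- unique(year)
  seen_recently <- vapply(
    year,
    function(y) any(yrs >= y - 3 & yrs < y),
    logical(1)
  )
  ifelse(seen_recently, "OLD", "NEW")
}, by = category]

print(df)
```

Category "A" turns "NEW" again in 1989 because its last appearance (1985) falls outside the three-year window.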

Grouped mean of difftime fails in data.table

爱⌒轻易说出口 submitted on 2020-08-02 08:39:15
Question: Preface: I have a column of difftime values in a data.table, with units set to days. I am trying to create another data.table summarizing the values with dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group] When printing the new data.table, I see values such as 1.925988e+00 days 1.143287e+00 days 1.453975e+01 days I would like to limit the number of decimal places for this column only (i.e. without setting options(), unless I can do so specifically for difftime values). When I try to do this
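One way to sidestep difftime printing is to convert the column to plain numeric days before averaging, then round the result. The sketch below is an assumption about what might help, not the accepted answer to the question (which is cut off in this excerpt). The sample values and the `Days` and `AvgDays` column names are hypothetical.

```r
library(data.table)

# Hypothetical sample data with a difftime column measured in days.
dt <- data.table(
  Group    = c("a", "a", "b", "b"),
  DiffTime = as.difftime(c(1.2345, 2.3456, 10, 19.1234), units = "days")
)

# Convert to a plain numeric column (in days), so the grouped mean is an
# ordinary double that round() and print() treat like any other number.
dt[, Days := as.numeric(DiffTime, units = "days")]

dt2 <- dt[, .(AvgDays = round(mean(Days), 2)), by = Group]
print(dt2)
```

The trade-off is that `AvgDays` no longer carries its unit as an attribute, so the unit has to live in the column name.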

list data.tables in memory and combine by row (rbind)

老子叫甜甜 submitted on 2020-07-30 10:23:29
Question: I have many data.tables in memory with names following a specific pattern (e.g. RE_1, RE_2, ..., CO_1, CO_2, ...). I want to bind them efficiently to get only two data.tables (RE and CO). I tried: RE <- rbindlist(ls(pattern = "RE")) But I got the following error: "Error in rbindlist(ls(pattern = "RE")) : Input to rbindlist must be a list of data.tables". Is there a way to make such a "usable" list of data.tables (or data frames)? Answer 1: Try rbindlist(lapply(ls(pattern = "RE"),get)) Don't know if
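The error arises because `ls()` returns object names (a character vector), not the objects themselves. A minimal sketch of the fix, using base R's `mget()` as an alternative to the `lapply(..., get)` form in the answer; the three small tables here are hypothetical stand-ins for the poster's data:

```r
library(data.table)

# Hypothetical tables following the naming pattern from the question.
RE_1 <- data.table(id = 1:2, value = c("a", "b"))
RE_2 <- data.table(id = 3:4, value = c("c", "d"))
CO_1 <- data.table(id = 5L,  value = "e")

# ls() gives the names; mget() looks the names up and returns a named
# list of the objects, which is the input rbindlist() expects.
re_names <- ls(pattern = "^RE_[0-9]+$")
co_names <- ls(pattern = "^CO_[0-9]+$")

RE <- rbindlist(mget(re_names))
CO <- rbindlist(mget(co_names))
```

The anchored pattern `^RE_[0-9]+$` is a deliberate choice: a bare `"RE"` would also match the combined `RE` object on a second run, or any unrelated object whose name contains "RE".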

get rows of unique values by group

倖福魔咒の submitted on 2020-07-30 05:27:18
Question: I have a data.table and want to pick those rows where the values of a variable x are unique relative to another variable y. It is possible to get the unique values of x, grouped by y, in a separate dataset, like this: dt[,unique(x),by=y] But I want to pick the rows in the original dataset where this is the case. I don't want a new data.table, because I also need the other variables. So, what do I have to add to my code to get the rows in dt for which the above is true? dt <-
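The question is cut off before its example data, so the intended meaning of "unique" is uncertain. The sketch below shows two plausible readings on hypothetical data (the column `z` stands in for "the other variables"); neither is confirmed as the answer the poster accepted.

```r
library(data.table)

# Hypothetical data; z represents the extra columns that must be kept.
dt <- data.table(
  y = c(1, 1, 1, 2, 2),
  x = c("a", "a", "b", "a", "a"),
  z = 1:5
)

# Reading 1: keep the first full row for each distinct (y, x) pair.
# Matches dt[, unique(x), by = y] but retains all columns.
first_rows <- unique(dt, by = c("y", "x"))

# Reading 2: keep only rows whose x value occurs exactly once within y.
once_only <- dt[, if (.N == 1L) .SD, by = .(y, x)]

print(first_rows)
print(once_only)
```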

Tidyverse approach to binding unnamed list of unnamed vectors by row - do.call(rbind,x) equivalent

左心房为你撑大大i submitted on 2020-07-28 14:16:26
Question: I often find questions where people have somehow ended up with an unnamed list of unnamed character vectors and want to bind them row-wise into a data.frame. Here is an example: library(magrittr) data <- cbind(LETTERS[1:3],1:3,4:6,7:9,c(12,15,18)) %>% split(1:3) %>% unname data #[[1]] #[1] "A" "1" "4" "7" "12" # #[[2]] #[1] "B" "2" "5" "8" "15" # #[[3]] #[1] "C" "3" "6" "9" "18" One typical approach is with do.call from base R. do.call(rbind, data) %>% as.data.frame # V1 V2 V3 V4 V5 #1
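For reference, the base R approach mentioned in the question can be written out as a self-contained snippet. It reproduces the question's example without the magrittr pipe and does not attempt the tidyverse equivalent being asked for; the intermediate names `m` and `bound` are introduced here for illustration.

```r
# Rebuild the example: a 3 x 5 character matrix split into its rows,
# giving an unnamed list of unnamed character vectors.
m <- cbind(LETTERS[1:3], 1:3, 4:6, 7:9, c(12, 15, 18))
data <- unname(split(m, 1:3))

# do.call(rbind, data) stacks the vectors as rows of a matrix;
# as.data.frame() then assigns default column names V1..V5.
bound <- as.data.frame(do.call(rbind, data), stringsAsFactors = FALSE)
print(bound)
```

Every column comes out as character, because `cbind()` coerced the numbers to strings when the matrix was built.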

Comparing value of a certain row with all previous rows in data.table

谁说胖子不能爱 submitted on 2020-07-21 03:02:08
Question: I have a dataset of firms involved in certain categories of products. The dataset looks like this: df <- data.table(year=c(1979,1979,1980,1980,1980,1981,1981,1982,1982,1982,1982), category = c("A","A","B","C","A","D","C","F","F","A","B")) I want to create a new variable as follows: if a firm enters a category that it has not been engaged in during previous years (not the same year), that entry is labeled "NEW"; otherwise it is labeled "OLD". As such, the
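Under the rule as stated, an entry is "NEW" exactly when its year is the first year in which that category appears. A minimal sketch on the question's own data follows. The `first_year` and `status` column names are assumptions, and since the excerpt shows no firm column, grouping is by `category` alone.

```r
library(data.table)

# Dataset as given in the question.
df <- data.table(
  year = c(1979, 1979, 1980, 1980, 1980, 1981, 1981, 1982, 1982, 1982, 1982),
  category = c("A", "A", "B", "C", "A", "D", "C", "F", "F", "A", "B")
)

# First year each category appears.
df[, first_year := min(year), by = category]

# Entries in that first year are "NEW" (same-year repeats included);
# any later appearance of the category is "OLD".
df[, status := ifelse(year == first_year, "NEW", "OLD")]

print(df)
```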