rbindlist

Make rbindlist skip, ignore or change class attribute of the column

耗尽温柔 posted on 2019-12-23 06:52:20
Question: I would like to merge a large set of data frames (about 30), each of which has about 200 variables. These datasets are very much alike but not identical. Please find two example data frames below:

    library(data.table)
    library(haven)

    df1 <- fread(
      "A B C iso year
       0 B 1 NLD 2009
       1 A 2 NLD 2009
       0 Y 3 AUS 2011
       1 Q 4 AUS 2011
       0 NA 7 NLD 2008
       1 0 1 NLD 2008
       0 1 3 AUS 2012",
      header = TRUE
    )

    df2 <- fread(
      "A B D E iso year
       0 1 1 NA ECU 2009
       1 0 2 0 ECU 2009
       0 0 3 0 BRA 2011
       1 0 4 0 BRA 2011
       0 1 7 NA ECU
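The title of this question concerns columns whose class attributes differ between tables. One possible approach, sketched below, is to coerce the conflicting column to a common type before binding. The tables `dt1` and `dt2` are hypothetical miniatures, not the asker's data, and the `Date` column merely stands in for any classed column (such as a labelled column read by haven).

```r
library(data.table)

# Hypothetical miniature tables: column B is character in one, Date in the other
dt1 <- data.table(A = c(0L, 1L), B = c("B", "A"), C = 1:2)
dt2 <- data.table(A = c(0L, 1L),
                  B = as.Date(c("2009-01-01", "2011-01-01")),
                  D = 1:2)

tables <- list(dt1, dt2)

# Coerce the conflicting column to character in a copy of each table,
# so the originals are left untouched
harmonised <- lapply(tables, function(dt) {
  dt <- copy(dt)
  dt[, B := as.character(B)]
  dt
})

# Match columns by name and fill columns missing from a table with NA
combined <- rbindlist(harmonised, use.names = TRUE, fill = TRUE)
print(combined)
```

Converting to character loses the original class, so this only suits cases where the class information is not needed after the merge.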

Why is rbindlist “better” than rbind?

点点圈 posted on 2019-12-17 02:03:27
Question: I am going through the documentation of data.table and also noticed from some of the conversations here on SO that rbindlist is supposed to be better than rbind. I would like to know why rbindlist is better than rbind and in which scenarios rbindlist really excels over rbind. Is there any advantage in terms of memory utilization?

Answer 1: rbindlist is an optimized version of do.call(rbind, list(...)), which is known for being slow when using rbind.data.frame.

Where does it really excel

Some
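To make the comparison in the answer concrete, here is a minimal sketch that binds the same list both ways. The list `pieces` is a hypothetical input invented for illustration. The example only shows that the two calls produce the same rows and makes no timing claim.

```r
library(data.table)

# Hypothetical input: a list of small data.frames with identical columns
pieces <- lapply(1:3, function(i) data.frame(id = i, value = i * 10))

# Base R: do.call dispatches to rbind.data.frame
base_result <- do.call(rbind, pieces)

# data.table: binds the whole list in one call and returns a data.table
dt_result <- rbindlist(pieces)

# Compare contents only, ignoring class and row-name attributes
same_rows <- isTRUE(all.equal(as.data.frame(dt_result), base_result,
                              check.attributes = FALSE))
print(same_rows)
```

Note that rbindlist returns a data.table rather than a plain data.frame, which matters if later code relies on data.frame-style indexing.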

Binding dataframes in list after data cleaning on list

一笑奈何 posted on 2019-12-13 04:01:09
Question: This is a follow-up to my last question (Rbinding large list of dataframes after I did some data cleaning on the list). I've gotten smarter, and the former question got messy. I have 43 xlsx files which I loaded into a list in R:

    library(readxl)

    file.list <- list.files(recursive = TRUE, pattern = '*.xlsx')

    dat <- lapply(file.list, function(i) {
      x <- read_xlsx(i, sheet = 1, col_names = TRUE)
      # Create column with file name
      x$file <- i
      # Return data
      x
    })

I then added some column names:

    my_names <- c("ID", "UDLIGNNR","BILAGNR",
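As an alternative to adding the file name inside the loop, a named list can be bound with the `idcol` argument of rbindlist, which records the list names in a new column. The sketch below uses a hypothetical two-element list in place of the 43 files; the file names and values are invented for illustration.

```r
library(data.table)

# Hypothetical stand-in for the list produced by lapply() over the files,
# named by file so that the names can become an identifier column
dat <- list(
  "a.xlsx" = data.frame(ID = 1:2, BILAGNR = c("x", "y"),
                        stringsAsFactors = FALSE),
  "b.xlsx" = data.frame(ID = 3L, BILAGNR = "z",
                        stringsAsFactors = FALSE)
)

# idcol = "file" adds a column holding the name of each source element
combined <- rbindlist(dat, idcol = "file")
print(combined)
```

With the real data, `names(dat) <- file.list` before binding would give the same effect, assuming every element has compatible columns.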

Why is rbindlist “better” than rbind?

百般思念 posted on 2019-11-26 11:03:17
I am going through the documentation of data.table and also noticed from some of the conversations here on SO that rbindlist is supposed to be better than rbind. I would like to know why rbindlist is better than rbind and in which scenarios rbindlist really excels over rbind. Is there any advantage in terms of memory utilization?

mnel

rbindlist is an optimized version of do.call(rbind, list(...)), which is known for being slow when using rbind.data.frame.

Where does it really excel

Some questions that show where rbindlist shines are:

Fast vectorized merge of list of data.frames by row