Filtering multiple csv files while importing into data frame

前端未结

关注

 3  1064

攒了一身酷 2021-01-03 11:18

I have a large number of csv files that I want to read into R. All the column headings in the csvs are the same. But I want to import only those rows from each file into the

3条回答

轻奢々 (楼主)

2021-01-03 11:32
Here is an approach using data.table which will allow you to use fread (which is faster than read.csv) and rbindlist which is a superfast implementation of do.call(rbind, list(..)) perfect for this situation. It also has a function between
```
library(data.table)
fileNames <- list.files(path = workDir)
alldata <- rbindlist(lapply(fileNames, function(x,mon,max) {
  xx <- fread(x, sep = ',')
  xx[, fileID :=   gsub(".csv.*", "", x)]
  xx[between(v3, lower=min, upper = max, incbounds = FALSE)]
  }, min = 2, max = 3))
```
If the individual files are large and v1 always integer values it might be worth setting v3 as a key then using a binary search, it may also be quicker to import everything and then run the filtering.
0 讨论(0)

查看其它3个回答
发布评论:

提交评论
- 加载中...