Filtering multiple csv files while importing into data frame

攒了一身酷 2021-01-03 11:18

I have a large number of CSV files that I want to read into R. All the column headings in the CSVs are the same. But I want to import only those rows from each file into the

3 answers
  •  轻奢々 (original poster)
    2021-01-03 11:32

    Here is an approach using data.table. It lets you use fread (which is faster than read.csv) and rbindlist, a very fast implementation of do.call(rbind, list(..)) that is perfect for this situation. It also provides a between function for the range filter.

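    library(data.table)
    # full.names = TRUE returns paths that fread() can open from any working directory
    fileNames <- list.files(path = workDir, full.names = TRUE)
    alldata <- rbindlist(lapply(fileNames, function(x, min, max) {
      xx <- fread(x, sep = ',')
      # record the source file (file name with the .csv suffix stripped)
      xx[, fileID := gsub(".csv.*", "", basename(x))]
      # keep only rows with min < v3 < max (bounds excluded)
      xx[between(v3, lower = min, upper = max, incbounds = FALSE)]
    }, min = 2, max = 3))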
    

    If the individual files are large and v3 always holds integer values, it might be worth setting v3 as a key and then using a binary search. It may also be quicker to import everything first and then run the filtering.
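
    As a rough sketch of those two alternatives (import everything and filter once, or key on v3 and look values up by binary search), something like the following might work. The demo directory, the two small files, and the bounds 2 and 5 are made up for illustration; only the column names v1 and v3 come from the code above. It also assumes a data.table version recent enough to accept nomatch = NULL.

    ```r
    library(data.table)

    # Hypothetical demo data: two small CSV files in a fresh temporary directory.
    demoDir <- tempfile("csv_filter_demo_")
    dir.create(demoDir)
    fwrite(data.table(v1 = 1:5,  v3 = c(1L, 2L, 3L, 4L, 5L)), file.path(demoDir, "a.csv"))
    fwrite(data.table(v1 = 6:10, v3 = c(5L, 4L, 3L, 2L, 1L)), file.path(demoDir, "b.csv"))

    # Import everything first. Naming the list lets rbindlist() fill the
    # fileID column with the file names (without extension) via idcol.
    files <- list.files(demoDir, pattern = "\\.csv$", full.names = TRUE)
    names(files) <- sub("\\.csv$", "", basename(files))
    everything <- rbindlist(lapply(files, fread), idcol = "fileID")

    # Alternative 1: filter once after the import (bounds excluded).
    filtered <- everything[between(v3, 2L, 5L, incbounds = FALSE)]

    # Alternative 2: key on v3, then select the wanted integer values.
    # Subsetting a keyed table with .() uses binary search on the key.
    setkey(everything, v3)
    keyed <- everything[.(3:4), nomatch = NULL]
    ```

    Both alternatives select the same rows here. Note that setkey sorts the table by reference, so the keyed result comes back ordered by v3 rather than by file.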
