问题
I'm trying to merge a directory full of comma delimited text files using R, while also incorporating the file name of each file as a new variable in the data set.
I've been using the following:
library(plyr)
file_list <- list.files()
dataset <- ldply(file_list, read.table, header=FALSE, sep=",")
Can anyone shed any light on how I'd add the file name for each file read as a new variable within dataset?
Many thanks,
-Jon
回答1:
You can just make a wrapper around the read.table()
function that adds in your filename variable. Something like this should work:
read.data <- function(file){
dat <- read.table(file,header=F,sep=",")
dat$fname <- file
return(dat)
}
Once there you just need to apply that function across your data files. Since you didn't post any example data I'm not sure what it actually looks like, but for now I'll assume it's clean as can be and that rbind()
is sufficient to join them together, in which case this example should illustrate that function in action:
> data(iris)
> write.csv(iris,file="iris1.csv",row.names=F)
> write.csv(iris,file="iris2.csv",row.names=F)
> dataset <- do.call(rbind, lapply(list.files(pattern="csv$"),read.data))
> head(dataset)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species fname
1 5.1 3.5 1.4 0.2 setosa iris1.csv
2 4.9 3.0 1.4 0.2 setosa iris1.csv
3 4.7 3.2 1.3 0.2 setosa iris1.csv
4 4.6 3.1 1.5 0.2 setosa iris1.csv
5 5.0 3.6 1.4 0.2 setosa iris1.csv
6 5.4 3.9 1.7 0.4 setosa iris1.csv
> table(dataset$fname)
iris1.csv iris2.csv
150 150
来源:https://stackoverflow.com/questions/17369435/merging-files-and-file-names-in-r