Question
I have a bit of code that I have used successfully before. It should import all the files in a given directory into a single dataset. With a new folder of data, I can't get it to work. Each file reads in fine individually, and list.files() shows every file in the folder, so I'm at a loss.
list.files('../data/')
[1] "B101-2.txt" "B101.txt" "B116.txt" "B6.txt" "B65.txt" "B67-2.txt" "B67.txt"
[8] "B70.txt" "B71-2.txt" "B71.txt" "B95-2.txt" "B95.txt" "B96-2.txt" "B96.txt"
[15] "B98-2.txt" "B98.txt" "B99-2.txt" "B99.txt"
a = ldply(
.data = list.files(
path = '../data/'
)
, .fun = function(x){
from_header = scan(x, n = 1, skip = 1, quiet = TRUE)
to_return = read.table(
file = x
, skip = 20
, sep = '\t'
, fill = TRUE
)
to_return$condition = from_header[1]
return(to_return)
}
, .progress = 'text'
)
Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") : cannot open file 'B101-2.txt': No such file or directory
Answer 1:
You can get the full path names directly from list.files by setting full.names = TRUE:
list.files(path = '../data', full.names = TRUE)
Note the omission of the trailing / in the path specification. If it is left in, the files may be listed as ../data//B101-2.txt, with a doubled slash. Many systems tolerate that, but it is cleaner to avoid it.
Test: simulating the file structure you note in Tim Biegeleisen's answer:
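To see the difference for yourself, here is a minimal sketch. The scratch directory demo_dir and the file names in it are made up for illustration; only list.files(), file.path() and file.exists() from base R are used.

```r
# Hypothetical scratch directory; the file names are illustrative only.
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("B6.txt", "B65.txt")))

bare_names <- list.files(path = demo_dir)                     # file names only
full_paths <- list.files(path = demo_dir, full.names = TRUE)  # directory + file name

# Bare names are resolved against the working directory, so they are only
# found when the working directory happens to be demo_dir.
file.exists(bare_names)
file.exists(full_paths)
```

The second check should report that every file exists, because each element of full_paths already carries the directory it was listed from.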
library(plyr)
dir.create("analysis")
dir.create("data")
write.table(matrix(c(1:57,1:6), ncol=3, byrow=TRUE), file="data/test1.txt", sep="\t", row.names=FALSE, quote=FALSE)
write.table(matrix(c(2:58,7:12), ncol=3, byrow=TRUE), file="data/test2.txt", sep="\t", row.names=FALSE, quote=FALSE)
write.table(matrix(c(3:59,13:18), ncol=3, byrow=TRUE), file="data/test3.txt", sep="\t", row.names=FALSE, quote=FALSE)
We now run your code from within the analysis folder.
setwd("analysis")
a = ldply(
.data = list.files(path = '../data', full.names = TRUE)
, .fun = function(x){
from_header = scan(x, n = 1, skip = 1, quiet = TRUE)
to_return = read.table(file = x, skip = 20, sep = '\t', fill = TRUE)
to_return$condition = from_header[1]
return(to_return)
}
, .progress = 'text'
)
The code reads in all three files and returns lines 21-22 of each (the lines left after skip = 20), with condition set to the first value on line 2 of that file.
a
V1 V2 V3 condition
1 1 2 3 1
2 4 5 6 1
3 7 8 9 2
4 10 11 12 2
5 13 14 15 3
6 16 17 18 3
Answer 2:
list.files('../data/') returns names such as B101-2.txt, but as your own call to list.files() shows, those files live in ../data/ relative to the working directory. A call such as read.csv(file="B101-2.txt") therefore fails, whereas read.csv(file="../data/B101-2.txt") works. The solution to your problem is to use the full relative path needed to address each file.
Use this as the first argument to ldply():
.data = paste0("../data/", list.files(path = '../data/'))
The key thing to take away from this is that list.files() by default returns a character vector of file names (with extensions), not the full or relative paths to those files.
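A minimal sketch of the same idea, under a few assumptions: the directory data_dir, the file names and the source_file column are hypothetical, file.path() is used in place of paste0() to prepend the directory, and lapply() with rbind stands in for plyr's ldply() so the example needs only base R.

```r
# Hypothetical data directory holding two small tab-separated files.
data_dir <- file.path(tempdir(), "paste_demo")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:2) {
  write.table(data.frame(x = i, y = i * 10),
              file = file.path(data_dir, paste0("file", i, ".txt")),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

file_names <- list.files(path = data_dir)      # bare names, e.g. "file1.txt"
file_paths <- file.path(data_dir, file_names)  # directory prepended to each name

# lapply() + rbind stands in for plyr::ldply() in this sketch.
combined <- do.call(rbind, lapply(file_paths, function(p) {
  d <- read.table(file = p, header = TRUE, sep = "\t")
  d$source_file <- basename(p)  # record which file each row came from
  d
}))
```

Passing file_names instead of file_paths to lapply() would be expected to reproduce the "cannot open file" error from the question, unless the working directory is data_dir itself.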
Source: https://stackoverflow.com/questions/38006597/r-files-not-found-when-call-to-import-folder-but-open-as-individual-files-and