Question
I've tried something like this:
file_in <- file("myfile.log","r")
x <- readLines(file_in, n=-100)
but I'm still waiting...
Any help would be greatly appreciated
Answer 1:
I'd use scan for this, provided you know how many lines the log has, so that you can skip everything before the part you want:
scan("foo.txt",sep="\n",what=character(),skip=100)
If you have no clue how many lines you need to skip, you have no choice but to move towards either
- reading in everything and taking the last n lines (in case that's feasible),
- using scan("foo.txt",sep="\n",what=list(NULL)) to figure out how many records there are, or
- using some algorithm to go through the file, keeping only the last n lines every time
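The count-then-skip idea from the second option could be sketched roughly as follows. This is only an illustration: count_lines and last_lines_by_skip are hypothetical helper names, the chunk size is an arbitrary assumption, and the file is read twice (once to count, once to fetch the tail).

```r
# Hypothetical helper: count lines in chunks so the whole file is never in memory.
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, "r")
  on.exit(close(con))
  total <- 0
  repeat {
    got <- length(readLines(con, n = chunk, warn = FALSE))
    if (got == 0L) break
    total <- total + got
  }
  total
}

# Hypothetical helper: count first, then skip everything but the last n lines.
# Note that scan drops blank lines by default.
last_lines_by_skip <- function(path, n) {
  total <- count_lines(path)
  scan(path, what = character(), sep = "\n",
       skip = max(total - n, 0), quiet = TRUE)
}

# Small demonstration on a temporary file
tmp <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:1000), tmp)
res <- last_lines_by_skip(tmp, 5)
print(res)
```

Counting in chunks keeps memory use flat, at the cost of a second pass over the file.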
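The last option could look like:
ReadLastLines <- function(x,n,...){
  con <- file(x)
  open(con)
  # read the first n lines (after any skip= passed through ...)
  out <- scan(con,nmax=n,what=character(),sep="\n",quiet=TRUE,...)
  while(TRUE){
    # read one more line; an empty result means end of file
    tmp <- scan(con,nmax=1,what=character(),sep="\n",quiet=TRUE)
    if(length(tmp)==0) {close(con) ; break }
    # drop the oldest line and append the newest
    out <- c(out[-1],tmp)
  }
  out
}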
allowing:
ReadLastLines("foo.txt",100)
or
ReadLastLines("foo.txt",100,skip=1e+7)
if you know the file has more than 10 million lines. This can save reading time once the logs become extremely large.
EDIT: In fact, I wouldn't even use R for this, given the size of your file. On Unix, you can use the tail command. There is a Windows version of it as well, somewhere in a toolkit, though I haven't tried it yet.
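If the result is still needed inside R, the tail command can be called from there. The following is a minimal sketch, assuming a Unix-like system with tail on the PATH; tail_via_system is a hypothetical wrapper name, not something from the answer.

```r
# Hypothetical wrapper around the Unix tail command; assumes `tail` is on the PATH.
tail_via_system <- function(path, n = 100L) {
  # stdout = TRUE returns the command output as a character vector, one line each
  system2("tail", args = c("-n", as.integer(n), shQuote(path)), stdout = TRUE)
}

# Small demonstration on a temporary file
tmp <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:1000), tmp)
last5 <- tail_via_system(tmp, 5)
print(last5)
```

The file is never loaded into R as a whole; only the lines printed by tail are.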
Answer 2:
You could do this with read.table by specifying the skip parameter. If your lines are not to be parsed into variables, specify the separator as '\n', as @Joris Meys pointed out below, and also set as.is=TRUE to get character vectors instead of factors.
Small example (skipping the first 2000 lines):
df <- read.table('foo.txt', sep='\n', as.is=TRUE, skip=2000)
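One thing to watch with log files: by default read.table treats quote characters and '#' specially, which can mangle raw lines. A small sketch on a made-up temporary file (the file contents and the skip value are purely illustrative) that switches both off:

```r
# Made-up sample file containing a quote character and a '#'
tmp <- tempfile(fileext = ".log")
writeLines(c("first", "second", "it's #3", "say \"four\"", "fifth"), tmp)

# quote = "" and comment.char = "" keep each line intact;
# skip = 2 drops the first two lines, leaving the last three.
df <- read.table(tmp, sep = "\n", as.is = TRUE, skip = 2,
                 quote = "", comment.char = "")
last3 <- df$V1
print(last3)
```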
Answer 3:
As @JorisMeys already mentioned, the Unix command tail would be the easiest way to solve this problem. However, I want to propose a seek-based R solution that starts reading from the end of the file:
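tailfile <- function(file, n) {
  bufferSize <- 1024L
  size <- file.info(file)$size
  if (size < bufferSize) {
    bufferSize <- size
  }
  pos <- size - bufferSize
  text <- character()
  k <- 0L
  f <- file(file, "rb")
  on.exit(close(f))
  while(TRUE) {
    seek(f, where=pos)
    # positions are byte offsets, so read bytes rather than characters
    chars <- readChar(f, nchars=bufferSize, useBytes=TRUE)
    # count newlines in this chunk (gregexpr returns -1 when there are none)
    k <- k + sum(gregexpr(pattern="\n", text=chars, fixed=TRUE)[[1L]] > 0L)
    # chunks are read back to front, so prepend the new one
    text <- paste0(chars, text)
    if (k > n || pos == 0L) {
      break
    }
    # shrink the final read so it does not overlap the chunk already read
    bufferSize <- min(bufferSize, pos)
    pos <- pos - bufferSize
  }
  tail(strsplit(text, "\n", fixed=TRUE)[[1L]], n)
}
tailfile("myfile.log", n=100)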
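Answer 4:
For seeing the last few lines (this works once the lines are in a character vector, e.g. the result of readLines; tail does not read from a connection):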
tail(file_in,100)
Source: https://stackoverflow.com/questions/5596107/reading-the-last-n-lines-from-a-huge-text-file