Question
I am reading an Apache log file using read.table and am wondering whether it is possible to apply a function (e.g. strptime) while the data are being imported, instead of post-processing them.
More details as requested: The column containing the date has the format:
[10/Nov/2011:06:25:14
I can successfully parse it using:
strptime(red[1,4],format="[%d/%b/%Y:%H:%M:%S")
or
as.POSIXct(strptime(red[1,4],format="[%d/%b/%Y:%H:%M:%S"))
but
as.POSIXct(red[1,4],format="[%d/%b/%Y:%H:%M:%S")
fails. Hence I cannot use POSIXct in colClasses AFAIK.
Answer 1:
If there is an as.<class> method, you can use colClasses with that class. Since Date is a class and has a default format of YYYY-MM-DD, if your dates are in that format you could just include "Date" in the colClasses vector. It is also possible to define new as.<class> functions. As always, the more detail you supply about the problem, the better the answer.
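The Date case mentioned above can be sketched as follows. This is only an illustration: the two input rows and the column names day and status are made up, and the dates are assumed to already be in YYYY-MM-DD form.

```r
# Made-up input: an ISO date and an HTTP status code per row.
txt <- "2011-11-10 200
2011-11-11 404"

# "Date" in colClasses makes read.table convert the column on import.
DF <- read.table(text = txt, header = FALSE,
                 col.names = c("day", "status"),
                 colClasses = c("Date", "integer"))
str(DF)
```

Because the conversion happens during import, DF$day can be used in date arithmetic right away, with no separate as.Date call.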
library(methods)
setClass("logDate")
#[1] "logDate"
setAs("character", "logDate", function(from)
    as.POSIXct(from, format = "[%d/%b/%Y:%H:%M:%S"))
DF <- read.table(text = "[10/Nov/2011:06:25:14", header = FALSE,
                 colClasses = c("logDate"))
str(DF)
#'data.frame':   1 obs. of  1 variable:
# $ V1: POSIXct, format: "2011-11-10 06:25:14"
Should probably give Gabor Grothendieck some credit since he is the one who showed me how to do this 5 years ago: https://www.stat.math.ethz.ch/pipermail/r-help/2007-April/130912.html
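The same setAs approach might be applied to a complete log line roughly like this. It is a sketch under stated assumptions, not a definitive parser: the class name apacheTime, the sample line, and the column names are all hypothetical, the timestamp is interpreted as UTC for reproducibility, and %b assumes a locale with English month abbreviations.

```r
library(methods)

# Hypothetical class name; it exists only so colClasses can dispatch on it.
setClass("apacheTime")
setAs("character", "apacheTime", function(from)
    as.POSIXct(from, format = "[%d/%b/%Y:%H:%M:%S", tz = "UTC"))

# Made-up line in Apache's common log format.
log_line <- '127.0.0.1 - - [10/Nov/2011:06:25:14 +0000] "GET /index.html HTTP/1.1" 200 2326'

# Whitespace splitting puts the zone offset ("+0000]") in its own column;
# the quoted request string stays together as one field.
DF <- read.table(text = log_line, header = FALSE,
                 col.names = c("host", "ident", "user", "time", "zone",
                               "request", "status", "bytes"),
                 colClasses = c("character", "character", "character",
                                "apacheTime", "character", "character",
                                "integer", "integer"))
str(DF)
```

Note that the offset in the zone column is ignored by this conversion; if the log uses a non-UTC offset, it would still need to be applied separately.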
Answer 2:
Instead of this, you could probably define the desired format in the Apache log configuration itself. Then you don't need to post-process your data, because it already arrives in a suitable format.
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common
http://httpd.apache.org/docs/2.0/logs.html#accesslog
http://httpd.apache.org/docs/2.0/mod/mod_log_config.html
Source: https://stackoverflow.com/questions/8081451/read-table-and-apply-functions-to-a-column