问题
Consider the following example:
Date1 = seq(from = as.POSIXct("2010-05-03 00:00"),
to = as.POSIXct("2010-06-20 23:00"), by = 120)
Dat1 <- data.frame(DateTime = Date1,
x1 = rnorm(length(Date1)))
Date2 <- seq(from = as.POSIXct("2010-05-01 03:30"),
to = as.POSIXct("2010-07-03 22:00"), by = 120)
Dat2 <- data.frame(DateTime = Date2,
x1 = rnorm(length(Date2)))
Date3 <- seq(from = as.POSIXct("2010-06-08 01:30"),
to = as.POSIXct("2010-07-13 11:00"), by = 120)
Dat3Matrix <- matrix(data = rnorm(length(Date3)*3), ncol = 3)
Dat3 <- data.frame(DateTime = Date3,
x1 = Dat3Matrix)
list1 <- list(Dat1,Dat2,Dat3)
Here I build three data.frames as an example and placed them all into a list. From here I would like to write a routine that would return the 3 data frames but only keeping the times that were present in each of the others i.e. all three data frames should be reduced to the times that were consistent among all of the data frames. How can this be done?
回答1:
zoo
has a multi-way merge. This lapply's read.zoo
over the components of list1
converting them each to zoo class. tz=""
tells it to use POSIXct for the resulting date/times. It then merges the converted components using all=FALSE
so that only intersecting times are kept.
library(zoo)
z <- do.call("merge", c(lapply(setNames(list1, 1:3), read.zoo, tz = ""), all = FALSE))
If we later wish to convert z
to data.frame try dd <- cbind(Time = time(z), coredata(z))
but it might be better to keep it as a zoo object (or convert it to an xts object) so that further processing is simplified as well.
回答2:
One approach is to find the respective indices and then subset accordingly:
idx1 <- (Dat1[,1] %in% Dat2[,1]) & (Dat1[,1] %in% Dat3[,1])
idx2 <- (Dat2[,1] %in% Dat1[,1]) & (Dat2[,1] %in% Dat3[,1])
idx3 <- (Dat3[,1] %in% Dat1[,1]) & (Dat3[,1] %in% Dat2[,1])
Now Dat1[idx1,]
, Dat2[idx2,]
, Dat3[idx3,]
should give the desired result.
回答3:
You could use merge
:
res <- NULL
for (i in 2:length(list1)) {
dat <- list1[[i]]
names(dat)[2] <- paste0(names(dat)[2], "_", i);
dat[[paste0("id_", i)]] <- 1:nrow(dat)
if (is.null(res)) {
res <- dat
} else {
res <- merge(res, dat, by="DateTime")
}
}
I added columns with id's; you could use these to index the records in the original data.frames
来源:https://stackoverflow.com/questions/16385909/r-find-same-times-in-n-number-of-data-frames