r - find same times in n number of data frames

拜拜、爱过 submitted on 2019-12-13 06:22:00

Question


Consider the following example:

Date1 <- seq(from = as.POSIXct("2010-05-03 00:00"), 
             to = as.POSIXct("2010-06-20 23:00"), by = 120)
Dat1 <- data.frame(DateTime = Date1,
                   x1 = rnorm(length(Date1)))

Date2 <- seq(from = as.POSIXct("2010-05-01 03:30"), 
             to = as.POSIXct("2010-07-03 22:00"), by = 120)
Dat2 <- data.frame(DateTime = Date2,
                   x1 = rnorm(length(Date2)))

Date3 <- seq(from = as.POSIXct("2010-06-08 01:30"), 
             to = as.POSIXct("2010-07-13 11:00"), by = 120)
Dat3Matrix <- matrix(data = rnorm(length(Date3)*3), ncol = 3)

Dat3 <- data.frame(DateTime = Date3,
                   x1 = Dat3Matrix)

list1 <- list(Dat1,Dat2,Dat3)

Here I build three data frames as an example and place them all in a list. From here I would like to write a routine that returns the three data frames, keeping only the times that are present in all of them, i.e. each data frame should be reduced to the time stamps common to all three. How can this be done?


Answer 1:


zoo has a multi-way merge. The code below applies read.zoo to each component of list1 with lapply, converting each one to class zoo; tz = "" tells read.zoo to use POSIXct for the resulting date/times. It then merges the converted components with all = FALSE so that only the intersecting times are kept.

library(zoo)
z <- do.call("merge", c(lapply(setNames(list1, 1:3), read.zoo, tz = ""), all = FALSE))

If we later wish to convert z to a data.frame, try dd <- data.frame(Time = time(z), coredata(z)), but it may be better to keep it as a zoo object (or convert it to an xts object) so that further processing is simpler.




Answer 2:


One approach is to find the respective indices and then subset accordingly:

idx1 <- (Dat1[,1] %in% Dat2[,1]) & (Dat1[,1] %in% Dat3[,1])
idx2 <- (Dat2[,1] %in% Dat1[,1]) & (Dat2[,1] %in% Dat3[,1])
idx3 <- (Dat3[,1] %in% Dat1[,1]) & (Dat3[,1] %in% Dat2[,1])

Now Dat1[idx1,], Dat2[idx2,], Dat3[idx3,] should give the desired result.
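The same idea can be extended to any number of data frames held in a list. The following is a minimal base R sketch, not part of the original answer: keep_common_times is a hypothetical helper name, and the small data frames d1, d2, d3 are stand-ins for Dat1, Dat2, Dat3, built with tz = "UTC" only to keep the example reproducible.

```r
# Hypothetical helper: keep only the rows whose time stamp occurs in
# every data frame of the list `dfs`.
keep_common_times <- function(dfs, time_col = "DateTime") {
  # Start with the times of the first data frame and narrow them down
  common <- dfs[[1]][[time_col]]
  for (d in dfs[-1]) {
    common <- common[common %in% d[[time_col]]]
  }
  # Subset every data frame to the common times
  lapply(dfs, function(d) d[d[[time_col]] %in% common, , drop = FALSE])
}

# Small stand-in data: 2-minute steps with partly overlapping ranges
t0 <- as.POSIXct("2010-05-03 00:00", tz = "UTC")
d1 <- data.frame(DateTime = t0 + 120 * 0:5, x1 = 1:6)
d2 <- data.frame(DateTime = t0 + 120 * 2:7, x1 = 1:6)
d3 <- data.frame(DateTime = t0 + 120 * 3:9, x1 = 1:7)

out <- keep_common_times(list(d1, d2, d3))
sapply(out, nrow)
```

With the list from the question, keep_common_times(list1) should return the three reduced data frames in the same order as the input.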




Answer 3:


You could use merge:

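res <- NULL
for (i in seq_along(list1)) {  # start at 1 so that Dat1 is included as well
  dat <- list1[[i]]
  # make the name of the first value column unique to data frame i
  names(dat)[2] <- paste0(names(dat)[2], "_", i)
  # remember each row's position in its original data frame
  dat[[paste0("id_", i)]] <- seq_len(nrow(dat))

  if (is.null(res)) {
    res <- dat
  } else {
    # inner join: only times present in both sides are kept
    res <- merge(res, dat, by = "DateTime")
  }
}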

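I added id columns (named id_ followed by the position of the data frame in the list) that hold each row's position in its original data frame; you can use these to index the records in the original data frames, e.g. list1[[2]][res$id_2, ].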
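The repeated inner join can also be written without an explicit loop by using Reduce. This is only a sketch under stated assumptions and not part of the original answer: the data frames d1, d2, d3 are small stand-ins for the ones in the question, and every value column is renamed with a per-frame suffix so that merge never has to disambiguate clashing names.

```r
# Small stand-in data: 2-minute steps with partly overlapping ranges
t0 <- as.POSIXct("2010-05-03 00:00", tz = "UTC")
d1 <- data.frame(DateTime = t0 + 120 * 0:5, x1 = 1:6)
d2 <- data.frame(DateTime = t0 + 120 * 2:7, x1 = 1:6)
d3 <- data.frame(DateTime = t0 + 120 * 3:9, x1 = 1:7)
dfs <- list(d1, d2, d3)

# Give every non-key column a suffix identifying its data frame
renamed <- Map(function(d, i) {
  value_cols <- names(d) != "DateTime"
  names(d)[value_cols] <- paste0(names(d)[value_cols], "_", i)
  d
}, dfs, seq_along(dfs))

# Fold merge() over the list; the default all = FALSE is an inner join,
# so only times present in every data frame survive
merged <- Reduce(function(x, y) merge(x, y, by = "DateTime"), renamed)
merged
```

The result is a single wide data frame with one row per common time stamp, rather than a list of reduced data frames.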



Source: https://stackoverflow.com/questions/16385909/r-find-same-times-in-n-number-of-data-frames
