问题
My data.frame DATA is
k l g
1 A 2004 12
2 B 2004 3.4
3 C 2004 4.5
Another data.frame DATA2 is
i d t
1 A 2012 22
2 B 2012 4.8
3 C 2012 5.6
I want to get
1 A 2004 12
1 A 2012 22
2 B 2004 3.4
2 B 2012 4.8
3 C 2004 4.5
3 C 2012 5.6
回答1:
We can try rbindlist from data.table. Place the datasets in a list, rbind them with rbindlist and order by the first column.
library(data.table)
rbindlist(list(df1, df2))[order(k)]
# k l g
#1: A 2004 12.0
#2: A 2012 22.0
#3: B 2004 3.4
#4: B 2012 4.8
#5: C 2004 4.5
#6: C 2012 5.6
Or using dplyr
library(dplyr)
bind_rows(df1, setNames(df2, names(df1))) %>%
arrange(k)
NOTE: I used df1 and df2 in place of DATA and DATA2 as object names as it is easier to type.
回答2:
You can try the interleave function from the "gdata" package. However, this would require your inputs have the same column names and have the same number of rows.
The approach would be:
library(gdata) # for interleave
do.call(interleave, lapply(list(df1, df2), setNames, paste0("V", 1:ncol(df1))))
# V1 V2 V3
# 1 A 2004 12.0
# 11 A 2012 22.0
# 2 B 2004 3.4
# 21 B 2012 4.8
# 3 C 2004 4.5
# 31 C 2012 5.6
Alternatively, as mentioned in my comment @akrun's answer, depending on whether the first column is a grouping variable or not, you may want to modify his approach a little.
For instance, imagine there were a third data.frame, with a different number of rows than the others. interleave would not work on that, but the rbindlist approach would.
df3 <- do.call(rbind, lapply(list(df1, df2), setNames, c("A", "B", "Z")))
rbindlist(list(df1, df2, df3), idcol = TRUE)[, N := sequence(.N), by = .id][order(N)]
# .id k l g N
# 1: 1 A 2004 12.0 1
# 2: 2 A 2012 22.0 1
# 3: 3 A 2004 12.0 1
# 4: 1 B 2004 3.4 2
# 5: 2 B 2012 4.8 2
# 6: 3 B 2004 3.4 2
# 7: 1 C 2004 4.5 3
# 8: 2 C 2012 5.6 3
# 9: 3 C 2004 4.5 3
# 10: 3 A 2012 22.0 4
# 11: 3 B 2012 4.8 5
# 12: 3 C 2012 5.6 6
Pay specific attention to the last three rows in comparison with @akrun's approach.
The equivalent in base R for that last "data.table" approach would be something like:
x <- do.call(rbind, lapply(c("df1", "df2", "df3"), function(x) {
setNames(cbind(rn = x, get(x)), c("id", paste0("V", 1:ncol(get(x)))))
}))
x[order(ave(as.numeric(x$id), x$id, FUN = seq_along)), ]
(So the moral is, use "data.table".)
回答3:
You can also do this in base R with rbind without the need to use extra packages, but you will have to set the columnnames of df2 the same as the columnnames in df1:
colnames(df2) <- colnames(df1) # or: setNames(df2, colnames(df1))
new.df <- rbind(df1,df2)
new.df <- new.df[order(new.df$k),]
This will result in the following dataframe:
> new.df
k l g
1 A 2004 12.0
11 A 2012 22.0
2 B 2004 3.4
21 B 2012 4.8
3 C 2004 4.5
31 C 2012 5.6
来源:https://stackoverflow.com/questions/35356989/merge-two-data-frames-to-get-alternate-rows-of-each-data-frame-in-sequence