Question
On the web, I found that rbind() is used to combine two data frames, and that the bind_rows() function performs the same task.
What is the difference between these two functions, and which one is more efficient to use?
Answer 1:
Among other differences, one of the main reasons to use bind_rows() over rbind() is to combine two data frames that have a different number of columns. rbind() throws an error in that case, whereas bind_rows() matches columns by name and fills in NA for the rows of any column that is missing from one of the data frames.
Try out the following code to see the difference:
a <- data.frame(a = 1:2, b = 3:4, c = 5:6)
b <- data.frame(a = 7:8, b = 2:3, c = 3:4, d = 8:9)
Results for the two calls are as follows:
rbind(a, b)
> rbind(a, b)
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
library(dplyr)
bind_rows(a, b)
> bind_rows(a, b)
  a b c  d
1 1 3 5 NA
2 2 4 6 NA
3 7 2 3  8
4 8 3 4  9
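The same difference shows up when the column counts are equal but the names differ, because both functions match data frame columns by name. The following is a minimal sketch of that case; the data frames x and y are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Hypothetical data: same number of columns, but only "a" is shared
x <- data.frame(a = 1:2, b = 3:4)
y <- data.frame(a = 5:6, z = 7:8)

# rbind() on data frames requires matching column names, so this fails;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(rbind(x, y), error = function(e) conditionMessage(e))
print(res)

# bind_rows() takes the union of the column names and fills gaps with NA
combined <- bind_rows(x, y)
print(combined)
```

Here combined has the columns a, b and z, with NA in b for the rows that came from y and NA in z for the rows that came from x.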
Answer 2:
Although bind_rows() is more flexible, in the sense that it will combine data frames with different numbers of columns (assigning NA to the rows where those columns are missing), I would recommend rbind() if you are combining data frames with the same columns.
I have found rbind() to be more computationally efficient when the data you are combining are formatted the same way, and it simply throws an error when the number of columns differs. That can save a lot of time on big data sets, so I would recommend rbind() for these situations. If your data frames have different columns, however, bind_rows() is the one to use.
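Which function is faster can depend on the size of the data and on the installed versions of R and dplyr, so it may be worth timing both on your own data. The following is a rough sketch using base R's system.time(); the data frames df1 and df2 and the row count n are arbitrary choices for illustration, not figures from the original answer.

```r
library(dplyr)

# Hypothetical test data: two data frames with identical columns
n <- 1e5
df1 <- data.frame(id = seq_len(n), value = rnorm(n))
df2 <- data.frame(id = seq_len(n), value = rnorm(n))

# Time each approach once; repeat or use a benchmarking package
# for more stable numbers
t_rbind <- system.time(r1 <- rbind(df1, df2))
t_bind_rows <- system.time(r2 <- bind_rows(df1, df2))

print(t_rbind["elapsed"])
print(t_bind_rows["elapsed"])

# Both results should contain the same number of rows
stopifnot(nrow(r1) == nrow(r2))
```

A single run like this is noisy, so treat the timings as a rough indication only.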
Source: https://stackoverflow.com/questions/42887217/difference-between-rbind-and-bind-rows-in-r