Question
How to add a named vector to a data frame, with the components of the vector reordered according to the column names of the data frame?
I need to build a data frame one row at a time. Each row comes from a named vector produced by some processing. The problem is that the vector's components are not always in the same order as the data frame's columns, so rbind produces the wrong result. Here is a very simplified example:
df = data.frame(id=1:2, va=11:12, vb=21:22, vc=31:32)
v1 = c(id=4, va=14, vb=25, vc=NA)
df = rbind(df, v1)
So far, so good: this produces the correct result. Now the next round of processing yields:
v2 = c(va=19, id=9, vc=34, vb=NA)
df = rbind(df, v2)
This produces an incorrect result. The correct result should be:
  id va vb vc
1  1 11 21 31
2  2 12 22 32
3  4 14 25 NA
4  9 19 NA 34
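To make the failure concrete, here is a minimal sketch of what goes wrong. It starts from the original two-row df, and the variable name wrong is only illustrative. When the second argument is a plain atomic vector, rbind fills the new row by position, so the names on v2 are ignored.

```r
df <- data.frame(id = 1:2, va = 11:12, vb = 21:22, vc = 31:32)
v2 <- c(va = 19, id = 9, vc = 34, vb = NA)

# The vector is consumed left to right: 19 lands in id, 9 in va,
# 34 in vb and NA in vc, regardless of the names attached to v2.
wrong <- rbind(df, v2)
wrong[3, ]
```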
Answer 1:
Make a data frame out of v2 prior to the rbind:
rbind(df, as.data.frame(t(v2)))
##   id va vb vc
## 1  1 11 21 31
## 2  2 12 22 32
## 3  4 14 25 NA
## 4  9 19 NA 34
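Here is why this works. When its arguments are data frames, rbind matches columns by name rather than by position, so a one-row data frame is realigned automatically. The transpose is needed because v2 has names, but as.data.frame treats it as a column vector: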
as.data.frame(v2)
##    v2
## va 19
## id  9
## vc 34
## vb NA
Thus, you must transpose the data to put it into the correct form:
as.data.frame(t(v2))
##   va id vc vb
## 1 19  9 34 NA
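The steps above can be checked end to end with a short sketch. It uses the original two-row df, and new_row and by_name are illustrative names, not part of the answer.

```r
df <- data.frame(id = 1:2, va = 11:12, vb = 21:22, vc = 31:32)
v2 <- c(va = 19, id = 9, vc = 34, vb = NA)

# t() turns the named vector into a 1 x 4 matrix whose column names are the
# vector's names; as.data.frame() then yields a one-row data frame.
new_row <- as.data.frame(t(v2))
names(new_row)   # column order still follows v2

# With two data frames, rbind() aligns the columns by name.
by_name <- rbind(df, new_row)
by_name[3, ]
```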
Answer 2:
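You could reorder the vector by the data frame's column names before binding (the output below starts from the original two-row df):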
rbind(df, v2[names(df)])
  id va vb vc
1  1 11 21 31
2  2 12 22 32
3  9 19 NA 34
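If this is done repeatedly, the reordering could be wrapped in a small helper. The function below is a hypothetical sketch, not part of the answer: the name append_named_row and the decision to reject unknown names are assumptions. Names that are missing from the vector simply become NA, because indexing a named vector by an absent name returns NA.

```r
# Hypothetical helper: append a named vector as a row, aligned to df's columns.
append_named_row <- function(df, v) {
  extra <- setdiff(names(v), names(df))
  if (length(extra) > 0) {
    stop("names not found in data frame: ", paste(extra, collapse = ", "))
  }
  # v[names(df)] puts the values in column order; absent names give NA.
  rbind(df, unname(v[names(df)]))
}

df <- data.frame(id = 1:2, va = 11:12, vb = 21:22, vc = 31:32)
out <- append_named_row(df, c(va = 19, id = 9, vc = 34, vb = NA))
out[3, ]
```

Stopping on unknown names is one possible design choice; silently dropping them would also work but could hide typos in the vector's names.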
A benchmark of the two approaches:
library(microbenchmark)
microbenchmark(rbind(df, v2[names(df)]),
               rbind(df, as.data.frame(t(v2))), times = 10000)
Unit: microseconds
                            expr     min      lq  median      uq      max neval
        rbind(df, v2[names(df)]) 212.773 219.305 222.572 294.895 15300.96 10000
 rbind(df, as.data.frame(t(v2))) 374.219 382.618 387.750 516.067 39951.31 10000
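The timings above are per call, and the question builds the data frame one row at a time. One alternative, which is an assumption on my part and not something either answer proposes, is to collect the reordered vectors in a list and bind them once at the end. A minimal sketch, with rows, ordered, new_rows and result as illustrative names:

```r
df <- data.frame(id = 1:2, va = 11:12, vb = 21:22, vc = 31:32)

# Named vectors as they might arrive from processing, in arbitrary order.
rows <- list(
  c(id = 4, va = 14, vb = 25, vc = NA),
  c(va = 19, id = 9, vc = 34, vb = NA)
)

# Put every vector into column order, stack them into a matrix,
# then convert once and append in a single rbind() call.
ordered <- lapply(rows, function(v) v[names(df)])
new_rows <- as.data.frame(do.call(rbind, ordered))
result <- rbind(df, new_rows)
result
```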
Source: https://stackoverflow.com/questions/22581122/how-to-add-a-named-vector-as-a-row-to-a-data-frame-reordered-according-to-colum