Question
I have two R data frames I want to merge. In base R you can do:
cost <- data.frame(farm=c('farm A', 'office'), cost=c(10, 100))
trees <- data.frame(farm=c('farm A', 'farm B'), trees=c(20,30))
merge(cost, trees, all=TRUE)
which produces:
    farm cost trees
1 farm A   10    20
2 office  100    NA
3 farm B   NA    30
I am using dplyr, and would prefer a solution such as:
left_join(cost, trees)
which produces something close to what I want:
    farm cost trees
1 farm A   10    20
2 office  100    NA
In dplyr I can see left_join, inner_join, semi_join and anti_join, but none of these does what merge with all=TRUE does.
Also, is there a quick way to set the NAs to 0? My efforts so far using x$trees[is.na(x$trees)] <- 0 are laborious (I need a command per column) and don't always seem to work.
thanks
Answer 1:
The most recent version of dplyr (0.4.0) now has a full_join function, which I believe is what you want.
cost <- data.frame(farm=c('farm A', 'office'), cost=c(10, 100))
trees <- data.frame(farm=c('farm A', 'farm B'), trees=c(20,30))
merge(cost, trees, all=TRUE)
Returns
> merge(cost, trees, all=TRUE)
    farm cost trees
1 farm A   10    20
2 office  100    NA
3 farm B   NA    30
And
library(dplyr)
full_join(cost, trees)
Returns
> full_join(cost, trees)
Joining by: "farm"
    farm cost trees
1 farm A   10    20
2 office  100    NA
3 farm B   NA    30
Warning message:
joining factors with different levels, coercing to character vector
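The warning above appears because the two farm columns are factors with different levels (before R 4.0.0, data.frame() converted strings to factors by default). A minimal sketch, assuming dplyr 0.4.0 or later is installed, that builds the key columns as character vectors and names the join key explicitly; the variable name merged is only illustrative:

```r
library(dplyr)

# Keep the key column as character so the two tables share a type
# (stringsAsFactors = FALSE is already the default from R 4.0.0 on).
cost  <- data.frame(farm = c("farm A", "office"), cost = c(10, 100),
                    stringsAsFactors = FALSE)
trees <- data.frame(farm = c("farm A", "farm B"), trees = c(20, 30),
                    stringsAsFactors = FALSE)

# Naming the key with `by` avoids the "Joining by" message.
merged <- full_join(cost, trees, by = "farm")
print(merged)
```

Rows that appear in only one of the two tables are kept, with NA in the columns that come from the other table.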
Answer 2:
library(plyr)
> dat <- join(cost, trees, type = "full")
Joining by: farm
> dat
    farm cost trees
1 farm A   10    20
2 office  100    NA
3 farm B   NA    30
> dat[is.na(dat)] <- 0
> dat
    farm cost trees
1 farm A   10    20
2 office  100     0
3 farm B    0    30
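The dat[is.na(dat)] <- 0 line above touches every column, which may be unwanted if a key column can also contain NA. A possible base R variant, offered as a sketch rather than the answerer's method, that fills only the numeric columns in one step; the data frame is rebuilt by hand here so the example is self-contained, and num_cols is an illustrative name:

```r
# Hand-built copy of the joined result shown above.
dat <- data.frame(
  farm  = c("farm A", "office", "farm B"),
  cost  = c(10, 100, NA),
  trees = c(20, NA, 30),
  stringsAsFactors = FALSE
)

# Flag the numeric columns, then zero-fill NAs in those columns only.
num_cols <- vapply(dat, is.numeric, logical(1))
dat[num_cols][is.na(dat[num_cols])] <- 0
print(dat)
```

This replaces the one-command-per-column approach from the question with a single assignment, whatever the number of numeric columns.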
Source: https://stackoverflow.com/questions/21841146/is-there-an-r-dplyr-method-for-merge-with-all-true