Question
I can join two datasets whose key columns have different names using dplyr::left_join(..., by = c("name1" = "name2")).
I want to do the same join with the column names stored in character objects, i.e. left_join(..., by = c(nameOb1 = nameOb2)). Oddly, this works for by = c("name1" = nameOb2), but not for by = c(nameOb1 = "name2").
Why is this?
Replication of my issue below. Many thanks.
Generate data
library(dplyr)
orig <- tibble(name1 = c("a", "b", "c"),
               n = c(10, 20, 30))
tojoin <- tibble(name2 = c("a", "b", "c"),
                 pc = c(.4, .1, .2))
Works: using character strings for the by arguments
left_join(orig, tojoin, by = c("name1" = "name2"))
# A tibble: 3 x 3
name1 n pc
<chr> <dbl> <dbl>
1 a 10 0.4
2 b 20 0.1
3 c 30 0.2
Does not work: using object as the character string for the first by argument
firstname <- "name1"
left_join(orig, tojoin, by = c(firstname = "name2"))
# Error: `by` can't contain join column `firstname` which is missing from LHS
# Call `rlang::last_error()` to see a backtrace
Works: using object as the character string for the second by argument
secondname <- "name2"
left_join(orig, tojoin, by = c("name1" = secondname))
# A tibble: 3 x 3
name1 n pc
<chr> <dbl> <dbl>
1 a 10 0.4
2 b 20 0.1
3 c 30 0.2
Packages:
dplyr 0.8.0.1
Answer 1:
Hi, left_join() needs a named character vector in the by argument when the key columns have different names. In your second attempt:
firstname <- "name1"
left_join(orig, tojoin, by = c(firstname = "name2"))
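Here the name of the vector element is the literal string "firstname", not the value stored in the firstname object, because c() does not evaluate the left-hand side of =. dplyr therefore looks for a column called firstname in orig and fails. The right-hand side is an ordinary expression and is evaluated, which is why c("name1" = secondname) works.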
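The naming behaviour can be checked directly in base R, without dplyr. This is a minimal sketch; the object names literal and evaluated are made up for illustration and are not part of the original question.

```r
firstname  <- "name1"
secondname <- "name2"

# Left-hand side of `=` inside c() is used literally as the element name;
# the object `firstname` is never looked up.
literal <- c(firstname = "name2")
print(names(literal))

# Right-hand side is a normal expression, so `secondname` is evaluated.
evaluated <- c("name1" = secondname)
print(names(evaluated))
print(unname(evaluated))
```

Printing names(literal) shows that the join would be asked to match a column called firstname, which does not exist in orig.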
To solve this, first build the named character vector and then pass it to the by argument of the join function:
firstname <- "name1"
join_cols <- c("name2")
names(join_cols) <- firstname
dplyr::left_join(orig, tojoin, by = join_cols)
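The same named vector can probably be built in one step with base R's setNames(), which also covers the case where both column names are held in objects. The following is a sketch under the assumption that dplyr is installed and that the data match the question's orig and tojoin; the name joined is introduced here only for illustration.

```r
library(dplyr)

# Same example data as in the question
orig   <- tibble(name1 = c("a", "b", "c"), n = c(10, 20, 30))
tojoin <- tibble(name2 = c("a", "b", "c"), pc = c(.4, .1, .2))

firstname  <- "name1"  # key column in the left table
secondname <- "name2"  # key column in the right table

# setNames(object, nm) returns `object` with its names set to `nm`,
# so both sides are evaluated before the vector is built.
join_cols <- setNames(secondname, firstname)

joined <- left_join(orig, tojoin, by = join_cols)
print(joined)
```

Because setNames() takes both values as ordinary arguments, neither name is treated literally, which avoids the problem with c(firstname = "name2").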
Source: https://stackoverflow.com/questions/54823846/dplyr-left-join-does-not-work-with-a-character-objects-as-the-lhs-variable