I am trying to LEFT Join 2 data frames but I do not want join all the variables from the second data set:
As an example, I have dataset 1 (DF1):
Cl
Nothing elegant but this could be another satisfactory answer.
merge(x = DF1, y = DF2, by = "Client", all.x=TRUE)[,c("Client","LO","CON")]
This will be useful especially when you don't need the keys that were used to join the tables in your results.
You can do this by subsetting the data you pass into your merge:
merge(x = DF1, y = DF2[ , c("Client", "LO")], by = "Client", all.x=TRUE)
Or you can simply delete the column after your current merge :)
I think it's a little simpler to use the dplyr functions select and left_join ; at least it's easier for me to understand. The join function from dplyr are made to mimic sql arguments.
library(tidyverse)
DF2 <- DF2 %>%
select(client, LO)
joined_data <- left_join(DF1, DF2, by = "Client")
You don't actually need to use the "by" argument in this case because the columns have the same name.