I have two data frames that have some columns with the same names and others with different names. The data frames look something like this:
df1
ID hel
Here is a more tidyr-centric approach that does something similar to the currently accepted answer. The approach is simply to stack the data frames on top of each other with bind_rows (which matches column names), gather up all the non-ID columns with na.rm = TRUE, and then spread them back out. Compared to a summarise option, this should be robust to situations where the condition "if the value is NA in 'df1' it would have a value in 'df2' (and vice versa)" doesn't always hold.
library(tidyverse)
df1 <- structure(list(ID = 1:5, hello = c(NA, NA, 10L, 4L, NA), world = c(NA, NA, 8L, 17L, NA), hockey = c(7L, 2L, 8L, 5L, 3L), soccer = c(4L, 5L, 23L, 12L, 43L)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list(cols = list(ID = structure(list(), class = c("collector_integer", "collector")), hello = structure(list(), class = c("collector_integer", "collector")), world = structure(list(), class = c("collector_integer", "collector")), hockey = structure(list(), class = c("collector_integer", "collector")), soccer = structure(list(), class = c("collector_integer", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))
df2 <- structure(list(ID = 1:5, hello = c(2L, 5L, NA, NA, 9L), world = c(3L, 1L, NA, NA, 7L), football = c(43L, 24L, 2L, 5L, 12L), baseball = c(6L, 32L, 23L, 15L, 2L)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list(cols = list(ID = structure(list(), class = c("collector_integer", "collector")), hello = structure(list(), class = c("collector_integer", "collector")), world = structure(list(), class = c("collector_integer", "collector")), football = structure(list(), class = c("collector_integer", "collector")), baseball = structure(list(), class = c("collector_integer", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))
df1 %>%
  bind_rows(df2) %>%
  gather(variable, value, -ID, na.rm = TRUE) %>%
  spread(variable, value)
#> # A tibble: 5 x 7
#>      ID baseball football hello hockey soccer world
#>   <int>    <int>    <int> <int>  <int>  <int> <int>
#> 1     1        6       43     2      7      4     3
#> 2     2       32       24     5      2      5     1
#> 3     3       23        2    10      8     23     8
#> 4     4       15        5     4      5     12    17
#> 5     5        2       12     9      3     43     7
Created on 2018-07-13 by the reprex package (v0.2.0).
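For readers on tidyr 1.0.0 or later, where gather and spread are superseded, the same stack, lengthen, and widen idea can be sketched with pivot_longer and pivot_wider. This is not part of the original reprex. It assumes a tidyr version that provides these functions, rebuilds df1 and df2 with tibble() using the same values as above, and the name combined is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same values as df1 and df2 above, rebuilt with tibble() for brevity
df1 <- tibble(
  ID = 1:5,
  hello = c(NA, NA, 10L, 4L, NA),
  world = c(NA, NA, 8L, 17L, NA),
  hockey = c(7L, 2L, 8L, 5L, 3L),
  soccer = c(4L, 5L, 23L, 12L, 43L)
)
df2 <- tibble(
  ID = 1:5,
  hello = c(2L, 5L, NA, NA, 9L),
  world = c(3L, 1L, NA, NA, 7L),
  football = c(43L, 24L, 2L, 5L, 12L),
  baseball = c(6L, 32L, 23L, 15L, 2L)
)

combined <- df1 %>%
  # Stack the rows; columns missing from one data frame are filled with NA
  bind_rows(df2) %>%
  # Lengthen every non-ID column and drop the NA cells (like na.rm = TRUE in gather)
  pivot_longer(-ID, names_to = "variable", values_to = "value",
               values_drop_na = TRUE) %>%
  # Widen again: one row per ID, one column per variable
  pivot_wider(names_from = variable, values_from = value)

combined
```

Unlike spread, pivot_wider does not sort the new columns alphabetically, so the column order may differ from the output shown above. As with the gather/spread version, this relies on each ID having at most one non-NA value per column across the two data frames; if both hold a value, the cell is no longer uniquely identified and needs separate handling.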