Suffixes when merging more than two data frames with full_join

人盡茶涼, submitted 2021-02-11 14:51:12

Question


I would like to use nested full_join calls to merge several data frames together. In addition, I am hoping to add suffixes to all of the columns so that, when the data frames are merged, each column name indicates which data frame it came from (e.g., a unique time identifier like T1, T2, ...).

library(dplyr)

x <- data.frame(i = c("a","b","c"), j = 1:3, h = 1:3, stringsAsFactors=FALSE)
y <- data.frame(i = c("b","c","d"), k = 4:6, h = 1:3, stringsAsFactors=FALSE)
z <- data.frame(i = c("c","d","a"), l = 7:9, h = 1:3, stringsAsFactors=FALSE)

full_join(x, y, by='i') %>% left_join(., z, by='i')

Is there a way to integrate the default suffix option so that I get a dataset with column names that look like:

column_names <- c("i", "j_T1", "h_T1", "k_T2", "h_T2", "l_T3", "h_T3")

Answer 1:


I think this can be done by working with the column headers using purrr, but I've used pivot_longer and pivot_wider to change the header names:

library(dplyr)
library(tidyr)
library(stringr)

df <- x %>% 
  full_join(y, by = "i") %>% 
  full_join(z, by = "i") %>% 
  # makes the column headers into a column which can be changed
  pivot_longer(cols = -i,
               names_to = "columns",
               values_to = "values") %>% 
  # fixed() matches ".x" / ".y" literally instead of treating "." as a regex wildcard
  mutate(columns = str_replace(columns, fixed(".x"), "_T2"),
         columns = str_replace(columns, fixed(".y"), "_T3"),
         columns = case_when(!str_detect(columns, "T") ~ paste0(columns, "_T1"),
                             TRUE ~ columns)) %>% 
  pivot_wider(names_from = columns,
              values_from = values)

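These names don't match the headers listed in the question: a column that appears in only one data frame, such as l (there was only one in this example), gets no suffix from the joins and therefore falls through to _T1 rather than _T3. Hopefully this code will still help to get you started if the order is important.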
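
For comparison, here is a minimal sketch of a different approach from the one in the answer: rename the columns before joining, so each data frame already carries its own tag. It assumes dplyr 1.0.0 or later (for rename_with) and purrr. The helper add_suffix and the list names T1, T2, T3 are hypothetical, introduced only for illustration.

```r
library(dplyr)
library(purrr)

x <- data.frame(i = c("a","b","c"), j = 1:3, h = 1:3, stringsAsFactors = FALSE)
y <- data.frame(i = c("b","c","d"), k = 4:6, h = 1:3, stringsAsFactors = FALSE)
z <- data.frame(i = c("c","d","a"), l = 7:9, h = 1:3, stringsAsFactors = FALSE)

# Hypothetical helper: append "_<tag>" to every column except the join key.
add_suffix <- function(df, tag, key = "i") {
  rename_with(df, function(nm) paste0(nm, "_", tag), .cols = -all_of(key))
}

# The list names double as the suffixes (an assumption made for this sketch).
frames <- list(T1 = x, T2 = y, T3 = z)

merged <- frames %>%
  imap(add_suffix) %>%          # imap() passes each data frame and its list name
  reduce(full_join, by = "i")   # join the renamed frames pairwise on "i"

names(merged)
# "i" "j_T1" "h_T1" "k_T2" "h_T2" "l_T3" "h_T3"
```

Because no column names collide after renaming, full_join never has to fall back on its own .x / .y suffixes, and the same code works for any number of data frames in the list.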



Source: https://stackoverflow.com/questions/65152352/suffixes-when-merging-more-than-two-data-frames-with-full-join
