Combine two or more columns in a dataframe into a new column with a new name

谁说我不能喝 提交于 2019-11-26 07:29:43

问题


For example if I have this:

n = c(2, 3, 5) 
s = c(\"aa\", \"bb\", \"cc\") 
b = c(TRUE, FALSE, TRUE) 
df = data.frame(n, s, b)

  n  s     b
1 2 aa  TRUE
2 3 bb FALSE
3 5 cc  TRUE

Then how do I combine the two columns n and s into a new column named x such that it looks like this:

  n  s     b     x
1 2 aa  TRUE  2 aa
2 3 bb FALSE  3 bb
3 5 cc  TRUE  5 cc

回答1:


Use paste.

 df$x <- paste(df$n,df$s)
 df
#   n  s     b    x
# 1 2 aa  TRUE 2 aa
# 2 3 bb FALSE 3 bb
# 3 5 cc  TRUE 5 cc



回答2:


For inserting a separator:

df$x <- paste(df$n, "-", df$s)



回答3:


As already mentioned in comments by Uwe and UseR, a general solution in the tidyverse format would be to use the command unite:

library(tidyverse)

n = c(2, 3, 5) 
s = c("aa", "bb", "cc") 
b = c(TRUE, FALSE, TRUE) 

df = data.frame(n, s, b) %>% 
  unite(x, c(n, s), sep = " ", remove = FALSE)



回答4:


Some examples with NAs and their removal using apply

n = c(2, NA, NA) 
s = c("aa", "bb", NA) 
b = c(TRUE, FALSE, NA) 
c = c(2, 3, 5) 
d = c("aa", NA, "cc") 
e = c(TRUE, NA, TRUE) 
df = data.frame(n, s, b, c, d, e)

paste_noNA <- function(x,sep=", ") {
gsub(", " ,sep, toString(x[!is.na(x) & x!="" & x!="NA"] ) ) }

sep=" "
df$x <- apply( df[ , c(1:6) ] , 1 , paste_noNA , sep=sep)
df



回答5:


Using dplyr::mutate:

library(dplyr)
df <- mutate(df, x = paste(n, s)) 

df 
> df
  n  s     b    x
1 2 aa  TRUE 2 aa
2 3 bb FALSE 3 bb
3 5 cc  TRUE 5 cc



回答6:


Instead of

  • paste (non-tidy),
  • paste0 (defauld separator) or
  • unite (constrained to 2 columns and 1 separator),

I'd suggest a more flexible anternative: stringr::str_c

library("tidyverse")
df %>% mutate(x=str_c(n,"-",s,".",b))
#> # A tibble: 3 x 4
#>       n s     b     x         
#>   <dbl> <fct> <lgl> <chr>     
#> 1     2 aa    TRUE  2-aa.TRUE 
#> 2     3 bb    FALSE 3-bb.FALSE
#> 3     5 cc    TRUE  5-cc.TRUE 



回答7:


We can use paste0:

df$combField <- paste0(df$x, df$y)

If you do not want any padding space introduced in the concatenated field. This is more useful if you are planning to use the combined field as a unique id that represents combinations of two fields.



来源:https://stackoverflow.com/questions/18115550/combine-two-or-more-columns-in-a-dataframe-into-a-new-column-with-a-new-name

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!