Separate string into many columns

元气小坏坏 submitted on 2021-02-16 18:23:06

Question


I'd like to split a string into its individual letters or symbols and build a new data.frame with one column per character. I tried the separate function from the tidyr package, but the result is not what I expected.

library(dplyr)  # provides %>%
library(tidyr)  # provides separate()
df <- data.frame(x = c('house', 'mouse'), y = c('count', 'apple'), stringsAsFactors = FALSE)

Unexpected result

df[1, ] %>% separate(x, c('A1', 'A2', 'A3', 'A4', 'A5'), sep ='')
    A1   A2   A3   A4   A5     y
1 <NA> <NA> <NA> <NA> <NA> count

Expected output

A1  A2  A3  A4  A5
 h   o   u   s   e
 m   o   u   s   e

Solutions using stringr are welcome.


Answer 1:


We can use regex lookarounds in sep to match the zero-width position between two adjacent lowercase letters:

library(dplyr)
library(tidyr)
library(stringr)
df %>%
   select(x) %>% 
   separate(x, into = str_c("A", 1:5), sep= "(?<=[a-z])(?=[a-z])")
#  A1 A2 A3 A4 A5
#1  h  o  u  s  e
#2  m  o  u  s  e
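
The pattern above only splits between characters matched by [a-z]. When every string has the same known length, a positional split may be a simpler alternative: separate reads a numeric sep as character positions to split at. A minimal sketch, assuming all strings are exactly five characters long (res is a name introduced here for illustration):

```r
library(tidyr)

# Example data from the question
df <- data.frame(x = c("house", "mouse"), y = c("count", "apple"),
                 stringsAsFactors = FALSE)

# A numeric sep lists the positions to split after: 1, 2, 3 and 4
# give five one-character pieces (assumes 5-character strings)
res <- separate(df["x"], x, into = paste0("A", 1:5), sep = 1:4)
res
```

Because the split is by position rather than by pattern, this variant should not depend on which characters the strings contain, but it does need the length to be known in advance.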



Answer 2:


A solution in base R would be:

do.call(rbind, sapply(df$x, function(col) strsplit(col, "")))

 #       [,1] [,2] [,3] [,4] [,5]
 # house "h"  "o"  "u"  "s"  "e" 
 # mouse "m"  "o"  "u"  "s"  "e" 
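
The result above is a character matrix, while the question asks for a data.frame with columns A1 to A5. One way to get there in base R could look like the sketch below; it assumes all strings have the same length, and the names chars, m and out are introduced here for illustration:

```r
# Example data from the question
df <- data.frame(x = c("house", "mouse"), y = c("count", "apple"),
                 stringsAsFactors = FALSE)

# Split each string into single characters; strsplit() returns a list
chars <- strsplit(df$x, "")

# Stack the character vectors row-wise (assumes equal string lengths)
m <- do.call(rbind, chars)

# Convert to a data.frame and name the columns A1, A2, ...
out <- as.data.frame(m, stringsAsFactors = FALSE)
names(out) <- paste0("A", seq_len(ncol(out)))
out
#   A1 A2 A3 A4 A5
# 1  h  o  u  s  e
# 2  m  o  u  s  e
```

Since strsplit is already vectorised over df$x, the sapply wrapper is not needed here; it mainly serves to carry the original strings over as row names.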



Answer 3:


We can use cSplit from the splitstackshape package with stripWhite = FALSE and sep = "" to split every letter of a column into its own column.

splitstackshape::cSplit(df, "x", sep = "", stripWhite = FALSE)

#       y x_1 x_2 x_3 x_4 x_5
#1: count   h   o   u   s   e
#2: apple   m   o   u   s   e


Source: https://stackoverflow.com/questions/59166057/separate-string-into-many-columns
