Splitting rows with uneven string length into columns in R using tidyr [duplicate]

荒凉一梦 提交于 2019-12-01 18:43:22

You specify the target columns correctly:

library(tidyr)
separate(DF, V1, paste0("X",1:8), sep="-")

which gives:

      X1     X2     X3     X4     X5     X6     X7     X8
1 Place1 Place2 Place2 Place4 Place2 Place3 Place5   <NA>
2 Place7 Place7 Place7 Place7 Place7 Place7 Place7 Place7
3 Place1 Place1 Place1 Place1 Place3 Place5   <NA>   <NA>
4 Place1 Place4 Place2 Place3 Place3 Place5 Place5   <NA>
5 Place6 Place6   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>
6 Place1 Place2 Place3 Place4   <NA>   <NA>   <NA>   <NA>

If you don't know how many target columns you need beforehand, you can use:

> max(sapply(strsplit(as.character(DF$V1),'-'),length))
[1] 8

to extract the maximum number of parts (which is thus the number of columns you need).


Several other methods:

splitstackshape :

library(splitstackshape)
cSplit(DF, "V1", sep="-", direction = "wide")

stringi :

library(stringi)
as.data.frame(stri_list2matrix(stri_split_fixed(DF$V1, "-"), byrow = TRUE))

data.table :

library(data.table)
setDT(DF)[, paste0("v", 1:8) := tstrsplit(V1, "-")][, V1 := NULL][]

stringr :

library(stringr)
as.data.frame(str_split_fixed(DF$V1, "-",8))

which all give a similar result.


Used data:

DF <- data.frame(V1=c("Place1-Place2-Place2-Place4-Place2-Place3-Place5",
                      "Place7-Place7-Place7-Place7-Place7-Place7-Place7-Place7",
                      "Place1-Place1-Place1-Place1-Place3-Place5",
                      "Place1-Place4-Place2-Place3-Place3-Place5-Place5",
                      "Place6-Place6",
                      "Place1-Place2-Place3-Place4"))
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!