Question
I want to split each string in a vector into one column per character, but I can't get it to work.
library(tidyr)
library(dplyr)
df <- data.frame(x = c("abe", "bas", "dds", "eer"))
df %>% separate(x, c("A", "B", "C"), sep=1)
The output I want looks like this:
A B C
1 a b e
2 b a s
3 d d s
4 e e r
sep=1 works for two-character strings but not for three. I was hoping a regex such as sep="." or sep="[a-z]" would work too, but it doesn't.
This is probably super easy, but I'm new to R. Can someone please help?
Answer 1:
You were quite close with your own solution. Simply add a second split position to the sep argument.
So:
library(tidyr)
library(dplyr)
df <- data.frame(x = c("abe", "bas", "dds", "eer"))
df %>% separate(x, c("A", "B", "C"), sep = c(1,2))
A B C
1 a b e
2 b a s
3 d d s
4 e e r
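The same idea should extend to longer strings: a numeric sep needs one split position fewer than there are output columns. Here is a minimal sketch, assuming every string has the same length; df4, n and res are made-up names for illustration, not from the question.

```r
library(tidyr)

# Hypothetical example data: every string has the same length (4 here).
df4 <- data.frame(x = c("abcd", "wxyz"), stringsAsFactors = FALSE)

n <- nchar(df4$x[1])                      # characters per string (assumed constant)
res <- separate(df4, x,
                into = LETTERS[seq_len(n)],  # column names A, B, C, D
                sep  = seq_len(n - 1))       # split after positions 1, 2, 3
res
```

If the strings differ in length this will not line up cleanly, so treat it as a sketch for the fixed-width case only.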
Answer 2:
Method 1
Use a positive lookbehind with separate:
df %>%
  separate(x, c("A", "B", "C"), sep = "(?<=.)", extra = "drop")
# A B C
#1 a b e
#2 b a s
#3 d d s
#4 e e r
Note that this will only work if every string x consists of exactly three characters.
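To see why a plain sep="." fails while the lookbehind works, it may help to compare the two patterns in base R. This sketch uses strsplit as a stand-in; separate may treat edge cases differently, so take it as an illustration of the regex behaviour rather than of tidyr's internals. The names s, plain and lookbehind are just for this example.

```r
s <- "abe"

# A plain "." matches (and consumes) every character, so only the empty
# strings in front of each match are left.
plain <- strsplit(s, ".")[[1]]
print(plain)        # "" "" ""

# A lookbehind matches the zero-width position after a character, so
# nothing is consumed and each character survives as its own piece.
lookbehind <- strsplit(s, "(?<=.)", perl = TRUE)[[1]]
print(lookbehind)   # "a" "b" "e"
```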
Method 2
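Use strsplit:
df %>%
  mutate(tmp = strsplit(as.character(x), "")) %>%  # list-column of single characters
  unnest(tmp) %>%                                   # one row per character
  group_by(x) %>%                                   # assumes the strings in x are unique
  mutate(n = 1:n()) %>%                             # position of each character in its string
  spread(n, tmp) %>%                                # one column per position
  ungroup() %>%
  select(-x)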
## A tibble: 4 x 3
# `1` `2` `3`
# <chr> <chr> <chr>
#1 a b e
#2 b a s
#3 d d s
#4 e e r
This will also allow for strings x of varying lengths, padding columns with NAs if necessary.
Answer 3:
Although you asked for a non-base solution, here's a base R approach just for the record.
> x <- data.frame(do.call(rbind, strsplit(as.character(df$x), "")))
> names(x) <- LETTERS[1:3]
> x
A B C
1 a b e
2 b a s
3 d d s
4 e e r
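One caveat with the rbind approach: if the strings differ in length, rbind recycles the shorter pieces instead of padding them. A possible base R variant pads each piece with NA first. This is a sketch with made-up data; x, chars, width, padded and out are names chosen for the example.

```r
# Hypothetical input with strings of different lengths.
x <- c("abe", "ba", "d")

chars <- strsplit(x, "")            # list of single-character vectors
width <- max(lengths(chars))        # length of the longest string

# Extending a vector with `length<-` fills the new slots with NA.
padded <- lapply(chars, function(p) {
  length(p) <- width
  p
})

out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(out) <- LETTERS[seq_len(width)]
out
```

Shorter strings end up with NA in the trailing columns, similar to what the spread-based method above does.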
Source: https://stackoverflow.com/questions/49887440/how-do-i-separate-every-character-in-a-string-in-a-vector-into-a-column-using-ti