I have a data table with one string column. I'd like to create another column that holds part of each value in this column, using strsplit.
dat <- data.table(labels=c('a_1','b_2','c_3','d_4'))
The output I want is:
labels sub_label
a_1 a
b_2 b
c_3 c
d_4 d
I've tried the following, but neither attempt works.
dat %>%
  mutate(
    sub_labels = strsplit(as.character(labels), "_")[[1]][1]
  )
# gives a column whose values are all "a"
This one, which seems logical to me,
dat %>%
  mutate(
    sub_labels = sapply(strsplit(as.character(labels), "_"), function(x) x[[1]][1])
  )
gives an error:
Error: Don't know how to handle type pairlist
I saw another post where paste with collapse on the output of strsplit worked, so I don't understand why subsetting in an anonymous function causes problems. Thanks for any explanation.
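A note on the first attempt: it may help to look at what strsplit returns on its own, outside of mutate. The following is a minimal base-R sketch, not a statement about how dplyr handles the expression internally; the variable names `pieces`, `first_only` and `sub_label` are only for illustration.

```r
labels <- c("a_1", "b_2", "c_3", "d_4")

# strsplit() returns a list with one character vector per input string.
pieces <- strsplit(labels, "_")

# [[1]] selects only the first list element (the split of "a_1"),
# and [1] takes its first piece, so the result is a single string.
first_only <- pieces[[1]][1]

# Taking the first piece of every list element instead gives one value per row.
# vapply() also checks that each result is a single character string.
sub_label <- vapply(pieces, function(x) x[1], character(1))

print(first_only)
print(sub_label)
```

A single string like `first_only` would be recycled to every row of a new column, which is consistent with the column of "a" values described above.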
tidyr::separate can help here:
> dat %>% separate(labels, c("first", "second") )
   first second
1:     a      1
2:     b      2
3:     c      3
4:     d      4
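By default separate drops the original column, whereas the output asked for keeps it. A minimal sketch of one way to keep it, assuming tidyr is installed and using a plain data frame instead of a data.table; the column name `number` is made up for illustration:

```r
library(tidyr)

dat <- data.frame(labels = c("a_1", "b_2", "c_3", "d_4"),
                  stringsAsFactors = FALSE)

# remove = FALSE keeps the input column next to the new ones;
# sep = "_" splits on the underscore explicitly.
res <- separate(dat, labels, into = c("sub_label", "number"),
                sep = "_", remove = FALSE)

print(res)
```

The unwanted `number` column can then be dropped, for example with `res$number <- NULL`.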
Another method uses purrr's map_chr, which I've found useful for applications where I didn't want to bother with separating and uniting (e.g. using the results in a sprintf with other strings):
tibble(labels=c('a_1','b_2','c_3','d_4')) %>%
  mutate(sub_label = str_split(labels, "_") %>% map_chr(., 1))
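To illustrate the sprintf use case mentioned above, here is a small sketch that assumes the tibble, dplyr, stringr and purrr packages are installed. The `tag` column and the "group_%s" format string are hypothetical and only show the pattern.

```r
library(tibble)
library(dplyr)
library(stringr)
library(purrr)

res <- tibble(labels = c("a_1", "b_2", "c_3", "d_4")) %>%
  mutate(
    # map_chr(1) pulls the first piece out of each split result.
    sub_label = str_split(labels, "_") %>% map_chr(1),
    # The extracted piece can be used directly in the same mutate() call.
    tag = sprintf("group_%s", sub_label)
  )

print(res)
```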
Source: https://stackoverflow.com/questions/42565539/using-strsplit-and-subset-in-dplyr-and-mutate