I've got a column people$food that has entries like chocolate or apple-orange-strawberry.
I want to split people$food on - and keep only the first piece of each entry.
dplyr/magrittr approach:
library(magrittr)
library(dplyr)
word = c('apple-orange-strawberry', 'chocolate')
strsplit(word, "-") %>% sapply(extract2, 1)
# [1] "apple" "chocolate"
I would use sub() instead. Since you want the first "word" before the split, we can simply remove the first - and everything after it; what remains is the first word.
sub("-.*", "", people$food)
Here's an example:
x <- c("apple", "banana-raspberry-cherry", "orange-berry", "tomato-apple")
sub("-.*", "", x)
# [1] "apple" "banana" "orange" "tomato"
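Applied to a data frame column, the same sub() call might look like the sketch below. The people data frame here is made up for illustration (the question does not show its contents), and first_food is a hypothetical column name.

```r
# Hypothetical stand-in for the asker's `people` data frame
people <- data.frame(
  name = c("Ann", "Bob", "Cy"),
  food = c("chocolate", "apple-orange-strawberry", "orange-juice"),
  stringsAsFactors = FALSE
)

# Remove the first "-" and everything after it; entries without "-" are unchanged
people$first_food <- sub("-.*", "", people$food)
people$first_food
# [1] "chocolate" "apple"     "orange"
```

Because sub() is vectorised over its input, no explicit loop or apply call is needed.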
Otherwise, if you want to use strsplit(), you can round up the first elements with vapply():
vapply(strsplit(x, "-", fixed = TRUE), "[", "", 1)
# [1] "apple" "banana" "orange" "tomato"
I would suggest using head rather than [ in R.
word <- c('apple-orange-strawberry','chocolate')
sapply(strsplit(word, "-"), head, 1)
# [1] "apple" "chocolate"
For example:
word <- 'apple-orange-strawberry'
strsplit(word, "-")[[1]][1]
# [1] "apple"
or, equivalently:
unlist(strsplit(word, "-"))[1]
Essentially the idea is that strsplit() returns a list, whose elements have to be accessed either by indexing with [[ ]] (the former case) or by unlisting (the latter).
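To make the list structure concrete, here is a small sketch using a two-element input (the variable name pieces is only for illustration). It also shows why unlist() followed by [1] is only safe on a single string: with several inputs, everything is flattened into one vector.

```r
word <- c("apple-orange-strawberry", "chocolate")
pieces <- strsplit(word, "-", fixed = TRUE)

# One list element per input string
length(pieces)
# [1] 2

# All pieces of the first string
pieces[[1]]
# [1] "apple"      "orange"     "strawberry"

# First piece of the first string
pieces[[1]][1]
# [1] "apple"

# unlist() flattens across all inputs, so [1] would only ever
# return the first piece of the first string
unlist(pieces)
# [1] "apple"      "orange"     "strawberry" "chocolate"
```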
If you want to apply the method to an entire column:
first.word <- function(my.string) {
  unlist(strsplit(my.string, "-"))[1]
}
words <- c('apple-orange-strawberry', 'orange-juice')
sapply(words, first.word)
# apple-orange-strawberry            orange-juice
#                 "apple"                "orange"
If you need to extract the first (or nth) entry from each split, use:
word <- c('apple-orange-strawberry','chocolate')
sapply(strsplit(word,"-"), `[`, 1)
#[1] "apple" "chocolate"
Or faster and more explicitly:
vapply(strsplit(word,"-"), `[`, 1, FUN.VALUE=character(1))
#[1] "apple" "chocolate"
Both versions can select any position in the split list, and they return NA when that position is beyond the number of pieces:
vapply(strsplit(word,"-"), `[`, 2, FUN.VALUE=character(1))
#[1] "orange" NA
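If this comes up often, the vapply() call could be wrapped in a small function. The sketch below is one way to do it; nth_piece() and its sep argument are hypothetical names, not part of base R or any package, and fixed = TRUE is an added assumption that the separator is a literal string rather than a regular expression.

```r
# nth_piece() is a hypothetical helper, not an existing function
nth_piece <- function(x, n, sep = "-") {
  # fixed = TRUE treats `sep` as a literal string, not a regex
  vapply(strsplit(x, sep, fixed = TRUE), `[`, n, FUN.VALUE = character(1))
}

word <- c('apple-orange-strawberry', 'chocolate')
nth_piece(word, 1)
# [1] "apple"     "chocolate"
nth_piece(word, 3)
# [1] "strawberry" NA
```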