Question
I am using grepl() in R to check whether any of the following genres appears in my text. Right now I am doing it like this:
grepl("Action", my_text) |
grepl("Adventure", my_text) |
grepl("Animation", my_text) |
grepl("Biography", my_text) |
grepl("Comedy", my_text) |
grepl("Crime", my_text) |
grepl("Documentary", my_text) |
grepl("Drama", my_text) |
grepl("Family", my_text) |
grepl("Fantasy", my_text) |
grepl("Film-Noir", my_text) |
grepl("History", my_text) |
grepl("Horror", my_text) |
grepl("Music", my_text) |
grepl("Musical", my_text) |
grepl("Mystery", my_text) |
grepl("Romance", my_text) |
grepl("Sci-Fi", my_text) |
grepl("Sport", my_text) |
grepl("Thriller", my_text) |
grepl("War", my_text) |
grepl("Western", my_text)
Is there a better way to write this code? Can I put all the genres in an array and then somehow use grepl() on that?
Answer 1:
You could paste the genres together with an "or" (|) separator and pass the result to grepl() as a single regular expression.
x <- c("Action", "Adventure", "Animation", ...)
grepl(paste(x, collapse = "|"), my_text)
Here's an example.
x <- c("Action", "Adventure", "Animation")
my_text <- c("This one has Animation.", "This has none.", "Here is Adventure.")
grepl(paste(x, collapse = "|"), my_text)
# [1]  TRUE FALSE  TRUE
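For completeness, here is a sketch of the same approach using the full genre list from the question. The my_text values and the names genres, pattern and has_genre are made up for illustration; only the genre names come from the original post.

```r
# Genre names taken from the question.
genres <- c("Action", "Adventure", "Animation", "Biography", "Comedy",
            "Crime", "Documentary", "Drama", "Family", "Fantasy",
            "Film-Noir", "History", "Horror", "Music", "Musical",
            "Mystery", "Romance", "Sci-Fi", "Sport", "Thriller",
            "War", "Western")

# Collapse the vector into one alternation: "Action|Adventure|...|Western".
pattern <- paste(genres, collapse = "|")

# Hypothetical sample input, invented for this example.
my_text <- c("A Sci-Fi classic.",
             "Nothing relevant here.",
             "Part Drama, part Comedy.")

# One vectorised call; returns one logical value per element of my_text.
has_genre <- grepl(pattern, my_text)
print(has_genre)
# [1]  TRUE FALSE  TRUE
```

Keep in mind that this is a case-sensitive substring match, so "War" would also match a text containing "Warning".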
Answer 2:
You can cycle through a list or vector of genres, as below:
genres <- c("Action",...,"Western")
sapply(genres, function(x) grepl(x, my_text))
To answer your question, if you just want to know whether any element of the result is TRUE, you can use the any() function.
any(sapply(genres, function(x) grepl(x, my_text)))
Quite simply, if any element of its argument is TRUE, any() will return TRUE.
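To make the shape of the sapply() result concrete, here is a small sketch. The shortened genre vector, the sample texts and the names hits, per_text and matched are assumptions made for illustration. When my_text has more than one element, sapply() returns a logical matrix with one row per text and one column per genre, so any() collapses everything into a single value, while rowSums() gives one answer per text.

```r
# Shortened genre vector and sample input, both invented for this example.
genres <- c("Action", "Comedy", "Western")
my_text <- c("A Western with some Comedy.", "Nothing relevant here.")

# One grepl() call per genre. With two texts this simplifies to a
# 2 x 3 logical matrix whose columns are named after the genres.
hits <- sapply(genres, function(g) grepl(g, my_text))

# Single TRUE/FALSE: does any genre appear in any text?
any(hits)

# One TRUE/FALSE per text: does this text mention at least one genre?
per_text <- rowSums(hits) > 0

# Which genres were found at all?
matched <- colnames(hits)[colSums(hits) > 0]
```

Unlike the single-pattern approach, this keeps track of which genre matched, at the cost of one grepl() call per genre.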
Source: https://stackoverflow.com/questions/26319567/use-grepl-to-search-either-of-multiple-substrings-in-a-text