stringr | 易学教程

How to escape a backslash in R? [duplicate]

This question already has an answer here: How to escape backslashes in R string (3 answers)

I'm working in R and having trouble escaping the backslash. I am using the stringr library.

    install.packages("stringr", repos='http://cran.us.r-project.org')
    library("stringr")

I would like to do

    str = str_replace_all(str, "\", "")

So I tried

    str = str_replace_all(str, "\\", "")

but it won't work. What should I do?

Answer 1: I found a solution that works:

    str = gsub("([\\])","", str)

Answer 2: Use Hmisc::escapeRegex and Hmisc::escapeBS, which automatically escape backslashes and other regex special characters.

Source: https:/
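The trouble described in this question comes from two layers of escaping: the R parser consumes one level of backslashes in a string literal, and the regex engine consumes another. A minimal sketch in base R, assuming the goal is simply to delete every literal backslash (the sample value `x` is hypothetical and not from the question):

```r
# Hypothetical input: the five characters a\b\c
# (each "\\" in R source code is ONE backslash in the string).
x <- "a\\b\\c"

# Regex route: the regex engine must receive \\ (an escaped backslash),
# which takes four backslashes in R source code.
no_bs_regex <- gsub("\\\\", "", x)

# Literal route: with fixed = TRUE the pattern is a plain string, so a
# single backslash (written "\\" in source) is enough.
no_bs_fixed <- gsub("\\", "", x, fixed = TRUE)

print(no_bs_regex)
print(no_bs_fixed)
```

With stringr loaded, the analogous calls would be `str_replace_all(x, "\\\\", "")` or `str_replace_all(x, fixed("\\"), "")`.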

Remove all text between two brackets

Suppose I have some text like this, text<-c("[McCain]: We need tax policies that respect the wage earners and job creators. [Obama]: It's harder to save. It's harder to retire. [McCain]: The biggest problem with American healthcare system is that it costs too much. [Obama]: We will have a healthcare system, not a disease-care system. We have the chance to solve problems that we've been talking about... [Text on screen]: Senators McCain and Obama are talking about your healthcare and financial security. We need more than talk. [Obama]: ...year after year after year after year. [Announcer]: Call
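One possible reading of this question is that each bracketed speaker label should be dropped from the text. A minimal base R sketch under that assumption, using a shortened, hypothetical stand-in for the transcript (`x` and `stripped` are illustrative names):

```r
# Shortened, hypothetical stand-in for the transcript in the question.
x <- "[McCain]: We need tax policies. [Obama]: It's harder to save."

# \\[ and \\] match literal brackets; [^]]* matches any run of characters
# that are not a closing bracket, so each match stops at the nearest "]".
stripped <- gsub("\\[[^]]*\\]", "", x)

print(stripped)
```

The negated character class keeps each match inside one pair of brackets, which avoids the greedy `\\[.*\\]` pitfall of deleting everything from the first "[" to the last "]".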

Non-greedy string regular expression matching

Question: I'm pretty sure I'm missing something obvious here, but I cannot make R use non-greedy regular expressions:

    > library(stringr)
    > str_match('xxx aaaab yyy', "a.*?b")
         [,1]
    [1,] "aaaab"

Base functions behave the same way:

    > regexpr('a.*?b', 'xxx aaaab yyy')
    [1] 5
    attr(,"match.length")
    [1] 5
    attr(,"useBytes")
    [1] TRUE

I would expect the match to be just ab as per the 'greedy' comment in http://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html: By default repetition is greedy, so the
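The result in this question is consistent with how regex engines search: the match begins at the leftmost position where any match is possible, and a lazy quantifier only shortens where the match ends. A small base R sketch of that behavior (the second input `y` and all variable names are illustrative, not from the question):

```r
x <- "xxx aaaab yyy"

# Lazy quantifier: the match still starts at the first "a" (position 5),
# then expands only as far as needed to reach a "b".
lazy <- regmatches(x, regexpr("a.*?b", x, perl = TRUE))

# Excluding "a" from the middle forces the match to start at the last "a".
short <- regmatches(x, regexpr("a[^a]*b", x))

# Laziness matters when one start position has several possible end points.
y <- "aabab"
greedy_y <- regmatches(y, regexpr("a.*b", y))
lazy_y <- regmatches(y, regexpr("a.*?b", y, perl = TRUE))

print(c(lazy, short, greedy_y, lazy_y))
```

Here `lazy` is still the five-character match, while `short` is just "ab"; on the second input the greedy and lazy patterns differ only in where the match ends.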

Extracting a string between other two strings in R

I am trying to find a simple way to extract an unknown substring (could be anything) that appears between two known substrings. For example, I have a string:

    a<-" anything goes here, STR1 GET_ME STR2, anything goes here"

I need to extract the string GET_ME, which is between STR1 and STR2 (without the white spaces). I am trying str_extract(a, "STR1 (.+) STR2"), but I am getting the entire match:

    [1] "STR1 GET_ME STR2"

I can of course strip the known strings to isolate the substring I need, but I think there should be a cleaner way to do it by using a correct regular expression.

You may use str
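Two common ways to keep only the text between the delimiters are a capture group and lookarounds. The following is a base R sketch of both, offered as one possible approach rather than the answer given on the original page (`between_sub` and `between_look` are illustrative names):

```r
a <- " anything goes here, STR1 GET_ME STR2, anything goes here"

# Option 1: match the whole string and replace it with capture group 1.
between_sub <- sub(".*STR1 (.+) STR2.*", "\\1", a)

# Option 2: PCRE lookarounds assert the delimiters without consuming them,
# so the match itself is only the text in between.
between_look <- regmatches(a, regexpr("(?<=STR1 ).+?(?= STR2)", a, perl = TRUE))

print(between_sub)
print(between_look)
```

Note that `sub()` returns the input unchanged when the pattern does not match, whereas `regmatches()` returns an empty character vector, so the two options behave differently on strings without the delimiters.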

Extract last 4-digit number from a series in R using stringr

Question: I would like to flatten lists extracted from HTML tables. A minimal working example is presented below. The example depends on the stringr package in R. The first example exhibits the desired behavior.

    years <- c("2005-", "2003-")
    unlist(str_extract_all(years,"[[:digit:]]{4}"))
    [1] "2005" "2003"

The example below produces an undesirable result when I try to match the last 4-digit number in a series of other numbers.

    years1 <- c("2005-", "2003-", "1984-1992, 1996-")
    unlist(str_extract_all
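Assuming the desired result is one value per element, namely the last run of four digits, a greedy prefix can push the capture group to the end of each string. A minimal base R sketch (`last_year` is an illustrative name):

```r
years1 <- c("2005-", "2003-", "1984-1992, 1996-")

# The leading .* is greedy, so it consumes as much as possible and leaves
# the capture group with the last run of four digits in each string.
last_year <- sub(".*([[:digit:]]{4}).*", "\\1", years1)

print(last_year)
```

Because `sub()` returns non-matching inputs unchanged, elements without any 4-digit run would need a separate check, for example with `grepl()`.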

extract number after specific string

Question: I need to find the number after the string "Count of". There could be a space or a symbol between the "Count of" string and the number. I have something that works on www.regex101.com but does not work with the stringr str_extract function.

    library(stringr)
    shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
    str_extract(shopping_list, "count of ([\\d]+)")
    [1] NA NA NA NA "count of 5"
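The pattern in the question is case-sensitive and allows nothing but a single space between "count of" and the digits, which would explain why only one element matches. A base R sketch that relaxes both constraints and returns only the captured number (`m` and `counts` are illustrative names; this is one possible approach, not the answer from the original page):

```r
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2",
                   "monkey coconut 3oz count of 5", "monkey coconut count of 50",
                   "chicken Count Of-10")

# regexec() records capture groups; ignore.case handles "Count Of", and
# [^[:alnum:]]* allows any run of spaces or symbols before the digits.
m <- regmatches(
  shopping_list,
  regexec("count of[^[:alnum:]]*([[:digit:]]+)", shopping_list, ignore.case = TRUE)
)

# Element 1 of each result is the full match, element 2 the captured number;
# strings without a match yield an empty vector, mapped to NA here.
counts <- vapply(m, function(g) if (length(g) >= 2) g[2] else NA_character_,
                 character(1), USE.NAMES = FALSE)

print(counts)
```

The result is a character vector; wrapping it in `as.integer()` would convert the counts to numbers while keeping the NA entries.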

dplyr: inner_join with a partial string match

Question: I'd like to join two data frames if the seed column in data frame y is a partial match on the string column in x. This example should illustrate:

    # What I have
    x <- data.frame(idX=1:3, string=c("Motorcycle", "TractorTrailer", "Sailboat"))
    y <- data_frame(idY=letters[1:3], seed=c("ractor", "otorcy", "irplan"))
    x
      idX         string
    1   1     Motorcycle
    2   2 TractorTrailer
    3   3       Sailboat
    y
    Source: local data frame [3 x 2]
        idY   seed
      (chr)  (chr)
    1     a ractor
    2     b otorcy
    3     c irplan
    # What I want
    want <- data.frame(idX
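The desired output is cut off in this excerpt, so the following base R sketch assumes it is every pair of rows where `seed` occurs inside `string`. It uses plain data frames instead of dplyr, and `hits`, `idx`, and `joined` are illustrative names:

```r
x <- data.frame(idX = 1:3, string = c("Motorcycle", "TractorTrailer", "Sailboat"),
                stringsAsFactors = FALSE)
y <- data.frame(idY = letters[1:3], seed = c("ractor", "otorcy", "irplan"),
                stringsAsFactors = FALSE)

# hits[i, j] is TRUE when y$seed[j] occurs literally inside x$string[i].
hits <- sapply(y$seed, function(p) grepl(p, x$string, fixed = TRUE))

# Row/column indices of every TRUE cell, i.e. every matching (x, y) pair.
idx <- which(hits, arr.ind = TRUE)

# Bind the matching rows side by side and sort by idX for readability.
joined <- data.frame(x[idx[, "row"], ], y[idx[, "col"], ], row.names = NULL)
joined <- joined[order(joined$idX), ]

print(joined)
```

This compares every string with every seed, so the cost grows with the product of the two row counts; that is fine for small tables but may be slow for large ones.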
