stringr

How to escape a backslash in R? [duplicate]

丶灬走出姿态 submitted on 2019-11-26 22:32:37
This question already has an answer here: How to escape backslashes in R string (3 answers)

I'm working in R and having trouble escaping the backslash. I am using the library stringr.

    install.packages("stringr", repos='http://cran.us.r-project.org')
    library("stringr")

I would like to do

    str = str_replace_all(str, "\", "")

So I tried

    str = str_replace_all(str, "\\", "")

but it won't work. What should I do?

Answer 1: I found a solution that works

    str = gsub("([\\])","", str)

Answer 2: Use Hmisc::escapeRegex and Hmisc::escapeBS, which automatically escape backslashes and other regex special characters.

Source: https:/
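A possible explanation for why the two-backslash attempt fails, offered as a sketch rather than as the accepted answer: the R literal "\\" is a string holding one backslash, and stringr passes that single backslash to the regex engine, where it is an incomplete escape. Matching one literal backslash with a regex therefore takes four backslashes in the source, or the pattern can be wrapped in fixed() to skip regex interpretation. The variable x and its sample value are made up for illustration.

```r
library(stringr)

# Hypothetical sample input: the literal "a\\b\\c" is the 5-character string a\b\c
x <- "a\\b\\c"

# Regex route: "\\\\" in source is the string \\ , which the regex engine
# reads as one literal backslash
no_bs_regex <- str_replace_all(x, "\\\\", "")

# Fixed-string route: "\\" in source is one backslash, matched literally
no_bs_fixed <- str_replace_all(x, fixed("\\"), "")

# Base R equivalent of the fixed-string route
no_bs_base <- gsub("\\", "", x, fixed = TRUE)

print(no_bs_regex)
```

The sketch uses x instead of str as the variable name, because str is also the name of a base R function.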

Remove all text between two brackets

谁说胖子不能爱 submitted on 2019-11-26 20:47:06
Suppose I have some text like this, text<-c("[McCain]: We need tax policies that respect the wage earners and job creators. [Obama]: It's harder to save. It's harder to retire. [McCain]: The biggest problem with American healthcare system is that it costs too much. [Obama]: We will have a healthcare system, not a disease-care system. We have the chance to solve problems that we've been talking about... [Text on screen]: Senators McCain and Obama are talking about your healthcare and financial security. We need more than talk. [Obama]: ...year after year after year after year. [Announcer]: Call
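The excerpt is cut off before the desired output is stated, so the following is only a sketch of one reading of the title: delete every bracketed speaker tag. The sample string txt is a shortened, made-up stand-in for the transcript, and whether the trailing colon should also go is an assumption.

```r
library(stringr)

# Hypothetical shortened transcript
txt <- "[McCain]: We need tax policies. [Obama]: It's harder to save."

# \\[ and \\] match literal brackets; [^\\]]* matches any run of
# characters that are not a closing bracket
without_tags <- str_replace_all(txt, "\\[[^\\]]*\\]", "")

# Variant that also drops the colon and the whitespace after each tag
without_speakers <- str_replace_all(txt, "\\[[^\\]]*\\]:\\s*", "")

print(without_tags)
print(without_speakers)
```

Using a negated character class instead of .* keeps each match inside a single pair of brackets.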

Non-greedy string regular expression matching

被刻印的时光 ゝ submitted on 2019-11-26 20:14:32
Question: I'm pretty sure I'm missing something obvious here, but I cannot make R use non-greedy regular expressions:

    > library(stringr)
    > str_match('xxx aaaab yyy', "a.*?b")
         [,1]
    [1,] "aaaab"

Base functions behave the same way:

    > regexpr('a.*?b', 'xxx aaaab yyy')
    [1] 5
    attr(,"match.length")
    [1] 5
    attr(,"useBytes")
    [1] TRUE

I would expect the match to be just ab, as per the 'greedy' comment in http://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html: By default repetition is greedy, so the
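The result above is consistent with how regex engines search: they take the leftmost position where a match can start, and a lazy quantifier only shortens the match from that starting point. The sketch below illustrates this with the question's string; the second input s2 and the pattern a[^a]*?b are illustrative additions, not part of the original post.

```r
library(stringr)

s <- "xxx aaaab yyy"

# Lazy quantifier: the match still starts at the first "a" in the string
lazy <- str_extract(s, "a.*?b")

# Excluding "a" from the repeated part forces the match to start at the last "a"
shortest <- str_extract(s, "a[^a]*?b")

# Where laziness does make a difference: a second "b" later on (made-up input)
s2 <- "xxx aaaab yyy b"
greedy_match <- str_extract(s2, "a.*b")
lazy_match <- str_extract(s2, "a.*?b")

print(c(lazy, shortest, greedy_match, lazy_match))
```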

Extracting a string between other two strings in R

孤者浪人 submitted on 2019-11-26 17:58:41
I am trying to find a simple way to extract an unknown substring (could be anything) that appears between two known substrings. For example, I have a string:

    a<-" anything goes here, STR1 GET_ME STR2, anything goes here"

I need to extract the string GET_ME, which is between STR1 and STR2 (without the white spaces). I am trying str_extract(a, "STR1 (.+) STR2"), but I am getting the entire match

    [1] "STR1 GET_ME STR2"

I can of course strip the known strings to isolate the substring I need, but I think there should be a cleaner way to do it by using a correct regular expression.

You may use str
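The answer in the excerpt is cut off, so what follows is not a reconstruction of it, only a sketch of two common ways to return the inner text alone: read the capture group with str_match, or use lookarounds so that str_extract matches nothing but the middle part. The lazy .*? and the \\s* padding are choices made for this sketch.

```r
library(stringr)

a <- " anything goes here, STR1 GET_ME STR2, anything goes here"

# str_match returns a matrix: column 1 is the whole match,
# column 2 is the first capture group
via_group <- str_match(a, "STR1\\s*(.*?)\\s*STR2")[, 2]

# Lookbehind and lookahead assert the delimiters without consuming them
via_lookaround <- str_extract(a, "(?<=STR1 ).*?(?= STR2)")

print(via_group)
print(via_lookaround)
```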

Extract last 4-digit number from a series in R using stringr

为君一笑 submitted on 2019-11-26 16:55:46
Question: I would like to flatten lists extracted from HTML tables. A minimal working example is presented below. The example depends on the stringr package in R. The first example exhibits the desired behavior.

    years <- c("2005-", "2003-")
    unlist(str_extract_all(years,"[[:digit:]]{4}"))
    [1] "2005" "2003"

The example below produces an undesirable result when I try to match the last 4-digit number in a series of other numbers.

    years1 <- c("2005-", "2003-", "1984-1992, 1996-")
    unlist(str_extract_all
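Going by the title, the goal is one value per input element: the last 4-digit number. A minimal sketch of two ways to get that with the question's years1 vector follows; neither is taken from the original thread.

```r
library(stringr)

years1 <- c("2005-", "2003-", "1984-1992, 1996-")

# Option 1: collect every 4-digit run per element, then keep the last one.
# vapply() will stop with an error if an element has no match at all.
all_matches <- str_extract_all(years1, "[[:digit:]]{4}")
last_year <- vapply(all_matches, function(m) tail(m, 1), character(1))

# Option 2: a negative lookahead accepts a 4-digit run only when
# no other 4-digit run follows it in the same element
last_year_regex <- str_extract(years1, "[[:digit:]]{4}(?!.*[[:digit:]]{4})")

print(last_year)
print(last_year_regex)
```

Option 2 returns NA for elements without a match, which may be easier to handle than the error raised in option 1.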

extract number after specific string

我与影子孤独终老i submitted on 2019-11-26 14:47:38
Question: I need to find the number after the string "Count of". There could be a space or a symbol between the "Count of" string and the number. I have something that works on www.regex101.com but does not work with the stringr str_extract function.

    library(stringr)
    shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
    str_extract(shopping_list, "count of ([\\d]+)")
    [1] NA NA NA NA "count of 5"
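Two things differ between the attempt and the stated goal: str_extract returns the whole match rather than the capture group, and the pattern is case-sensitive and allows only a single space before the digits. A sketch that addresses both, assuming any run of non-word characters may sit between "count of" and the number:

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2",
                   "monkey coconut 3oz count of 5",
                   "monkey coconut count of 50",
                   "chicken Count Of-10")

# regex(..., ignore_case = TRUE) makes "Count Of" match as well;
# \\W* allows a space or symbol before the digits;
# column 2 of the str_match() result is the captured number
pattern <- regex("count of\\W*(\\d+)", ignore_case = TRUE)
counts <- str_match(shopping_list, pattern)[, 2]

print(counts)
```

The result is a character vector; as.integer(counts) converts it if numbers are needed.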

dplyr: inner_join with a partial string match

ぐ巨炮叔叔 submitted on 2019-11-26 13:44:43
Question: I'd like to join two data frames if the seed column in data frame y is a partial match on the string column in x. This example should illustrate:

    # What I have
    x <- data.frame(idX=1:3, string=c("Motorcycle", "TractorTrailer", "Sailboat"))
    y <- data_frame(idY=letters[1:3], seed=c("ractor", "otorcy", "irplan"))

    x
      idX         string
    1   1     Motorcycle
    2   2 TractorTrailer
    3   3       Sailboat

    y
    Source: local data frame [3 x 2]

        idY   seed
      (chr)  (chr)
    1     a ractor
    2     b otorcy
    3     c irplan

    # What I want
    want <- data.frame(idX
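The desired output is cut off in the excerpt, so the following is only a sketch of one way to get a partial-match join: build every x/y row combination with base merge() and keep the rows where seed occurs inside string. It swaps dplyr's inner_join for a cross join plus filter, and it uses plain data.frame() for both inputs; both are choices made for this sketch.

```r
library(stringr)

x <- data.frame(idX = 1:3,
                string = c("Motorcycle", "TractorTrailer", "Sailboat"),
                stringsAsFactors = FALSE)
y <- data.frame(idY = letters[1:3],
                seed = c("ractor", "otorcy", "irplan"),
                stringsAsFactors = FALSE)

# Cross join: by = NULL makes merge() return every x/y row combination
pairs <- merge(x, y, by = NULL)

# Keep the rows where seed occurs literally inside string
matched <- pairs[str_detect(pairs$string, fixed(pairs$seed)), ]
matched <- matched[order(matched$idX), ]

print(matched)
```

The cross join has nrow(x) * nrow(y) rows, so this approach suits small tables better than large ones.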

How to escape a backslash in R? [duplicate]

孤街浪徒 submitted on 2019-11-26 08:24:41
Question: This question already has an answer here: How to escape backslashes in R string (3 answers)

I'm working in R and having trouble escaping the backslash. I am using the library stringr.

    install.packages("stringr", repos='http://cran.us.r-project.org')
    library("stringr")

I would like to do

    str = str_replace_all(str, "\", "")

So I tried

    str = str_replace_all(str, "\\", "")

but it won't work. What should I do?

Answer 1: I found a solution that works

    str = gsub("([\\])","", str)

Answer 2:

Remove all text between two brackets

自闭症网瘾萝莉.ら submitted on 2019-11-26 07:45:30
Question: Suppose I have some text like this, text<-c("[McCain]: We need tax policies that respect the wage earners and job creators. [Obama]: It's harder to save. It's harder to retire. [McCain]: The biggest problem with American healthcare system is that it costs too much. [Obama]: We will have a healthcare system, not a disease-care system. We have the chance to solve problems that we've been talking about... [Text on screen]: Senators McCain and Obama are talking about your healthcare and

Extracting a string between other two strings in R

坚强是说给别人听的谎言 submitted on 2019-11-26 04:27:02
Question: I am trying to find a simple way to extract an unknown substring (could be anything) that appears between two known substrings. For example, I have a string:

    a<-" anything goes here, STR1 GET_ME STR2, anything goes here"

I need to extract the string GET_ME, which is between STR1 and STR2 (without the white spaces). I am trying str_extract(a, "STR1 (.+) STR2"), but I am getting the entire match

    [1] "STR1 GET_ME STR2"

I can of course strip the known strings, to isolate the substring I need