stringr

count the number of occurrences of “(” in a string

三世轮回 submitted on 2019-11-28 13:18:43
I am trying to get the number of open brackets in a character string in R. I am using the str_count function from the stringr package:

    s <- "(hi),(bye),(hi)"
    str_count(s, "(")
    Error in stri_count_regex(string, pattern, opts_regex = attr(pattern, :
      Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)

I am hoping to get 3 for this example.

( is a special character in regular expressions. You need to escape it:

    str_count(s, "\\(")
    # [1] 3

Alternatively, given that you're using stringr, you can use the coll function:

    str_count(s, coll("("))
    # [1] 3

If you want to do it in base R you can split into a
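A minimal sketch of the counting options for this question, assuming stringr is installed. The names n_escaped, n_fixed and n_base are illustrative only. The base R line uses gregexpr with fixed = TRUE, which is not necessarily the split-based approach the truncated answer goes on to describe.

```r
library(stringr)

s <- "(hi),(bye),(hi)"

# Escape the parenthesis so the regex engine treats it as a literal character.
n_escaped <- str_count(s, "\\(")

# fixed() skips regex interpretation entirely and compares raw characters.
n_fixed <- str_count(s, fixed("("))

# Base R: locate every literal "(" and count the matches per string.
n_base <- lengths(regmatches(s, gregexpr("(", s, fixed = TRUE)))

print(c(n_escaped, n_fixed, n_base))
# [1] 3 3 3
```

For plain ASCII punctuation like this, fixed() is usually the simplest choice. coll() additionally applies locale-aware collation rules, which mainly matter for accented or case-insensitive text.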

regex multiple pattern with singular replacement

北城以北 submitted on 2019-11-28 07:21:03
I am trying to replace both "st." and "ste." with "st". Seems like the following should work but it does not:

    require("stringr")
    county <- c("st. landry", "ste. geneveve", "st. louis")
    str_replace_all(county, c("st\\.", "ste\\."), "st")

You can use | to mean "or":

    > str_replace_all(county, "st\\.|ste\\.", "st")
    [1] "st landry"   "st geneveve" "st louis"

Or in base R:

    > gsub("st\\.|ste\\.", "st", county)
    [1] "st landry"   "st geneveve" "st louis"

    > A<-"this string, contains a handful of, useless: punctuation. Some are to escape. Aaargh! Some might be needed, but I want none!"
    > gsub(", |: |\\. |!",""
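The alternation in the answer can be written more compactly with an optional "e". This is a hedged sketch, assuming stringr is installed. The pattern "ste?\\." is a suggestion made here and does not appear in the original answers.

```r
library(stringr)

county <- c("st. landry", "ste. geneveve", "st. louis")

# "e?" makes the "e" optional, so one pattern covers both "st." and "ste.".
cleaned_stringr <- str_replace_all(county, "ste?\\.", "st")

# The same pattern works unchanged with base R's gsub().
cleaned_base <- gsub("ste?\\.", "st", county)

print(cleaned_stringr)
# [1] "st landry"   "st geneveve" "st louis"
```

Passing a vector of patterns, as in the question, does not mean "any of these": stringr pairs the patterns with the input strings element by element, so a single combined pattern is the safer way to express "or".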

Remove URLs from string

不羁的心 submitted on 2019-11-28 07:04:19
I have a vector of strings, myStrings, in R that look something like:

    [1] download file from `http://example.com`
    [2] this is the link to my website `another url`
    [3] go to `another url` from more info.

where another url is a valid http URL, but Stack Overflow will not let me insert more than one URL, which is why I'm writing another url instead. I want to remove all the URLs from myStrings so that it looks like:

    [1] download file from
    [2] this is the link to my website
    [3] go to from more info.

I've tried many functions in the stringr package but nothing works.

You can use gsub with a regular expression to
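One way such a gsub call might look is sketched below. The answer is cut off in this excerpt, so the pattern here is an assumption rather than the original answerer's regex, and the URLs in the sample vector are placeholders invented for illustration.

```r
# Sample input; the example.org and /info URLs are made-up placeholders.
myStrings <- c(
  "download file from http://example.com",
  "this is the link to my website https://example.org/page",
  "go to http://example.com/info for more info."
)

# Remove "http://" or "https://" followed by any run of non-whitespace.
no_urls <- gsub("https?://\\S+", "", myStrings, perl = TRUE)

# Collapse the leftover double spaces and trim the ends.
no_urls <- gsub("\\s+", " ", trimws(no_urls), perl = TRUE)

print(no_urls)
# [1] "download file from"             "this is the link to my website"
# [3] "go to for more info."
```

This simple pattern treats everything up to the next space as part of the URL, so punctuation that directly follows a URL (for example a sentence-ending period) is removed along with it.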

R count times word appears in element of list

天涯浪子 submitted on 2019-11-28 06:31:58
Question: I have a list comprised of words.

    > head(splitWords2)
    [[1]]
     [1] "Some" "additional" "information" "that" "we" "would" "need" "to" "replicate" "the"
    [11] "experiment" "is" "how" "much" "vinegar" "should" "be" "placed" "in" "each"
    [21] "identical" "container" "or" "what" "tool" "use" "measure" "mass" "of" "four"
    [31] "different" "samples" "and" "distilled" "water" "rinse" "after" "taking" "them" "out"

    [[2]]
    [1] "After" "reading" "the" "expirement" "I" "realized" "that" "additional" "information
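The excerpt ends before any answer, so the following is only a sketch of one way to count how often a word appears in each list element. The list split_words and the target word "the" are hypothetical stand-ins, not the asker's data.

```r
# Hypothetical stand-in for a list of word vectors like splitWords2.
split_words <- list(
  c("Some", "additional", "information", "that", "we", "would", "need", "to", "replicate", "the"),
  c("After", "reading", "the", "experiment", "the", "procedure")
)

# Count case-insensitive exact matches of the target word in each element.
target <- "the"
counts <- vapply(split_words, function(w) sum(tolower(w) == target), integer(1))

print(counts)
# [1] 1 2
```

Comparing whole words with == avoids the partial matches a regex such as "the" would produce inside words like "them".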

Extract last 4-digit number from a series in R using stringr

一世执手 submitted on 2019-11-27 14:51:19
I would like to flatten lists extracted from HTML tables. A minimal working example is presented below. The example depends on the stringr package in R. The first example exhibits the desired behavior.

    years <- c("2005-", "2003-")
    unlist(str_extract_all(years,"[[:digit:]]{4}"))
    [1] "2005" "2003"

The example below produces an undesirable result when I try to match the last 4-digit number in a series of other numbers.

    years1 <- c("2005-", "2003-", "1984-1992, 1996-")
    unlist(str_extract_all(years1,"[[:digit:]]{4}$"))
    character(0)

As I understand the documentation, I should include $ at the end of
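The `$` anchor fails here because every string ends in "-" rather than a digit. The sketch below shows two possible workarounds, assuming stringr is installed. Neither is taken from the truncated original, and the names last_lookahead and last_tail are illustrative.

```r
library(stringr)

years1 <- c("2005-", "2003-", "1984-1992, 1996-")

# Option 1: four digits followed only by non-digits up to the end of the string.
last_lookahead <- str_extract(years1, "\\d{4}(?=\\D*$)")

# Option 2: extract every 4-digit run, then keep the last one per element
# (NA when an element has no match at all).
last_tail <- vapply(
  str_extract_all(years1, "\\d{4}"),
  function(m) if (length(m) > 0) tail(m, 1) else NA_character_,
  character(1)
)

print(last_lookahead)
# [1] "2005" "2003" "1996"
```

Both options return one value per input element, so the result stays aligned with the original vector and needs no unlist().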

extract number after specific string

℡╲_俬逩灬. submitted on 2019-11-27 09:33:09
I need to find the number after the string "Count of". There could be a space or a symbol between the "Count of" string and the number. I have something that works on www.regex101.com but does not work with the stringr str_extract function.

    library(stringr)
    shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
    str_extract(shopping_list, "count of ([\\d]+)")
    [1] NA NA NA NA "count of 5" "count of 50" NA

What I want to get:

    [1] NA NA NA NA "5" "50" "10"

    str_extract(shopping_list, "(?i)(?<=count
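str_extract returns the whole match, which is why the question's first attempt keeps the "count of" prefix. An alternative to the lookbehind that the truncated answer begins is to capture the digits with str_match. This is a sketch assuming stringr is installed, and the pattern is an illustration rather than the original answer.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2",
                   "monkey coconut 3oz count of 5", "monkey coconut count of 50",
                   "chicken Count Of-10")

# "\\W?" allows at most one space or symbol between "count of" and the number;
# ignore_case = TRUE also accepts "Count Of".
pattern <- regex("count of\\W?(\\d+)", ignore_case = TRUE)

# str_match() returns a matrix: column 1 is the full match, column 2 the first group.
counts <- str_match(shopping_list, pattern)[, 2]

print(counts)
# [1] NA   NA   NA   NA   "5"  "50" "10"
```

The result is a character vector; wrap it in as.integer() if numeric values are needed.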

dplyr: inner_join with a partial string match

佐手、 submitted on 2019-11-27 07:51:48
I'd like to join two data frames if the seed column in data frame y is a partial match on the string column in x. This example should illustrate:

    # What I have
    x <- data.frame(idX=1:3, string=c("Motorcycle", "TractorTrailer", "Sailboat"))
    y <- data_frame(idY=letters[1:3], seed=c("ractor", "otorcy", "irplan"))
    x
      idX         string
    1   1     Motorcycle
    2   2 TractorTrailer
    3   3       Sailboat
    y
    Source: local data frame [3 x 2]

        idY   seed
      (chr)  (chr)
    1     a ractor
    2     b otorcy
    3     c irplan

    # What I want
    want <- data.frame(idX=c(1,2), idY=c("b", "a"), string=c("Motorcycle", "TractorTrailer"), seed=c("otorcy", "ractor"))
    want
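No answer survives in this excerpt, so the following is only one possible approach, written in base R rather than dplyr: build every x/y combination with a cross join, then keep the rows where seed occurs inside string. The names pairs, keep and matched are illustrative.

```r
x <- data.frame(idX = 1:3,
                string = c("Motorcycle", "TractorTrailer", "Sailboat"),
                stringsAsFactors = FALSE)
y <- data.frame(idY = letters[1:3],
                seed = c("ractor", "otorcy", "irplan"),
                stringsAsFactors = FALSE)

# by = NULL makes merge() return the Cartesian product of the two data frames.
pairs <- merge(x, y, by = NULL)

# Test each row: does the literal seed appear anywhere in the string?
keep <- mapply(grepl, pairs$seed, pairs$string,
               MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Keep the matching pairs and order them by idX for readability.
matched <- pairs[keep, ]
matched <- matched[order(matched$idX), ]

print(matched$string)
# [1] "Motorcycle"     "TractorTrailer"
```

The cross join grows as nrow(x) * nrow(y), so this sketch suits small tables; larger inputs would call for a dedicated fuzzy-join approach.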

count the number of occurrences of “(” in a string

别来无恙 submitted on 2019-11-27 07:36:56
Question: I am trying to get the number of open brackets in a character string in R. I am using the str_count function from the stringr package:

    s <- "(hi),(bye),(hi)"
    str_count(s, "(")
    Error in stri_count_regex(string, pattern, opts_regex = attr(pattern, :
      Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)

I am hoping to get 3 for this example.

Answer 1: ( is a special character in regular expressions. You need to escape it:

    str_count(s, "\\(")
    # [1] 3

Alternatively, given that you're using stringr, you

Remove URLs from string

久未见 submitted on 2019-11-27 01:46:38
Question: I have a vector of strings, myStrings, in R that look something like:

    [1] download file from `http://example.com`
    [2] this is the link to my website `another url`
    [3] go to `another url` from more info.

where another url is a valid http URL, but Stack Overflow will not let me insert more than one URL, which is why I'm writing another url instead. I want to remove all the URLs from myStrings so that it looks like:

    [1] download file from
    [2] this is the link to my website
    [3] go to from more info.

I've tried

regex multiple pattern with singular replacement

廉价感情. submitted on 2019-11-27 01:30:48
Question: I am trying to replace both "st." and "ste." with "st". Seems like the following should work but it does not:

    require("stringr")
    county <- c("st. landry", "ste. geneveve", "st. louis")
    str_replace_all(county, c("st\\.", "ste\\."), "st")

Answer 1: You can use | to mean "or":

    > str_replace_all(county, "st\\.|ste\\.", "st")
    [1] "st landry"   "st geneveve" "st louis"

Or in base R:

    > gsub("st\\.|ste\\.", "st", county)
    [1] "st landry"   "st geneveve" "st louis"

Answer 2:

    > A<-"this string, contains a handful of,