gsub | 易学教程

R gsub a single double quotation mark

Question: I have a field of strings in a data frame, all similar to: "Young Adult – 8-9"" where the inner single " is what I want to replace with nothing to get: "Young Adult - 8-9" How can I do this? I tried to escape with a double backslash: gsub("\\"", "", string) but got this error: Error: unexpected string constant in "gsub("\"", ""

Answer 1: You do not need to escape a double quote in a regular expression. Just use "\"" or '"' to match a single double quote. s = "Young Adult – 8-9\"" s [1] "Young Adult
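A minimal sketch of the idea in the answer, assuming the stray quote is the only double quote in the string. The sample value uses a plain hyphen for simplicity, and `cleaned` is an illustrative name that does not appear in the original post.

```r
# The string ends with a literal double quote, written as \" inside a double-quoted R string.
s <- "Young Adult - 8-9\""

# A double quote is not a regex metacharacter, so no regex escaping is needed.
# Single quotes around the pattern avoid the R-level escape; fixed = TRUE treats it literally.
cleaned <- gsub('"', "", s, fixed = TRUE)

print(cleaned)
```

The error in the question comes from R's string parser, not from the regex engine: "\\"" is a complete string (a backslash) followed by an unmatched quote.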

How to escape closed bracket “]” in regex in R

Question: I'm trying to use gsub in R to replace a bunch of weird characters in some strings I'm processing. Everything works, except that whenever I throw in "]" the whole thing does nothing. I'm escaping with \\, as in gsub("[\\?\\*\\]]", "", name), but it's still not working. Here's my actual example: name <- "R U Still Down? [Remember Me]" What I want is for names to be "R U Still Down Remember Me". When I do: names <- gsub("[\$\$\\*\\$\\+\\?'\\[]", "", name) it semi-works and I get "R U Still Down Remember
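One common way to handle this, sketched here under the assumption that only ?, *, [ and ] need removing: R's regex documentation says a literal ] can be included in a bracket expression by placing it first. With perl = TRUE, a backslash escape inside the class also works. The names `cleaned_posix` and `cleaned_perl` are illustrative.

```r
name <- "R U Still Down? [Remember Me]"

# Default (POSIX-style) engine: put "]" first in the class so it is read literally.
cleaned_posix <- gsub("[][?*]", "", name)

# PCRE engine: backslash escapes are honoured inside the character class.
cleaned_perl <- gsub("[\\]\\[?*]", "", name, perl = TRUE)

print(cleaned_posix)
print(cleaned_perl)
```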

Remove everything before the last space

Question: I have the following string. I tried to remove everything before the last space, but I can't seem to achieve it. I tried to follow this post: Use gsub remove all string before first white space in R. str <- c("Veni vidi vici") gsub("\\s*","\\1",str) "Venividivici" What I want is to have only the "vici" string left after removing everything before the last space.

Answer 1: Your gsub("\\s*","\\1",str) code replaces each occurrence of 0 or more whitespaces with a reference to the capturing group #1 value (which is an empty string since you have not specified any capturing group in the pattern). You want to
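The answer above is cut off, so the following is only one possible sketch, not necessarily what it went on to recommend. It relies on a greedy `.*` consuming as much as it can before the final whitespace character; `last_word` is an illustrative name.

```r
str <- c("Veni vidi vici")

# ".*" is greedy, so ".*\\s" matches everything up to and including the LAST whitespace.
# sub() is enough because only one replacement per string is needed.
last_word <- sub(".*\\s", "", str)

print(last_word)
```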

How to extract substring between patterns “_” and “.” in R [duplicate]

This question already has an answer here: Extract a string between patterns/delimiters in R (4 answers)

Question: I have many filenames which look like: txt= "MA0051_IRF2.xml" I want to extract IRF2, which is between "_" and ".". How do I do this in R?

Answer: To achieve this, you need a regexp that
- matches an (optional) arbitrary string in front of the _ : .*
- matches a literal _ : [_]
- matches everything up to (but not including) the next . and stores it in capturing group no. 1 : ([^.]+)
- matches a literal . : [.]
- matches an (optional) arbitrary string after the . : .*
In your call to gsub, you then use the
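The pieces listed above can be assembled as in this sketch. The excerpt is truncated before it shows the replacement argument, so using "\\1" (a back-reference to capturing group 1) is an assumption here, as is the name `gene`.

```r
txt <- "MA0051_IRF2.xml"

# .*        optional arbitrary prefix
# [_]       a literal underscore
# ([^.]+)   everything up to the next dot, captured as group 1
# [.]       a literal dot
# .*        optional arbitrary suffix
gene <- gsub(".*[_]([^.]+)[.].*", "\\1", txt)

print(gene)
```

Because the pattern spans the whole string, replacing the match with group 1 leaves only the captured part.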

Split string by final space in R

Question: I have a vector of strings, each containing a number of spaces. I would like to split this into two vectors at the final space. For example: vec <- c('This is one', 'And another', 'And one more again') should become vec1 = c('This is', 'And', 'And one more') vec2 = c('one', 'another', 'again') Is there a quick and easy way to do this? I have done similar things before using gsub and regex, and have managed to get the second vector using the following: vec2 <- gsub(".* ", "", vec) But I can't work out how to get vec1. Thanks in advance.

Answer: Here is one way using a lookahead assertion: do.call(rbind,
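As a complement to the truncated lookahead answer, here is a minimal sketch that uses two plain `sub()` calls instead. This is a different technique from the lookahead one, and it assumes every element contains at least one space.

```r
vec <- c('This is one', 'And another', 'And one more again')

# Drop the final space and the word after it: keeps everything before the last space.
vec1 <- sub(" [^ ]*$", "", vec)

# Greedy ".* " consumes everything up to the last space: keeps the final word.
vec2 <- sub(".* ", "", vec)

print(vec1)
print(vec2)
```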

How can I remove non-numeric characters from strings using gsub in R?

Question: I use the gsub function in R to remove unwanted characters in numbers. So I should remove from the strings every character that is not a number, . , and - . My problem is that the regular expression is not removing some non-numeric characters like d , + , and < . Below are my regular expression, the gsub execution, and its output. How can I change the regular expression in order to achieve the desired output? Current output: gsub(pattern = '[^(-?(\\d*\\.)?\\d+)]', replacement = '', x = c('1.2
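A bracket expression is a set of single characters, not a sub-pattern, so grouping and quantifiers written inside [ ] are treated as ordinary members of the set. A minimal sketch of a simpler negated class follows. The input vector `x` is hypothetical, since the original data is cut off in the excerpt.

```r
# Hypothetical inputs containing the kinds of stray characters the question mentions.
x <- c("1.2<", "-3.5d", "+7")

# Keep only digits, dots, and hyphens; the hyphen goes last so it is read literally.
cleaned <- gsub("[^0-9.-]", "", x)

print(cleaned)
```

Note that this keeps every dot and hyphen wherever it appears; it does not check that the result is a well-formed number.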

Replacing the specific values in columns of data frame using gsub in R

Question: I have a data.frame as follows:

> df
ID     Value
A_001  DEL-1:7:35-8_1
A_002  INS-4l:5_74:d
B_023  0
C_891  2
D_787  8
E_865  DEL-3:65:1s:b

I would like to replace all the values in the column Value that start with DEL or INS with nothing. That is, I would like to get the following output:

> df
ID     Value
A_001
A_002
B_023  0
C_891  2
D_787  8
E_865

I tried to achieve this using gsub in R with the following code, but it didn't work: gsub(pattern="(^([DEL|INS]*)",replacement="",df) Could anyone guide me how to achieve the
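A minimal sketch of one way to do this, rebuilding the data frame from the values shown in the question. It differs from the attempted call in two ways: alternation uses parentheses, (DEL|INS), rather than a bracket expression, and the function is applied to the column rather than to the whole data frame.

```r
df <- data.frame(
  ID    = c("A_001", "A_002", "B_023", "C_891", "D_787", "E_865"),
  Value = c("DEL-1:7:35-8_1", "INS-4l:5_74:d", "0", "2", "8", "DEL-3:65:1s:b"),
  stringsAsFactors = FALSE
)

# "^(DEL|INS)" anchors the alternation at the start; ".*" consumes the rest of the value.
df$Value <- sub("^(DEL|INS).*", "", df$Value)

print(df)
```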

check capital words in text and extract it

Question: I want to extract all capital words from the text. Let's say my data is like: Text<-c('I am JAY','I AM NOT HAPPY','YOU ARE IRRITATING','so Funny','hEY') So the output should be like:
> output
[1] "I JAY" "I AM NOT HAPPY" "YOU ARE IRRITATING" "" ""
Please help me with this.

Answer: Another option is
library(stringr)
sapply(str_extract_all(Text, '\\b[A-Z]+\\b'), paste, collapse=' ')
# [1] "I JAY" "I AM NOT HAPPY" "YOU ARE IRRITATING"
#[4] "" ""
Or
gsub("[a-z][A-Za-z]+|[A-Za-z][a-z]+", '', Text)
#[1] "I JAY" "I AM NOT HAPPY" "YOU ARE IRRITATING"
#[4] " " ""
data
Text<-c('I am JAY','I AM NOT HAPPY','YOU ARE
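For readers without stringr, the same extract-then-paste idea can be sketched in base R with `gregexpr()` and `regmatches()`. This is an illustrative variant rather than part of the original answer; `caps` and `output` are assumed names, and perl = TRUE is used so that `\\b` follows PCRE semantics.

```r
Text <- c('I am JAY', 'I AM NOT HAPPY', 'YOU ARE IRRITATING', 'so Funny', 'hEY')

# Find every run of capital letters that forms a whole word.
caps <- regmatches(Text, gregexpr("\\b[A-Z]+\\b", Text, perl = TRUE))

# Join the matches per element; elements with no match become "".
output <- sapply(caps, paste, collapse = " ")

print(output)
```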

Remove extra white space from between letters in R using gsub()

Question: There are a slew of answers out there on how to remove extra whitespace from between words, which is super simple. However, I'm finding that removing extra whitespace within words is much harder. As a reproducible example, let's say I have a vector of data that looks like this: x <- c("L L C", "P O BOX 123456", "NEW YORK") What I'd like to do is something like this: y <- gsub("(\\w)(\\s)(\\w)(\\s)", "\\1\\3", x) But that leaves me with this: [1] "LLC" "POBOX 123456" "NEW YORK" Almost perfect,
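The question is cut off before it states the exact goal, so this sketch assumes the aim is to join runs of single-character words while leaving spaces around longer words alone. It uses PCRE lookarounds, which match only a space that sits between two one-character words.

```r
x <- c("L L C", "P O BOX 123456", "NEW YORK")

# (?<=\\b\\w)  the space is preceded by a one-character word
# (?=\\w\\b)   and followed by a one-character word
# Lookarounds consume nothing, so consecutive single letters can all be joined.
y <- gsub("(?<=\\b\\w)\\s(?=\\w\\b)", "", x, perl = TRUE)

print(y)
```

Unlike the capture-group attempt, this does not consume the neighbouring letters, so the space before "BOX" is kept.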