gsub | 易学教程

Ruby regex matching a line in an inputted text file string [duplicate]

Question: I need to match each line of a string read from a text file and wrap the captured line in a character. For example, imagine a text file such as:

test
foo
test
bar

I would like to use gsub to output:

XtestX
XfooX
XtestX
XbarX

I'm having trouble matching a line, though. I've tried a regex starting with ^ and ending with $, but it doesn't seem to work. Any ideas? I have a

R: gsub with fixed=T or F and special cases

Question: Building on two questions I asked previously ("R: How to prevent memory overflow when using mgsub in vector mode?" and "gsub speed vs pattern length"): I like @Tyler's suggestion to use fixed=TRUE, as it speeds up calculations significantly. However, it is not always applicable. I need to substitute, say, caps as a stand-alone word, with or without punctuation around it. It is not known a priori what can follow or precede the word, but it must be one of the regular punctuation signs (, . ! - + etc.). It cannot be a number or a letter. In the example below, capsule must stay as is. i = "Here is the
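One way to leave "capsule" untouched while still replacing a stand-alone "caps" is a word-boundary pattern instead of fixed=TRUE. The sketch below is only an illustration: the sample string and the replacement text "CAPS" are made up, because the question's own example is cut off, and perl = TRUE is an assumption chosen so that \b has PCRE word-boundary semantics.

```r
# Hypothetical input; the original example string is truncated in the excerpt.
s <- "caps, capsule and (caps)! Also caps1."

# \b matches between a word character (letter, digit, underscore) and a
# non-word character, so "capsule" and "caps1" are left alone.
out <- gsub("\\bcaps\\b", "CAPS", s, perl = TRUE)
print(out)
```

This trades the speed of fixed=TRUE for correctness on word boundaries; whether that cost is acceptable depends on the data size mentioned in the question.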

regex to remove words if it contains a letter/special character multiple times simultaneously in R

Question: I want to remove every word in which a letter or special character occurs more than twice in a row. For example, the input is "Google in theee lland of whhhat c#, c++ and e###" and the output should be "Google in lland of c#, c++ and".

Answer 1:

    x <- "Google in theee lland of whhhat c#, c++ and e###"
    gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)
    # [1] "Google in lland of c#, c++ and "

(\\S)\\1\\1 finds sequences of three consecutive repetitions of a single non-space character. The
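The answer's result keeps a trailing space, while the desired output in the question does not. A small sketch of one way to close that gap, assuming base R's trimws() is acceptable (it is not part of the quoted answer):

```r
x <- "Google in theee lland of whhhat c#, c++ and e###"

# (\\S)\\1\\1 matches one non-space character repeated three times in a row;
# the surrounding \\S* extend the match to the whole word, and \\s? consumes
# one following whitespace character if there is one.
res <- gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)

# Removing the last word leaves the space before it behind; trimws() drops it.
res_trimmed <- trimws(res)
print(res_trimmed)
```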

Delete last two characters in string if they match criteria

Question: I have 2 million names in a database. For example:

    df <- data.frame(names=c("A ADAM", "S BEAN", "A APPLE A", "A SCHWARZENEGGER"))
    > df
                 names
    1           A ADAM
    2           S BEAN
    3        A APPLE A
    4 A SCHWARZENEGGER

I want to delete ' A' (a space followed by A) if these are the last two characters of the string. I know that regex is our friend here. How do I efficiently apply a regex function to the last two characters of the string? Desired output:

    > output
                 names
    1           A ADAM
    2           S BEAN
    3          A APPLE
    4 A SCHWARZENEGGER

Answer 1: If you
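The quoted answer is cut off, so the following is only one plausible approach rather than the answer's own: anchor the pattern to the end of the string with $ and let sub() work on the whole column at once.

```r
df <- data.frame(names = c("A ADAM", "S BEAN", "A APPLE A", "A SCHWARZENEGGER"),
                 stringsAsFactors = FALSE)

# " A$" matches a space followed by "A" only at the very end of the string;
# sub() is vectorised, so the whole column is handled in one call.
df$names <- sub(" A$", "", df$names)
print(df)
```

Because the pattern is anchored, names such as "A ADAM" that merely begin with "A " are not affected.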

"invalid regular expression…reason 'Trailing backslash'" error with gsub in R

Question: I am getting an error message while replacing text in R.

    x
    [1] "Easy bruising and bleeding.\\"
    gsub(as.character(x), "\\", "")
    Error in gsub(as.character(x), "\\", "") : invalid regular expression 'Easy bruising and bleeding.\', reason 'Trailing backslash'

Answer 1: The arguments are in the wrong order. Study help("gsub").

    gsub("\\", "", "Easy bruising and bleeding.\\", fixed=TRUE)
    #[1] "Easy bruising and bleeding."

Answer 2: tl;dr: You need 4 \ s (i.e. \\\\ ) in the first argument of gsub in order to
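The two answers point at two ways to match a literal backslash. A short side-by-side sketch, using the string from the question (the variable names a and b are just for illustration):

```r
x <- "Easy bruising and bleeding.\\"   # the string ends in ONE backslash

# Option 1: fixed = TRUE treats the pattern as a literal string, so "\\"
# (one backslash after R's string escaping) needs no further escaping.
a <- gsub("\\", "", x, fixed = TRUE)

# Option 2: as a regular expression, a literal backslash must itself be
# escaped, which takes four backslashes in R source code.
b <- gsub("\\\\", "", x)

print(a)
print(b)
```

Both calls pass the pattern first, the replacement second, and the input string third, which is the argument order the original call got wrong.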

Escape spaces in a linux pathname with Ruby gsub

Question: I am trying to escape the spaces in a Linux path. However, whenever I try to escape my backslash I end up with a double backslash. Example path: /mnt/drive/site/usa/1201 East/1201 East Invoice.pdf To use this in Linux I want to escape it as: /mnt/drive/site/usa/1201\ East/1201\ East\ Invoice.pdf So I'm trying this: backup_item.gsub("\s", "\\\s") But I get the unexpected output /mnt/drive/site/usa/1201\\ East/1201\\ East\\ Invoice.pdf

Answer 1: Stefan is right; I just want to point out

change string in DF using hive command and mutate with sparklyr

Question: Using the Hive command regexp_extract, I am trying to change strings from 201703170455 to 2017-03-17:04:55 and from 2017031704555675 to 2017-03-17:04:55.0010. I am doing this in sparklyr, trying to reuse code that works with gsub in R: newdf<-df%>%mutate(Time1 = regexp_extract(Time, "(....)(..)(..)(..)(..)", "\\1-\\2-\\3:\\4:\\5")) and this code: newdf<-df%>mutate(TimeTrans = regexp_extract("(....)(..)(..)(..)(..)(....)", "\\1-\\2-\\3:\\4:\\5.\\6")) but neither works at all. Any suggestions on how to do this using regexp_extract? Apache Spark uses Java regular expression
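For reference, this is the base R gsub call that the question says works, written out as a runnable sketch. Hive's regexp_extract takes a group index as its third argument and returns a single group, so a replacement template does not fit there; the regexp_replace line in the trailing comment is an untested assumption about the Hive/Spark side, based on Java's $1-style group references.

```r
time <- "201703170455"

# Five capture groups: year (4 characters), then month, day, hour and
# minute (2 characters each). In base R the groups are referenced as
# \\1 ... \\5 in the replacement string.
formatted <- gsub("(....)(..)(..)(..)(..)", "\\1-\\2-\\3:\\4:\\5", time)
print(formatted)

# Assumption, not run here: in Hive/Spark SQL the replacing function is
# regexp_replace(), and Java regex replacements refer to groups as $1, $2, ...
# e.g. regexp_replace(Time, "(....)(..)(..)(..)(..)", "$1-$2-$3:$4:$5")
```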

removing everything after first 'backslash' in a string

Question: I have a vector like the one below:

    vec <- c("abc\edw\www", "nmn\ggg", "rer\qqq\fdf"......)

I want to remove everything from the first backslash onward, like this:

    newvec <- c("abc","nmn","rer")

Thank you. My original vector is as below (only the head):

    [1] "peoria ave\nste \npeoria"
    [2] "wood dr\nphoenix" "central ave\nphoenix"
    [4] "southern ave\nphoenix"
    [5] "happy valley rd\nste \nglendaleaz " "the americana at brand\n americana way\nglendale"

The problem here is that my original CSV file does not contain backslashes, but when I read it, backslashes appear. The original CSV file is as below [1]
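The printed vector suggests the strings hold newline characters, which R displays as \n, rather than literal backslashes. That reading is an assumption, since the excerpt is cut off before the CSV is shown. Under it, a sketch that keeps only the text before the first newline, followed by the literal-backslash case for comparison:

```r
x <- c("peoria ave\nste \npeoria", "wood dr\nphoenix")

# "\n" here is a single newline character, not a backslash followed by "n".
# (?s) lets "." match newlines too, so everything from the first newline
# onward is removed.
first_line <- sub("(?s)\n.*", "", x, perl = TRUE)
print(first_line)

# If the strings really did contain backslashes, the regex would need an
# escaped backslash instead (written as four backslashes in R source).
vec <- c("abc\\edw\\www", "nmn\\ggg", "rer\\qqq\\fdf")
newvec <- sub("\\\\.*", "", vec)
print(newvec)
```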

Remove all characters after the 3rd occurrence of “-” in each element of a vector

Question: I am not that good with regular expressions in R. I would like to remove all characters after the 3rd occurrence of "-" in each element of a vector.

    Initial string
    aa-bbb-cccc => aa-bbb
    aa-vvv-vv => aa-vvv
    aa-ddd => aa-ddd

Any help?

Answer 1: Judging by the sample input and expected output, I assume you need to remove everything beginning with the 2nd hyphen. You may use sub("^([^-]*-[^-]*).*", "\\1", x) Details: ^ - start of string ([^-]*-[^-]*) - Group 1 capturing 0+ chars other
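Applied to the three sample strings, the answer's pattern behaves as sketched below. The second half is a hypothetical variant, not taken from the answer, that cuts at the 3rd hyphen as the title literally asks; the vector y and the use of perl = TRUE for the non-capturing group are assumptions made for the illustration.

```r
x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")

# Keep everything up to (not including) the 2nd hyphen, as in the answer.
up_to_second <- sub("^([^-]*-[^-]*).*", "\\1", x)
print(up_to_second)

# Hypothetical variant for the title's wording: cut at the 3rd hyphen.
# (?:[^-]*-){2} consumes the first two hyphen-terminated fields.
y <- c("a-b-c-d-e", "aa-ddd")
up_to_third <- sub("^((?:[^-]*-){2}[^-]*).*", "\\1", y, perl = TRUE)
print(up_to_third)
```

Strings with too few hyphens do not match the pattern at all, so sub() returns them unchanged.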