gsub

removing everything after first 'backslash' in a string

被刻印的时光 ゝ submitted on 2019-12-08 02:53:19
Question: I have a vector like the one below:

    vec <- c("abc\edw\www", "nmn\ggg", "rer\qqq\fdf"......)

I want to remove everything from the first backslash onward, like this:

    newvec <- c("abc","nmn","rer")

Thank you. My original vector is as below (only the head):

    [1] "peoria ave\nste \npeoria" [2] "wood dr\nphoenix" "central ave\nphoenix" [4] "southern ave\nphoenix" [5] "happy valley rd\nste \nglendaleaz " "the americana at brand\n americana way\nglendale"

Here the problem is my original csv file

Ruby regex matching a line in an inputted text file string [duplicate]

微笑、不失礼 submitted on 2019-12-08 02:27:33
Question: This question already has answers here: Ruby regex gsub a line in a text file (5 answers). Closed 6 years ago.

I need to match a line in an inputted text file string and wrap that captured line with a character. For example, imagine a text file as such:

    test
    foo
    test
    bar

I would like to use gsub to output:

    XtestX
    XfooX
    XtestX
    XbarX

I'm having trouble matching a line, though. I've tried using a regex starting with ^ and ending with $, but it doesn't seem to work. Any ideas? I have a

R: gsub with fixed=T or F and special cases

核能气质少年 submitted on 2019-12-07 11:32:24
Building on two questions I previously asked:

    R: How to prevent memory overflow when using mgsub in vector mode?
    gsub speed vs pattern length

I do like the suggestion by @Tyler to use fixed=TRUE, as it speeds up calculations significantly. However, it's not always applicable. I need to substitute, say, caps as a stand-alone word, with or without punctuation surrounding it. It is not known a priori what can follow or precede the word, but it must be one of the regular punctuation signs (, . ! - + etc.). It cannot be a number or a letter. Example below: capsule must stay as is.

    i = "Here is the
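The difference between fixed matching and a stand-alone-word match can be illustrated with a small sketch. The sample string `s` and the replacement "CAPS" are hypothetical, since the question's own example is cut off, and a `\\b` word boundary with `perl = TRUE` is just one possible approach rather than the asker's final solution.

```r
# Hypothetical sample string; the question's own example is truncated.
s <- "caps, capsule and (caps)!"

# fixed = TRUE is fast, but it replaces the substring everywhere,
# including inside "capsule".
fixed_result <- gsub("caps", "CAPS", s, fixed = TRUE)

# \\b is a word boundary: "caps" only matches when it is not directly
# preceded or followed by a letter, digit, or underscore.
word_result <- gsub("\\bcaps\\b", "CAPS", s, perl = TRUE)

print(fixed_result)  # "CAPS, CAPSule and (CAPS)!"
print(word_result)   # "CAPS, capsule and (CAPS)!"
```

Note that `\\b` also treats the underscore as a word character, so "caps_" would not be replaced; whether that is acceptable depends on the data.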

regex to remove words if it contains a letter/special character multiple times simultaneously in R

别来无恙 submitted on 2019-12-07 09:45:55
Question: I want to remove every word in which a letter or special character occurs more than twice in a row. For example, the input is

    "Google in theee lland of whhhat c#, c++ and e###"

and the output should be

    "Google in lland of c#, c++ and"

Answer 1:

    x <- "Google in theee lland of whhhat c#, c++ and e###"
    gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)
    # [1] "Google in lland of c#, c++ and "

(\\S)\\1\\1 finds sequences of three consecutive repetitions of a single non-space character. The
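To see which words the pattern catches, the same idea can be applied word by word. This is a minimal sketch using the question's input; the variable names `words`, `has_triple`, and `cleaned`, and the final `trimws()` call that drops the trailing space, are additions for illustration.

```r
x <- "Google in theee lland of whhhat c#, c++ and e###"

# Split on single spaces and flag words containing the same
# non-space character three times in a row.
words <- strsplit(x, " ", fixed = TRUE)[[1]]
has_triple <- grepl("(\\S)\\1\\1", words)
print(words[has_triple])  # "theee" "whhhat" "e###"

# The answer's one-step gsub leaves a trailing space; trimws() removes it.
cleaned <- trimws(gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x))
print(cleaned)  # "Google in lland of c#, c++ and"
```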

Delete last two characters in string if they match criteria

与世无争的帅哥 submitted on 2019-12-07 09:43:52
Question: I have 2 million names in a database. For example:

    df <- data.frame(names=c("A ADAM", "S BEAN", "A APPLE A", "A SCHWARZENEGGER"))
    > df
                 names
    1           A ADAM
    2           S BEAN
    3        A APPLE A
    4 A SCHWARZENEGGER

I want to delete ' A' (a space followed by A) if these are the last two characters of the string. I know that regex is our friend here. How do I efficiently apply a regex function to the last two characters of the string? Desired output:

    > output
                 names
    1           A ADAM
    2           S BEAN
    3          A APPLE
    4 A SCHWARZENEGGER

Answer 1: If you
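One way to express "only if the string ends in ' A'" is to anchor the pattern with `$`. The sketch below is an assumption about a workable approach, not a reconstruction of the truncated answer; `stringsAsFactors = FALSE` and the `output` data frame are added here for illustration.

```r
df <- data.frame(
  names = c("A ADAM", "S BEAN", "A APPLE A", "A SCHWARZENEGGER"),
  stringsAsFactors = FALSE
)

# " A$" matches a space followed by "A" only at the very end of the string,
# so "A ADAM" and the leading "A " in every name are left alone.
# sub() is vectorised over the whole column.
output <- data.frame(names = sub(" A$", "", df$names), stringsAsFactors = FALSE)

print(output$names)  # "A ADAM" "S BEAN" "A APPLE" "A SCHWARZENEGGER"
```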

"invalid regular expression…reason 'Trailing backslash''' error with gsub in R

核能气质少年 submitted on 2019-12-07 06:33:37
Question: I am getting an error message while replacing text in R.

    x
    [1] "Easy bruising and bleeding.\\"
    gsub(as.character(x), "\\", "")
    Error in gsub(as.character(x), "\\", "") : invalid regular expression 'Easy bruising and bleeding.\', reason 'Trailing backslash'

Answer 1: The arguments are in the wrong order. Study help("gsub").

    gsub( "\\", "", "Easy bruising and bleeding.\\", fixed=TRUE)
    #[1] "Easy bruising and bleeding."

Answer 2: tl;dr: You need 4 \ s (i.e. \\\\ ) in the first argument of gsub in order to
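The two approaches mentioned in the answers can be compared side by side. A minimal sketch using the question's string; the variable names `with_fixed` and `with_regex` are illustrative.

```r
# In R source, "\\" is a single backslash character.
x <- "Easy bruising and bleeding.\\"

# Argument order is gsub(pattern, replacement, x).
# With fixed = TRUE the pattern is a literal string: one backslash.
with_fixed <- gsub("\\", "", x, fixed = TRUE)

# As a regular expression, the backslash must itself be escaped, so the
# regex is \\ and the R string literal for it is "\\\\".
with_regex <- gsub("\\\\", "", x)

print(with_fixed)  # "Easy bruising and bleeding."
print(with_regex)  # "Easy bruising and bleeding."
```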

Escape spaces in a linux pathname with Ruby gsub

百般思念 submitted on 2019-12-06 20:29:16
Question: I am trying to escape the spaces in a Linux path. However, whenever I try to escape my backslash I end up with a double backslash. Example path:

    /mnt/drive/site/usa/1201 East/1201 East Invoice.pdf

So that I can use this in Linux, I want to escape it as:

    /mnt/drive/site/usa/1201\ East/1201\ East\ Invoice.pdf

So I'm trying this:

    backup_item.gsub("\s", "\\\s")

But I get an unexpected output of:

    /mnt/drive/site/usa/1201\\ East/1201\\ East\\ Invoice.pdf

Answer 1: Stefan is right; I just want to point out

change string in DF using hive command and mutate with sparklyr

倾然丶 夕夏残阳落幕 submitted on 2019-12-06 16:28:32
Using the Hive command regexp_extract, I am trying to change the following strings from 201703170455 to 2017-03-17:04:55 and from 2017031704555675 to 2017-03-17:04:55.0010. I am doing this in sparklyr, trying to use this code, which works with gsub in R:

    newdf<-df%>%mutate(Time1 = regexp_extract(Time, "(....)(..)(..)(..)(..)", "\\1-\\2-\\3:\\4:\\5"))

and this code:

    newdf<-df%>mutate(TimeTrans = regexp_extract("(....)(..)(..)(..)(..)(....)", "\\1-\\2-\\3:\\4:\\5.\\6"))

but neither works at all. Any suggestions on how to do this using regexp_extract? Apache Spark uses Java regular expression
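For reference, the base R version that the question says works can be sketched as follows. It runs without Spark; the variable names `time_raw` and `time_fmt` are illustrative. The comment about `regexp_replace` describes the Hive/Spark SQL function as an assumption about the closest analogue and is not executed here.

```r
# Plain R: gsub rewrites the string using backreferences \\1 ... \\5.
time_raw <- "201703170455"
time_fmt <- gsub("(....)(..)(..)(..)(..)", "\\1-\\2-\\3:\\4:\\5", time_raw)
print(time_fmt)  # "2017-03-17:04:55"

# Assumption, untested here (no Spark session): in Hive/Spark SQL,
# regexp_extract(str, regexp, idx) returns a single capture group, whereas
# regexp_replace(str, regexp, replacement) rewrites the string and uses
# Java-style group references such as $1 rather than \\1, e.g.
#   regexp_replace(Time, "(....)(..)(..)(..)(..)", "$1-$2-$3:$4:$5")
```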

removing everything after first 'backslash' in a string

限于喜欢 submitted on 2019-12-06 13:32:46
I have a vector like the one below:

    vec <- c("abc\edw\www", "nmn\ggg", "rer\qqq\fdf"......)

I want to remove everything from the first backslash onward, like this:

    newvec <- c("abc","nmn","rer")

Thank you. My original vector is as below (only the head):

    [1] "peoria ave\nste \npeoria" [2] "wood dr\nphoenix" "central ave\nphoenix" [4] "southern ave\nphoenix" [5] "happy valley rd\nste \nglendaleaz " "the americana at brand\n americana way\nglendale"

Here the problem is that my original csv file does not contain backslashes, but when I read it, backslashes appear. The original csv file is as below:

    [1]
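The `\n` sequences in the printed vector are how R displays newline characters, so the strings likely contain line breaks rather than literal backslashes. The sketch below covers both readings; the variable names `addr` and `first_line` are illustrative, and the assumption that the data holds newlines is inferred from the printed output, not confirmed by the truncated question.

```r
# Case 1: the strings contain newline characters (printed by R as \n).
addr <- c("peoria ave\nste \npeoria", "wood dr\nphoenix", "central ave\nphoenix")

# (?s) lets "." match newlines too, so everything from the first
# newline to the end of the string is dropped.
first_line <- sub("(?s)\n.*", "", addr, perl = TRUE)
print(first_line)  # "peoria ave" "wood dr" "central ave"

# Case 2: the strings contain literal backslashes. In R source each one
# must be written as "\\"; "abc\edw" is not a valid string literal.
vec <- c("abc\\edw\\www", "nmn\\ggg", "rer\\qqq\\fdf")

# The regex \\.* (written "\\\\.*" in R) matches the first backslash
# and everything after it.
newvec <- sub("\\\\.*", "", vec)
print(newvec)  # "abc" "nmn" "rer"
```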

Remove all characters after the 3rd occurrence of “-” in each element of a vector

痞子三分冷 submitted on 2019-12-06 11:41:58
Question: I am not that good with regular expressions in R. I would like to remove all characters after the 3rd occurrence of "-" in each element of a vector.

    Initial string
    aa-bbb-cccc => aa-bbb
    aa-vvv-vv => aa-vvv
    aa-ddd => aa-ddd

Any help?

Answer 1: Judging by the sample input and expected output, I assume you need to remove everything beginning with the 2nd hyphen. You may use

    sub("^([^-]*-[^-]*).*", "\\1", x)

Details:

    ^ - start of string
    ([^-]*-[^-]*) - Group 1 capturing 0+ chars other
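Applied to the sample strings from the question, the answer's pattern behaves as follows. This is a short usage sketch; the variable name `trimmed` is illustrative.

```r
x <- c("aa-bbb-cccc", "aa-vvv-vv", "aa-ddd")

# Group 1 captures everything up to (but not including) the 2nd hyphen;
# the trailing .* consumes the rest, and "\\1" keeps only the group.
trimmed <- sub("^([^-]*-[^-]*).*", "\\1", x)

print(trimmed)  # "aa-bbb" "aa-vvv" "aa-ddd"
```

Strings with only one hyphen, such as "aa-ddd", are returned unchanged because the trailing `.*` matches nothing.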