gsub

Remove everything after a string in a data frame column with missing values

走远了吗. posted on 2019-11-28 05:55:45
Question: I have a data frame resembling the extract below:
    Observation  Identifier   Value
    Obs001       ABC_2001     54
    Obs002       ABC_2002     -2
    Obs003                    1
    Obs004                    1
    Obs005       Def_2001/05
I would like to transform it into a data frame where the portion of each string after the "_" sign is removed, as illustrated below:
    Observation  Identifier_NoTime  Value
    Obs001       ABC                54
    Obs002       ABC                -2
    Obs003                          1
    Obs004                          1
    Obs005       Def
I tried experimenting with strsplit, gsub and sub as discussed here but cannot force those commands

Ignoring a character along with word boundary in regex

十年热恋 posted on 2019-11-28 05:52:04
Question: I am using gsub in Ruby to make a word within text bold. I am using a word boundary so that letters inside other words are not made bold, but I am finding that this ignores words that have a quote after them. For example:
    text.gsub(/#{word}\b/i, "<b>#{word}</b>")
    text = "I said, 'look out below'"
    word = "below"
In this case the word below is not made bold. Is there any way to ignore certain characters along with a word boundary?
Answer 1: All that escaping in the Regexp.new is looking quite ugly. You
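One possible way to approach the question above, offered as a minimal sketch rather than the truncated answer's own method: escape the word with Regexp.escape and replace \b with lookarounds, so that a neighbouring quote or other punctuation does not affect the match. The helper name embolden is hypothetical.

```ruby
# Hypothetical helper (not from the original answer): wraps whole-word,
# case-insensitive occurrences of `word` in <b> tags.
def embolden(text, word)
  # (?<!\w) and (?!\w) require that no word character sits directly
  # before or after the match; Regexp.escape neutralises metacharacters.
  pattern = /(?<!\w)#{Regexp.escape(word)}(?!\w)/i
  # The block form reuses the matched text, preserving its original case.
  text.gsub(pattern) { |match| "<b>#{match}</b>" }
end

puts embolden("I said, 'look out below'", "below")
# prints: I said, 'look out <b>below</b>'
```

Because the lookarounds only test the neighbouring characters, "low" inside "below" is left untouched.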

How to extract substring between patterns “_” and “.” in R [duplicate]

巧了我就是萌 posted on 2019-11-28 05:35:03
Question: This question already has answers here: Extract a string between patterns/delimiters in R (4 answers). Closed 5 years ago.
I have many filenames which look like:
    txt = "MA0051_IRF2.xml"
I want to extract IRF2, which is between "_" and ".". How do I do this in R?
Answer 1: To achieve this, you need a regexp that
    matches an (optional) arbitrary string in front of the _ : .*
    matches a literal _ : [_]
    matches everything up to (but not including) the next . and stores it in capturing group no. 1 : ([^.]

Remove strings found in vector 1, from vector 2

匆匆过客 posted on 2019-11-27 16:26:47
I have these two vectors:
    sample1 <- c(".aaa", ".aarp", ".abb", ".abbott", ".abogado")
    sample2 <- c("try1.aarp", "www.tryagain.aaa", "255.255.255.255", "onemoretry.abb.abogado")
I am trying to remove sample1 strings that are found in sample2. The closest I got is by iterating using sapply, which gave me this:
    sapply(sample1, function(i)gsub(i, "", sample2))
         .aaa              .aarp              .abb               .abbott            .abogado
    [1,] "try1.aarp"       "try1"             "try1.aarp"        "try1.aarp"        "try1.aarp"
    [2,] "www.tryagain"    "www.tryagain.aaa" "www.tryagain.aaa" "www.tryagain.aaa" "www.tryagain.aaa"
    [3,] "255.255.255.255" "255.255.255.255"  "255.255

Replace characters in column names gsub

狂风中的少年 posted on 2019-11-27 16:24:31
Question: I am reading in a bunch of CSVs that have stuff like "sales - thousands" in the title and come into R as "sales...thousands". I'd like to use a regular expression (or other simple method) to clean these up. I can't figure out why this doesn't work:
    #mock data
    a <- data.frame(this.is.fine = letters[1:5], this...one...isnt = LETTERS[1:5])
    #column names
    colnames(a)
    # [1] "this.is.fine" "this...one...isnt"
    #function to remove multiple spaces
    colClean <- function(x){ colnames(x) <- gsub("\\.\\.+",

How to remove specific special characters in R

爷,独闯天下 posted on 2019-11-27 16:01:26
Question: I have some sentences like this one:
    c = "In Acid-base reaction (page[4]), why does it create water and not H+?"
I want to remove all special characters except for '?&+-/
I know that if I want to remove all special characters, I can simply use
    gsub("[[:punct:]]", "", c)
    "In Acidbase reaction page4 why does it create water and not H"
However, this also removes special characters such as + - ?, which I intend to keep. I tried to create a string of special characters that I can use in some

replacing the `'` char using awk

老子叫甜甜 posted on 2019-11-27 12:59:52
Question: I have lines with a single : and a ' in them that I want to get rid of, and I want to use awk for this. I've tried:
    awk '{gsub ( "[:\\']","" ) ; print $0 }'
    awk '{gsub ( "[:\']","" ) ; print $0 }'
    awk '{gsub ( "[:']","" ) ; print $0 }'
None of them worked; each returns the error: Unmatched ".
When I use
    awk '{gsub ( "[:_]","" ) ; print $0 }'
it works and removes all : and _ chars. How can I get rid of the ' char?
Answer 1: You could use: Octal code for the single quote: [:\47] The

Ruby multiple string replacement

心已入冬 posted on 2019-11-27 10:46:51
    str = "Hello☺ World☹"
Expected output is:
    "Hello:) World:("
I can do this:
    str.gsub("☺", ":)").gsub("☹", ":(")
Is there any other way to do this in a single function call? Something like:
    str.gsub(['s1', 's2'], ['r1', 'r2'])
Naren Sisodiya
Since Ruby 1.9.2, String#gsub accepts a hash as its second parameter and replaces each match with the value stored under the matching key. You can use a regular expression to match the substrings that need to be replaced and pass a hash of the replacement values. Like this:
    'hello'.gsub(/[eo]/, 'e' => 3, 'o' => '*') #=> "h3ll*"
    '(0) 123-123.123'.gsub(/[()\-,. ]/, '') #=> "0123123123"
In
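Applied to the emoticon string from the question, the hash form of String#gsub might look like the sketch below. The constant and method names are hypothetical; Regexp.union is used here only as a convenient way to build a pattern that matches any hash key literally.

```ruby
# Hypothetical lookup table: each key is replaced by its value.
EMOTICONS = { "☺" => ":)", "☹" => ":(" }.freeze

# Regexp.union builds one pattern matching any of the keys literally.
EMOTICON_PATTERN = Regexp.union(EMOTICONS.keys)

# Single gsub call: every match is looked up in the hash.
def replace_emoticons(str)
  str.gsub(EMOTICON_PATTERN, EMOTICONS)
end

puts replace_emoticons("Hello☺ World☹")
# prints: Hello:) World:(
```

Keeping the mapping in one hash means a new replacement only needs a new key/value pair; the pattern is derived from the keys.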

How to remove unicode <U+00A6> from string?

随声附和 posted on 2019-11-27 09:44:42
I have a string like:
    q <-"<U+00A6> 1000-66329"
I want to remove <U+00A6> and get only 1000 66329. I tried using:
    gsub("\u00a6"," ", q,perl=T)
But it is not removing anything. How should I use gsub in order to get only 1000 66329?
Wiktor Stribiżew
"I just want to remove unicode <U+00A6> which is at the beginning of string."
Then you do not need gsub; you can use sub with the "^\\s*<U\\+\\w+>\\s*" pattern:
    q <-"<U+00A6> 1000-66329"
    sub("^\\s*<U\\+\\w+>\\s*", "", q)
Pattern details:
    ^ - start of string
    \\s* - zero or more whitespaces
    <U\\+ - a literal char sequence <U+
    \\w+ - 1 or more letters,

How to replace multiple strings with the same in R

∥☆過路亽.° posted on 2019-11-27 09:42:07
I have a string vector:
    vec = c('blue','red','flower','bee')
I want to convert different strings into the same string in one line instead of separately, i.e. I could gsub blue and gsub red to make them both spell 'colour'. How can I do this in one line? The output should be:
    'colour','colour','flower','bee'
Answer:
    sub("blue|red", "colour", vec)
Use "|" (meaning "or") between the words you want to substitute. Use sub to change only the first occurrence and gsub to change multiple occurrences within the same string. See ?gsub
Here you do not need to specify the colors to be replaced; it will replace any color that R knows about