gsub | 易学教程

Regular Expression For Consecutive Duplicate Bigrams

Question: My question is a direct extension of this earlier question about detecting consecutive words (unigrams) in a string. In the previous question, the regex \b(\w+)\s+\1\b could detect the repeated word in "Not that that is related". Here, I want to detect consecutive bigrams (pairs of words), as in "are blue and then and then very bright". Ideally, I also want to know how to replace the detected pattern (duplicate) by a single element, so as to obtain in the end "are blue and then very bright" (for this application,
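The unigram regex quoted in the question can plausibly be extended to bigrams by capturing two words instead of one. The following is a minimal sketch in Ruby, not the accepted answer from the original thread; the method name collapse_duplicate_bigrams is hypothetical, and the pattern only collapses one immediate repetition per match.

```ruby
# Hypothetical helper: collapse a bigram that is immediately repeated.
# The capture group holds two words; \1 then requires the same two
# words to follow, and the replacement keeps a single copy.
def collapse_duplicate_bigrams(text)
  text.gsub(/\b(\w+\s+\w+)\s+\1\b/, '\1')
end

puts collapse_duplicate_bigrams("are blue and then and then very bright")
# prints "are blue and then very bright"
```

The same pattern should carry over to R's gsub with perl = TRUE and doubled backslashes, though that variant is untested here.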

Pattern matching and replacement in R

Question: I am not at all familiar with regular expressions and would like to do pattern matching and replacement in R. I would like to replace the patterns #1 and #2 in the vector original = c("#1", "#2", "#10", "#11") with the corresponding values of the vector vec = c(1,2). The result I am looking for is the vector c("1", "2", "#10", "#11"). I am not sure how to do that. I tried: for (i in 1:2) { pattern = paste("#", i, sep = ""); original = gsub(pattern, vec[i], original, fixed = TRUE) } but I get

pik Error: private method `gsub' called for nil:NilClass

Question: I'm getting an error when adding JRuby 1.3.1 to pik. Error: private method `gsub' called for nil:NilClass. This is my PATH environment variable: C:\Users\Owner>echo %path% C:\Program Files\Java\jdk1.7.0_05\bin;c:\jruby-1.7.0.preview1\bin;C:\jruby-1.3.1\bin;c:\pik This is what happens when I try to add version 1.3.1: C:\Users\Owner>pik add C:\jruby-1.3.1\bin There was an error. Error: private method `gsub' called for nil:NilClass in: pik/commands/command.rb:124:in `get_version' in: pik/commands/add

Replace substring every >n characters (conditionally insert linebreaks for spaces)

Question: I would like to replace spaces with linebreaks (\n) in a pretty long character vector in R. However, I don't want to replace every space, but only those where the substring exceeds a certain number of characters (n). Example: mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks" Now I want to insert linebreaks in mystring at every space on the condition that each substring has a length greater than 20 characters (nchar > 20). Hence, the resulting string is

R: combine several gsub() function in a pipe

Question: To clean some messy data I would like to start using pipes (%>%), but I cannot get the R code working if gsub() is not at the beginning of the pipe but occurs later. (Note: this question is not concerned with proper import, but with data cleaning.) Simple example: df <- cbind.data.frame(A = c("2.187,78 ", "5.491,28 ", "7.000,32 "), B = c("A","B","C")) Column A contains characters (in this case numbers, but they could also be strings) and needs to be cleaned. The steps are df$D <- gsub("\\.",""

Ruby/Rails working with gsub and arrays

Question: I have a string that I am trying to process with the gsub method in Ruby. The problem is that I have a dynamic array of strings that I need to iterate through, searching the original text for each one and replacing it. For example, I have the following original string: (This is some sample text that I am working with and will hopefully get it all working) and an array of items I want to search for and replace. Thanks for the help in advance! Answer 1: a = ['This is some sample text', 'This
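One possible way to handle the Ruby question above, offered as a sketch rather than as the approach of the truncated answer: build a single pattern from the array with Regexp.union and pass a hash of replacements to gsub. The search terms, their replacements, and the method name replace_terms are all hypothetical, since the question does not list them.

```ruby
# Hypothetical helper: replace every occurrence of each key in
# `replacements` with its value, in a single pass over the text.
def replace_terms(text, replacements)
  # Regexp.union escapes plain strings, so the keys match literally.
  pattern = Regexp.union(replacements.keys)
  # With a hash as second argument, gsub looks up each matched text.
  text.gsub(pattern, replacements)
end

text = "This is some sample text that I am working with and will hopefully get it all working"
# Assumed search terms and replacements, for illustration only.
replacements = { "sample text" => "example text", "working" => "running" }

puts replace_terms(text, replacements)
# prints "This is some example text that I am running with and will hopefully get it all running"
```

Because the array is turned into a pattern at call time, it can change between calls without any change to the method.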

Formatting month abbreviations using as.Date [duplicate]

Question: This question already has answers here: Converting year and month (“yyyy-mm” format) to a date? (7 answers). Closed 2 years ago. I'm working with monthly data and have a character vector of dates, formatted as Sep/2012, Aug/2012, Jul/2012 and so on, back to 1981. I've tried using as.Date(dates, "%b/%Y"), where %b represents month abbreviations, but this only returns NAs. What am I doing wrong? Note: I already found a workaround using gsub() to add "01/" in front of each entry, like so: 01/Sep

Search-and-replace on a list of strings - gsub eapply?

Question: Here is a simplified excerpt of my code for reproduction purposes: library("quantmod"); stockData <- new.env(); stocksLst <- c("AAB.TO", "BBD-B.TO", "BB.TO", "ZZZ.TO"); nrstocks = length(stocksLst); startDate = as.Date("2016-09-01"); for (i in 1:nrstocks) { getSymbols(stocksLst[i], env = stockData, src = "yahoo", from = startDate) } My data is then stored in the environment stockData, which I use to do some analysis. I'd like to clean up the names of the xts objects, which are currently: ls

remove multiple patterns from text vector r

Question: I want to remove multiple patterns from multiple character vectors. Currently I am doing: a.vector <- gsub("@\\w+", "", a.vector); a.vector <- gsub("http\\w+", "", a.vector); a.vector <- gsub("[[:punct:]]", "", a.vector) and so on. This is painful. I was looking at this question and answer: "R: gsub, pattern = vector and replacement = vector", but it does not solve the problem. Neither the mapply nor the mgsub approach works. I made these vectors: remove <- c("@\\w+", "http\\w+", "[[:punct:]]") substitute <
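The idea of removing several patterns at once can be illustrated by joining them into one alternation. The sketch below is in Ruby rather than R, and is not the answer from the original thread; the method name strip_patterns and the sample string are hypothetical. Note that a single pass over a combined pattern can differ from applying the substitutions one after another, because text exposed by one removal is not rescanned.

```ruby
# Hypothetical helper: delete every match of any pattern in `patterns`.
def strip_patterns(text, patterns)
  # Regexp.union joins the patterns into one alternation (a|b|c).
  text.gsub(Regexp.union(patterns), "")
end

# The three patterns from the question, written as Ruby regex literals.
patterns = [/@\w+/, /http\w+/, /[[:punct:]]/]

puts strip_patterns("hi @bob see httpfoo now!", patterns)
# prints "hi  see  now" (the doubled spaces are left in place)
```

In R, a comparable effect might be obtained by pasting the patterns together with "|" before a single gsub call, though that is an assumption and not verified here.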

For loop and gsub R

Question: I have a question about using gsub in a for loop in R. I have a dataframe (catalog) with "sku" and "cat" columns, which hold a SKU ID and a catalog ID for the same product from different sources. I then have a dataframe (image_data) with sku and image descriptions (image_data). I want to create a new column (new_image_description) in which all instances of SKUs from the column image_des are replaced by the corresponding catalog number (see below). But it only replaces some and not others.