gsub | 易学教程

Rails 4, replacing “\n” with “&#10;” for use in XML spreadsheets in Excel

Question: I'm writing an application that exports certain data into a specifically formatted Excel document. I'm building an XML spreadsheet file using this document (http://msdn.microsoft.com/en-us/library/aa140066(v=office.10).aspx#odc_xmlss_x:pagesetup) and so far have it all working. But to get it working I had to use ".html_safe" in several fields, which is dangerous in this instance because some of the fields being exported contain user-entered data. Basically to get a new line inside an Excel cell you

change string in DF using hive command and mutate with sparklyr

Question: Using the Hive command regexp_extract, I am trying to change the following strings from 201703170455 to 2017-03-17:04:55 and from 2017031704555675 to 2017-03-17:04:55.0010. I am doing this in sparklyr, trying to use this code, which works with gsub in R: newdf<-df%>%mutate(Time1 = regexp_extract(Time, "(....)(..)(..)(..)(..)", "\\1-\\2-\\3:\\4:\\5")) and this code: newdf<-df%>mutate(TimeTrans = regexp_extract("(....)(..)(..)(..)(..)(....)", "\\1-\\2-\\3:\\4:\\5.\\6")) but it does not work at all.
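For reference, the base R behaviour the question describes can be sketched as below. The sample value and pattern come from the question; the variable names `time_raw` and `time_fmt` are illustrative. This only shows the local gsub call and is not a sparklyr fix: as far as I know, Hive's regexp_extract returns a single capture group selected by a numeric index rather than performing a substitution, which may be why the calls above fail.

```r
# Sample timestamp taken from the question (12 digits: yyyymmddHHMM).
time_raw <- "201703170455"

# Five capture groups: year, month, day, hour, minute.
# "\\1" to "\\5" in the replacement refer back to those groups.
time_fmt <- gsub("(....)(..)(..)(..)(..)", "\\1-\\2-\\3:\\4:\\5", time_raw)

print(time_fmt)
```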

R: gsub with fixed=T or F and special cases

Question: Building on two questions I previously asked (R: How to prevent memory overflow when using mgsub in vector mode? and gsub speed vs pattern length), I like @Tyler's suggestion to use fixed=TRUE, as it speeds up calculations significantly. However, it's not always applicable. I need to substitute, say, caps as a stand-alone word with or without punctuation surrounding it. It's not known a priori what can follow or precede the word, but it must be one of the regular punctuation signs (, . ! -

R gsub to extract emails from text

Question: I have a variable a created by readLines of a file which contains some emails. I already filtered only those rows with the @ symbol, and now am struggling to grab the emails. The text in my variable looks like this: > dput(a[1:5]) c("buenas tardes. excelente. por favor a: Saolonm@hotmail.com", "26.leonard@gmail.com ", "Aprecio tu aporte , mi correo es jcdavola31@gmail.com , Muchas Gracias", "gracias andrescarnederes@headset.cl", "Me apunto, muchas gracias mi dirección luciana.chavela
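One possible approach to the question above, sketched with regmatches and regexpr instead of gsub. The four complete sample strings are taken from the question; the truncated fifth one is left out. The pattern `email_pattern` is a deliberately loose assumption (word-like characters around an @ sign), not a full email validator.

```r
# Complete sample rows from the question.
a <- c(
  "buenas tardes. excelente. por favor a: Saolonm@hotmail.com",
  "26.leonard@gmail.com ",
  "Aprecio tu aporte , mi correo es jcdavola31@gmail.com , Muchas Gracias",
  "gracias andrescarnederes@headset.cl"
)

# Loose pattern (an assumption): letters, digits, dot, underscore or hyphen
# before the @, then letters, digits, dot or hyphen after it.
email_pattern <- "[[:alnum:]._-]+@[[:alnum:].-]+"

# regexpr() locates the first match in each element;
# regmatches() returns the matched substrings.
emails <- regmatches(a, regexpr(email_pattern, a))

print(emails)
```

Rows without any match are dropped by regmatches here, so the result can be shorter than the input if some rows contain no address.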

In regex, mystery Error: assertion 'tree->num_tags == num_tags' failed in executing regexp: file 'tre-compile.c', line 634

Question: Assume 900+ company names pasted together to form a regex pattern using the pipe separator -- "firm.pat". firm.pat <- str_c(firms$firm, collapse = "|") With a data frame called "bio" that has a large character variable (250 rows each with 100+ words) named "comment", I would like to replace all the company names with blanks. Both a gsub call and a str_replace_all call return the same mysterious error. bio$comment <- gsub(pattern = firm.pat, x = bio$comment, replacement = "") Error in gsub

Proper use of gsub / regular expressions in R?

Question: I have long lists of strings such as this machine readable example: A <- list(c("Biology","Cell Biology","Art","Humanities, Multidisciplinary; Psychology, Experimental","Astronomy & Astrophysics; Physics, Particles & Fields","Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods","Geriatrics & Gerontology","Gerontology","Management","Operations Research & Management Science","Computer Science, Artificial Intelligence; Computer Science, Information

Using more than nine back references in an R regex

Question: The code below does not work, because the replacement string for \10, \11, and so on cannot be read properly. It reads \10 as \1 and prints 0 instead. Can you help me fix it? There is an answer in one of the threads saying that I am supposed to use capturing or naming groups, but I don't really understand how to use them. headline <- gsub("regexp with 10 () brackets", "\\1 ### \\2 ### \\3 ### \\4 ### \\5 ### \\6 ### \\7 ### \\8 ### \\9 ### \\10### \\11### \\12### \\13### \\14### \\15### \\16
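Base R replacement strings only recognise backreferences \\1 to \\9, so one possible workaround is to skip the replacement string entirely: pull every capture group out with regexec and regmatches, then join them with paste. The sketch below uses a made-up input `x` and a made-up 12-group `pattern`, since the question's real regular expression is not shown, and it uses a uniform " ### " separator for simplicity.

```r
# Hypothetical input and pattern with twelve capture groups
# (the question's actual regex is elided, so these are placeholders).
x <- "abcdefghijkl"
pattern <- "(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)"

# regexec() records the position of the full match and of every group;
# regmatches() turns those positions into substrings.
# Element 1 is the full match, so [-1] keeps only the groups.
groups <- regmatches(x, regexec(pattern, x))[[1]][-1]

# Join all groups with the separator used in the question.
headline <- paste(groups, collapse = " ### ")

print(headline)
```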

Add space between two letters in a string in R [duplicate]

Question: Suppose I have a string like s = "PleaseAddSpacesBetweenTheseWords". How do I use gsub in R to add a space between the words so that I get "Please Add Spaces Between These Words"? I should do something like gsub("[a-z][A-Z]", ???, s). What do I put for ??? Also, I find the regular expression documentation for R confusing so a reference or writeup on regular expressions in R
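A minimal sketch of one way to fill in the ??? above: wrap each character class in parentheses so both letters are captured, then put them back with a space in between. The string `s` is from the question; the name `spaced` is illustrative.

```r
s <- "PleaseAddSpacesBetweenTheseWords"

# ([a-z]) captures the lowercase letter that ends one word,
# ([A-Z]) captures the uppercase letter that starts the next.
# "\\1 \\2" writes both back with a space between them.
spaced <- gsub("([a-z])([A-Z])", "\\1 \\2", s)

print(spaced)
```

Without the parentheses the matched letters would be lost, because the replacement has nothing to refer back to.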

R extract first number from string

Question: I have a string in a variable which we call v1. This string states picture numbers and takes the form of "Pic 27 + 28". I want to extract the first number and store it in a new variable called item. Some code that I've tried is: item <- unique(na.omit(as.numeric(unlist(strsplit(unlist(v1),"[^0-9]+"))))) This worked fine, until I came upon a list that went: [1,] "Pic 26 + 25" [2,] "Pic 27 + 28" [3,] "Pic 28 + 27" [4,] "Pic 29 + 30" [5,] "Pic 30 + 29" [6,] "Pic 31 + 32" At this point I get more
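A possible way to keep only the first number of each string, sketched with sub and a single capture group. The sample values are the first three rows listed in the question, and `v1` is assumed here to be a plain character vector.

```r
# First three sample rows from the question, as a character vector.
v1 <- c("Pic 26 + 25", "Pic 27 + 28", "Pic 28 + 27")

# ^[^0-9]*  skips any leading non-digits,
# ([0-9]+)  captures the first run of digits,
# .*$       consumes the rest so only the captured number remains.
item <- as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", v1))

print(item)
```

Because sub works element by element, each row yields exactly one value, unlike the strsplit approach, which returns every number in every row.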
