gsub | 易学教程

Regex return file name, remove path and file extension

Question: I have a data.frame that contains a text column of file names. I would like to return the file name without the path or the file extension. Typically, my file names are numbered, but they don't have to be. For example:

    df <- data.frame(data = c("a", "b"), fileNames = c("C:/a/bb/ccc/NAME1.ext", "C:/a/bb/ccc/d D2/name2.ext"))

I would like to return the equivalent of

    df <- data.frame(data = c("a", "b"), fileNames = c("NAME", "name"))

but I cannot figure out a slick regular expression to do this with gsub.
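One possible approach to the file-name question above, sketched under two assumptions that are not stated in the question: paths use forward slashes, and any digits directly before the extension are the numbering to drop. The names `stem`, `byHelpers`, and `byRegex` are illustrative only.

```r
# Example data from the question; stringsAsFactors = FALSE keeps fileNames as character
df <- data.frame(
  data = c("a", "b"),
  fileNames = c("C:/a/bb/ccc/NAME1.ext", "C:/a/bb/ccc/d D2/name2.ext"),
  stringsAsFactors = FALSE
)

# Option 1: base helpers strip the path and extension, sub() drops trailing digits
stem <- tools::file_path_sans_ext(basename(df$fileNames))
byHelpers <- sub("[0-9]+$", "", stem)

# Option 2: a single gsub() with two alternatives:
#   ^.*/               everything up to and including the last slash
#   [0-9]*\\.[^.]*$    optional digits, then the final dot and extension
byRegex <- gsub("^.*/|[0-9]*\\.[^.]*$", "", df$fileNames)

df$fileNames <- byRegex
```

Option 1 is usually easier to read and maintain; option 2 is closer to the single gsub call the question asks for.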

Using gsub on columns in R

Question: I have a data frame (data) in R with thousands of rows and 10 columns. Nine of the columns contain factors with several levels. Here is a small portion of the data frame:

    A gr1
    10 303.90
    11 304.1
    12 303.6
    13 303.90 obs
    14 303.90k

As an example, one factor has a level that is "303.90" and another level that is "303.90 obs". I want to change "303.90 obs" to "303.90". I am using the following command to edit the names of the levels:

    data[] = as.data.frame(lapply(data, function(x) {x = gsub("303 ...

Recursive regex in R for curly braces

Question: I have a text string in the following pattern.

    x = "sdfwervd \calculus{fff}{\trt{sdfsdf} & \trt{sdfsdf} & \trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf"

I want to use a regex to capture the text "fff" in the string \calculus{fff} and replace it with something else. Further, I want to capture the string between the first { after \calculus{.+} and its corresponding closing curly brace }. How can I do this with regex in R? The following captures everything till the last curly ...

Splitting the values in column using regex

Question: I have a data.frame (dat) with two columns like the following:

    ID    Details
    id_1  box1_homodomain gn=box1 os=homo sapiens p=4 se=1
    id_2  sox2_plurinet gn=plu os=mus musculus p=5 se=3

I would like to split out the "os=xxx" and "gn=yyy" fields in the "Details" column for all the IDs and print them like the following:

    Id    Description      gn    os
    Id_1  box1_homodomain  box1  homo sapiens
    Id_2  sox2_plurinet    plu   mus musculus

I tried the gsub approach in R, but I am unable to split os=homo sapiens and gn=box1 into their ...
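A minimal sketch of one way to pull those fields out with sub() and capture groups. It assumes, based only on the two sample rows, that the description is the first space-delimited token, that the gn value contains no spaces, and that the os value always runs up to " p=". The output data frame name `out` is hypothetical.

```r
# Sample data copied from the question
dat <- data.frame(
  ID = c("id_1", "id_2"),
  Details = c("box1_homodomain gn=box1 os=homo sapiens p=4 se=1",
              "sox2_plurinet gn=plu os=mus musculus p=5 se=3"),
  stringsAsFactors = FALSE
)

out <- data.frame(
  Id = dat$ID,
  # Keep only the text before the first space
  Description = sub(" .*$", "", dat$Details),
  # Capture the run of non-space characters after "gn="
  gn = sub("^.*\\bgn=(\\S+).*$", "\\1", dat$Details, perl = TRUE),
  # Capture lazily from "os=" up to " p=" (assumed to always follow)
  os = sub("^.*\\bos=(.*?) p=.*$", "\\1", dat$Details, perl = TRUE),
  stringsAsFactors = FALSE
)
```

If the field order varies in the real data, the " p=" anchor would need to be replaced by something more general, such as "the next key=value token".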

Remove specific last character from string

Question: I have the following string vector:

    EC02 502R 603 515 602 KL07 601 511R 505R 506R 503 508 514 501 509R 510 501R 512R 516 507 604 502 601R SPK01 504 504R ACK01 503R 508R 507R ACK03 513 EC01 506 ECH01 ACK02 SPK02 509 511 512 505 KA01 RS01 510R SKL01 SPK03 603R 602R 604R 513R AECH01 ER03 AECH02 RS02 514R ER01 RH01 AR05 RH02 515R ER02 M01

I want to replace 502R with 502, 501R with 501, 503R with 503, and so on. Only the character R at the end of a string should be removed. How can I do ...
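A short sketch of the usual way to do this: anchor the pattern with `$` so that only an R in the final position is matched. The vector below is a small subset of the values in the question, and the names `x` and `cleaned` are illustrative.

```r
# A few values from the question, including ones with R elsewhere in the string
x <- c("EC02", "502R", "603", "511R", "RS01", "ER03", "RH01")

# "R$" matches an R only at the end of the string; sub() is enough
# because at most one match per element is possible
cleaned <- sub("R$", "", x)
```

Values such as "RS01" and "ER03" are left alone because their R is not the last character.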

Removing/replacing brackets from R string using gsub

Question: I want to remove or replace the brackets "(" or ")" in my string using gsub. However, as shown below, it is not working. What could be the reason?

    > k <- "(abc)"
    > t <- gsub("()", "", k)
    > t
    [1] "(abc)"

Answer 1: Using the correct regex works:

    gsub("[()]", "", "(abc)")

The additional square brackets mean "match any of the characters inside".

Answer 2: One possible way (in line with what the OP is trying) is:

    gsub("\\(|)", "", "(abc)")
    #[1] "abc"

`\(` => look for the `(` character. `\` is needed as `(` is a special ...
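To put the answers above side by side, here is a small sketch comparing the original attempt with three working variants. The `fixed = TRUE` variant is an addition not mentioned in the answers; it treats the pattern as a literal string rather than a regex. Variable names are illustrative.

```r
k <- "(abc)"

# Original attempt: "()" is an empty capture group, so it only matches
# empty strings and the brackets are left in place
unchanged <- gsub("()", "", k)

# Character class: matches either bracket literally
byClass <- gsub("[()]", "", k)

# Escaped alternation: "\\(" and "\\)" are literal brackets
byEscape <- gsub("\\(|\\)", "", k)

# Literal matching: no regex at all, one bracket per call
byFixed <- gsub(")", "", gsub("(", "", k, fixed = TRUE), fixed = TRUE)
```

The character class is the most compact when several single characters need removing; `fixed = TRUE` avoids escaping questions entirely.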

Escaping strings for gsub

Question: I read a file:

    local logfile = io.open("log.txt", "r")
    data = logfile:read("*a")
    print(data)

Output:

    ...
    "(\.)\n(\w)", r"\1 \2"
    "\n[^\t]", "", x, re.S
    ...

Yes, the log file looks awful, as it is full of various commands. How can I call gsub to remove, for example, the line "(\.)\n(\w)", r"\1 \2" from the data variable? The snippet below does not work:

    s='"(\.)\n(\w)", r"\1 \2"'
    data=data:gsub(s, '')

I guess some escaping needs to be done. Is there an easy solution?

Update:

    local data = [["(\.)\n(\w)", r"\1 \2" "\n[^\t]", "", ...

replace words in R data.frames (Text Mining)

Question: I'm working on a text mining solution with SQL and R. First I import data into R from my SQL selection, and then I do data mining with it. Here is what I have:

    rawData = sqlQuery(dwhConnect,sqlString)
    a = data.frame(rawData$ENNOTE_NEU)

If I run a[[1]][1:3], you can see the structure:

    [1] lorem ipsum li ld ee wö wo di dd
    [2] la kdin di da dogs chicken
    [3] kd good i need some help

Now I want to do some data cleaning with my own dictionary. An example would be to replace li with lorem ipsum and ...

How to gsub on the text between two words in R?

Question: EDIT: I would like to place a \n before a specific unknown word in my text. I know that the first time the unknown word appears in my text, it will be between "Tree" and "Lake". Example of the text:

    text
    [1] "TreeRULakeSunWater"
    [2] "A B C D"

EDIT: "Tree" and "Lake" will never change, but the word between them is always changing, so I do not look for "RU" in my regex. What I am currently doing:

    if (grepl(".*Tree\\s*|Lake.*", text)) { text <- gsub(".*Tree\\s*|Lake.*", "\n\\1", text)}

The problem with what ...