gsub

Regex return file name, remove path and file extension

删除回忆录丶 submitted on 2020-01-09 04:46:07
Question: I have a data.frame that contains a text column of file names. I would like to return the file name without the path or the file extension. Typically, my file names are numbered, but they don't have to be. For example:

    df <- data.frame(data=c("a","b"), fileNames=c("C:/a/bb/ccc/NAME1.ext","C:/a/bb/ccc/d D2/name2.ext"))

I would like to return the equivalent of

    df <- data.frame(data=c("a","b"), fileNames=c("NAME","name"))

but I cannot figure out the slick regular expression to do this with gsub.
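A minimal sketch of one way to approach this, assuming the extension is whatever follows the last dot and the numbering is a run of digits directly before it. The names `stem` and `stem2` are made up for illustration.

```r
df <- data.frame(
  data = c("a", "b"),
  fileNames = c("C:/a/bb/ccc/NAME1.ext", "C:/a/bb/ccc/d D2/name2.ext"),
  stringsAsFactors = FALSE
)

# Single gsub: remove everything up to the last "/", and remove the
# trailing digits plus the extension at the end of the string.
stem <- gsub("^.*/|[0-9]*\\.[^.]*$", "", df$fileNames)

# Alternative without a hand-written path regex: basename() drops the
# directory, tools::file_path_sans_ext() drops the extension, and sub()
# drops any trailing digits.
stem2 <- sub("[0-9]+$", "", tools::file_path_sans_ext(basename(df$fileNames)))

df$fileNames <- stem
```

The second form is longer but may be easier to adjust if the numbering rule changes, since each step handles one concern.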

Using gsub on columns in R

白昼怎懂夜的黑 submitted on 2020-01-07 08:49:33
Question: I have a data frame (data) in R with thousands of rows and 10 columns. Nine of the columns contain factors with several levels. Here is a small portion of the data frame:

    A gr1
    10 303.90
    11 304.1
    12 303.6
    13 303.90 obs
    14 303.90k

As an example, one factor has a level "303.90" and another level "303.90 obs". I want to change "303.90 obs" to "303.90". I am using the following command to edit the level names:

    data[] = as.data.frame(lapply(data, function(x) {x = gsub("303
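A minimal sketch of one way to rename levels across all factor columns, assuming the goal is to drop a trailing " obs" from level names. The helper `strip_obs` and the second column `gr2` are hypothetical and only serve the illustration. Editing `levels(x)` rather than the values keeps the columns as factors, and assigning duplicate level names merges the affected levels.

```r
data <- data.frame(
  gr1 = factor(c("303.90", "304.1", "303.6", "303.90 obs", "303.90k")),
  gr2 = factor(c("x", "y", "x", "y", "x"))  # hypothetical extra column
)

# Hypothetical helper: strip a trailing " obs" from the level names of a
# factor; anything that is not a factor is returned unchanged.
strip_obs <- function(x) {
  if (is.factor(x)) {
    levels(x) <- sub(" obs$", "", levels(x))
  }
  x
}

# data[] keeps the data.frame structure while replacing every column.
data[] <- lapply(data, strip_obs)
```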

Recursive regex in R for curly braces

南楼画角 submitted on 2020-01-05 03:54:04
Question: I have a text string in the following pattern:

    x = "sdfwervd \calculus{fff}{\trt{sdfsdf} & \trt{sdfsdf} & \trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf"

I want to use a regex to capture the text "fff" in the string \calculus{fff} and replace it with something else. Further, I want to capture the string between the first { after \calculus{.+} and its corresponding closing curly brace }. How can I do this with regex in R? The following captures everything till last curly
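A minimal sketch of how a recursive pattern might handle the nested braces, assuming the PCRE engine (`perl = TRUE`) and an R version where `regexec()` accepts `perl` (3.3.0 or later). The string is a shortened stand-in for the one in the question, and the names `pat`, `first_arg`, `balanced_block` and `replaced` are made up for illustration.

```r
# Shortened stand-in for the question's string. Backslashes are doubled
# because an R string literal needs "\\" to produce one backslash.
x <- "pre \\calculus{fff}{\\trt{a} & \\trt{b} \\\\{} c } post"

# Group 1: the argument of \calculus{...}.
# Group 2: a balanced {...} block; (?2) re-enters group 2 whenever a
# nested opening brace is met, so the match stops at the matching "}".
pat <- "\\\\calculus\\{([^{}]*)\\}(\\{(?:[^{}]++|(?2))*\\})"

m <- regmatches(x, regexec(pat, x, perl = TRUE))[[1]]
first_arg <- m[2]       # text inside \calculus{...}
balanced_block <- m[3]  # the following block, outer braces included

# Replacing only the first argument needs no recursion.
replaced <- sub("(\\\\calculus\\{)[^{}]*(\\})", "\\1NEW\\2", x, perl = TRUE)
```

The default regex engine in R does not support recursion, which is why `perl = TRUE` is needed for the balanced-brace part.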

Splitting the values in column using regex

落花浮王杯 submitted on 2020-01-03 17:42:31
Question: I have a data.frame with two columns like the following:

    dat
    ID    Details
    id_1  box1_homodomain gn=box1 os=homo sapiens p=4 se=1
    id_2  sox2_plurinet gn=plu os=mus musculus p=5 se=3

I would like to split out the "os=xxx" and "gn=yyy" parts of the "Details" column for all the IDs and print the result like this:

    Id    Description      gn    os
    Id_1  box1_homodomain  box1  homo sapiens
    Id_2  sox2_plurinet    plu   mus musculus

I tried the gsub approach in R but I am unable to split the os=homo sapiens and gn=box1 into their
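A minimal sketch of one way to pull the fields out with `sub()`, assuming every row has the layout shown in the question: a description, then space-separated `key=value` pairs with lowercase keys, where the `os` value may itself contain spaces. The output name `out` is made up for illustration.

```r
dat <- data.frame(
  ID = c("id_1", "id_2"),
  Details = c("box1_homodomain gn=box1 os=homo sapiens p=4 se=1",
              "sox2_plurinet gn=plu os=mus musculus p=5 se=3"),
  stringsAsFactors = FALSE
)

out <- data.frame(
  Id = dat$ID,
  # Everything before the first space.
  Description = sub(" .*$", "", dat$Details),
  # The run of non-space characters after "gn=".
  gn = sub("^.*gn=([^ ]+).*$", "\\1", dat$Details),
  # The shortest text after "os=" that is followed by the next " key=".
  os = sub("^.*os=(.*?) [a-z]+=.*$", "\\1", dat$Details, perl = TRUE),
  stringsAsFactors = FALSE
)
```

The lazy `(.*?)` is what lets the `os` value span two words without swallowing the later `p=` and `se=` fields.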

Remove specific last character from string

半世苍凉 submitted on 2020-01-03 10:45:30
Question: I have the following string vector:

    EC02 502R 603 515 602 KL07 601 511R 505R 506R 503 508 514 501 509R 510 501R 512R 516 507 604 502 601R SPK01 504 504R ACK01 503R 508R 507R ACK03 513 EC01 506 ECH01 ACK02 SPK02 509 511 512 505 KA01 RS01 510R SKL01 SPK03 603R 602R 604R 513R AECH01 ER03 AECH02 RS02 514R ER01 RH01 AR05 RH02 515R ER02 M01

I want to replace 502R with 502, 501R with 501, 503R with 503 and so on. Only an R occurring at the end of a string has to be removed. How can I do
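A minimal sketch using an anchored pattern, shown on a handful of values taken from the vector in the question. The names `x` and `cleaned` are made up for illustration. The `$` anchor restricts the match to the end of each string, so an R elsewhere (as in ER03 or RH01) is left alone.

```r
x <- c("EC02", "502R", "603", "511R", "ER03", "RH01", "601R")

# sub() is enough here because at most one match per string is possible
# when the pattern is anchored to the end.
cleaned <- sub("R$", "", x)
```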

Removing/replacing brackets from R string using gsub

萝らか妹 submitted on 2020-01-02 05:58:16
Question: I want to remove or replace the brackets "(" and ")" in my string using gsub. However, as shown below, it is not working. What could be the reason?

    > k<-"(abc)"
    > t<-gsub("()","",k)
    > t
    [1] "(abc)"

Answer 1: Using the correct regex works:

    gsub("[()]", "", "(abc)")

The additional square brackets mean "match any of the characters inside".

Answer 2: One possible way (along the lines of what the OP is trying) is:

    gsub("\\(|)","","(abc)")
    #[1] "abc"

`\(` => look for `(` character. `\` is needed as `(` a special
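For comparison, a short sketch of three equivalent ways to strip the brackets. In the original attempt, `"()"` is an empty capture group, which matches only the empty string, so nothing visible is removed. The result names `by_class`, `by_escape` and `by_fixed` are made up for illustration.

```r
k <- "(abc)"

# Character class: inside [...] the brackets are literal characters.
by_class <- gsub("[()]", "", k)

# Escaped alternation: "\\(" and "\\)" are literal brackets in a regex.
by_escape <- gsub("\\(|\\)", "", k)

# fixed = TRUE turns off regex interpretation, one character at a time.
by_fixed <- gsub(")", "", gsub("(", "", k, fixed = TRUE), fixed = TRUE)
```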

Escaping strings for gsub

岁酱吖の submitted on 2020-01-01 09:55:09
Question: I read a file:

    local logfile = io.open("log.txt", "r")
    data = logfile:read("*a")
    print(data)

Output:

    ...
    "(\.)\n(\w)", r"\1 \2"
    "\n[^\t]", "", x, re.S
    ...

Yes, the log file looks awful, as it is full of various commands. How can I call gsub to remove, for example, the "(\.)\n(\w)", r"\1 \2" line from the data variable? The snippet below does not work:

    s='"(\.)\n(\w)", r"\1 \2"'
    data=data:gsub(s, '')

I guess some escaping needs to be done. Is there an easy solution?

Update:

    local data = [["(\.)\n(\w)", r"\1 \2" "\n[^\t]", "",

replace words in R data.frames (Text Mining)

≯℡__Kan透↙ submitted on 2020-01-01 07:23:12
Question: I'm working on a text mining solution with SQL and R. First I import data into R from my SQL selection, and then I do data mining with it. Here is what I have:

    rawData = sqlQuery(dwhConnect,sqlString)
    a = data.frame(rawData$ENNOTE_NEU)

If I run a[[1]][1:3], you can see the structure:

    [1] lorem ipsum li ld ee wö wo di dd
    [2] la kdin di da dogs chicken
    [3] kd good i need some help

Now I want to do some data cleaning with my own dictionary. An example would be to replace li with lorem ipsum and
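A minimal sketch of dictionary-based replacement on whole words, assuming the dictionary is a named character vector. Only the `li` entry comes from the question; the `di` entry and the names `notes`, `dictionary` and `cleaned` are made up for illustration. The `\\b` word boundaries keep short abbreviations from matching inside longer words such as kdin.

```r
notes <- c("lorem ipsum li ld ee", "la kdin di da dogs chicken")

# Names are the abbreviations to find, values are their replacements.
dictionary <- c(li = "lorem ipsum", di = "dolor")  # "di" is hypothetical

cleaned <- notes
for (abbr in names(dictionary)) {
  # Match the abbreviation only when it stands alone as a word.
  pattern <- paste0("\\b", abbr, "\\b")
  cleaned <- gsub(pattern, dictionary[[abbr]], cleaned, perl = TRUE)
}
```

Because the replacements are applied one after another, a replacement text that itself contains a later abbreviation would be rewritten again, so the order of entries can matter.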

How to gsub on the text between two words in R?

左心房为你撑大大i submitted on 2019-12-31 03:32:09
Question: EDIT: I would like to place a \n before a specific unknown word in my text. I know that the first time the unknown word appears in my text, it will be between "Tree" and "Lake". Example text:

    text
    [1] "TreeRULakeSunWater"
    [2] "A B C D"

EDIT: "Tree" and "Lake" will never change, but the word between them is always changing, so I do not look for "RU" in my regex. What I am currently doing:

    if (grepl(".*Tree\\s*|Lake.*", text)) { text <- gsub(".*Tree\\s*|Lake.*", "\n\\1", text)}

The problem with what
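A minimal sketch of one way to do this with capture groups, assuming the newline should go directly before the word that sits between "Tree" and "Lake". The pattern in the question contains no parentheses, so `\\1` has nothing to refer to there. The names `out` and `word` are made up for illustration.

```r
text <- c("TreeRULakeSunWater", "A B C D")

# Three groups: "Tree", the unknown word (lazy, so it stops at the first
# "Lake"), and "Lake". The replacement puts a newline before group 2.
out <- sub("(Tree)(.*?)(Lake)", "\\1\n\\2\\3", text, perl = TRUE)

# The unknown word on its own, in case it is needed for later steps.
word <- sub(".*Tree(.*?)Lake.*", "\\1", text[1], perl = TRUE)
```

Strings without "Tree...Lake", such as the second element, are returned unchanged, so no `grepl()` check is needed before the call.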