gsub

Replace characters in column names gsub

隐身守侯 submitted on 2019-11-29 02:14:28
I am reading in a bunch of CSVs that have stuff like "sales - thousands" in the title and come into R as "sales...thousands". I'd like to use a regular expression (or other simple method) to clean these up. I can't figure out why this doesn't work:

    #mock data
    a <- data.frame(this.is.fine = letters[1:5], this...one...isnt = LETTERS[1:5])

    #column names
    colnames(a)
    # [1] "this.is.fine" "this...one...isnt"

    #function to remove multiple spaces
    colClean <- function(x){ colnames(x) <- gsub("\\.\\.+", ".", colnames(x)) }

    #run function
    colClean(a)

    #names go unaffected
    colnames(a)
    # [1] "this.is.fine"
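A likely reason the names go unaffected: an R function works on a copy of its argument, and `colClean` as written ends with an assignment, so the modified data frame is never returned and nothing is assigned back to `a`. A minimal sketch of one fix, assuming the goal is to collapse each run of dots into a single dot:

```r
# Mock data from the question
a <- data.frame(this.is.fine = letters[1:5], this...one...isnt = LETTERS[1:5])

# Return the modified data frame so the caller can keep it
colClean <- function(x) {
  colnames(x) <- gsub("\\.\\.+", ".", colnames(x))
  x
}

# Assign the result back; calling colClean(a) alone leaves `a` unchanged
a <- colClean(a)
colnames(a)
# [1] "this.is.fine"  "this.one.isnt"
```

Assigning directly with `colnames(a) <- gsub("\\.\\.+", ".", colnames(a))` would do the same without a helper function.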

How to remove specific special characters in R

北战南征 submitted on 2019-11-29 01:31:23
I have some sentences like this one.

    c = "In Acid-base reaction (page[4]), why does it create water and not H+?"

I want to remove all special characters except for '?&+-/

I know that if I want to remove all special characters, I can simply use

    gsub("[[:punct:]]", "", c)
    "In Acidbase reaction page4 why does it create water and not H"

However, some special characters such as + - ? are also removed, which I intend to keep. I tried to create a string of special characters that I can use in some code like this

    gsub("[special_string]", "", c)

The best I can do is to come up with this

    cat("!\"#$%()*,
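Instead of listing every character to delete, one option is a negated bracket expression that names what to keep. The sketch below assumes the characters to keep are letters, digits, whitespace, and `' ? & + - /`; the variable names `sentence`, `keep_pattern`, and `cleaned` are illustrative, and `sentence` is used in place of `c` to avoid masking R's `c()` function.

```r
sentence <- "In Acid-base reaction (page[4]), why does it create water and not H+?"

# Match anything that is NOT alphanumeric, whitespace, or one of ' ? & + / -
# The hyphen comes last inside the brackets so it is taken literally
keep_pattern <- "[^[:alnum:][:space:]'?&+/-]"

cleaned <- gsub(keep_pattern, "", sentence)
cleaned
# [1] "In Acid-base reaction page4 why does it create water and not H+?"
```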

replacing the `'` char using awk

无人久伴 submitted on 2019-11-28 20:32:07
I have lines with a single : and a ' in them that I want to get rid of. I want to use awk for this. I've tried using:

    awk '{gsub ( "[:\\']","" ) ; print $0 }'

and

    awk '{gsub ( "[:\']","" ) ; print $0 }'

and

    awk '{gsub ( "[:']","" ) ; print $0 }'

None of them worked; they return the error

    Unmatched ".

When I put

    awk '{gsub ( "[:_]","" ) ; print $0 }'

then it works and removes all : and _ chars. How can I get rid of the ' char?

Dimitre Radoulov

You could use:

- Octal code for the single quote: [:\47]
- The single quote inside double quotes, but in that case special characters will be expanded by the

Writing R function with if environment

六月ゝ 毕业季﹏ submitted on 2019-11-28 14:31:07
I am trying to write a function which does different things, depending on the second argument. But I am getting an error all the time. Depending on the dimension of the matrix, the function should perform different tasks. Here is an example

    x<-cbind(X1,X2,X3)
    function<-function(x,hnrstr){
      if (hnrstr<-1){
        x<-data.frame(X1=x[1],X2=x[2],X3=x[3])
        y<-x
        y[ ,"X2","X3"]<- gsub(" {2, }"," ",y[ ,"X2","X3"])
      }
      if (hnrstr<-2){
        x<-data.frame(X1=x[1],X2=x[2])
        P<-x
      }
      if (hnrstr<-1){ x<-y }
      if (hnrstr<-2){ x<-P }
      return(x)
    }
    apply(x,c(3,3), function(x,1))

I am getting the error: Error in drop && !has.j :
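Several things in the snippet above would trip R regardless of the truncated error message: `function` is a reserved word and cannot be used as a function name, `if (hnrstr<-1)` assigns 1 to `hnrstr` instead of comparing, and `y[ ,"X2","X3"]` indexes three dimensions rather than two columns. The following is only a sketch of what the function may have been meant to do; the name `tidy_by_mode` and the sample matrix are hypothetical, and the guess is that mode 1 squeezes repeated spaces in `X2` and `X3` while mode 2 keeps only `X1` and `X2`.

```r
# Hypothetical sample input: a character matrix with named columns
x <- cbind(X1 = c("a", "b"), X2 = c("c  d", "e"), X3 = c("f   g", "h"))

tidy_by_mode <- function(x, hnrstr) {
  x <- as.data.frame(x, stringsAsFactors = FALSE)
  if (hnrstr == 1) {            # `==` compares; `<-` would assign
    cols <- c("X2", "X3")
    # Collapse runs of two or more spaces into one, column by column
    x[cols] <- lapply(x[cols], function(col) gsub(" {2,}", " ", col))
  } else if (hnrstr == 2) {
    x <- x[c("X1", "X2")]       # keep only the first two columns
  }
  x
}

out1 <- tidy_by_mode(x, 1)
out2 <- tidy_by_mode(x, 2)
```

The function is called directly on the whole matrix here; `apply()` takes a margin of 1 or 2 and a function object, so `apply(x, c(3,3), function(x,1))` is not a valid call.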

Replace first occurrence of “:” but not second in R

岁酱吖の submitted on 2019-11-28 13:25:48
In order to be able to process I'd like to replace the first occurrence of a : in a string (which is my marker, that a speech begins). text <- c("Mr. Mark Francois (Rayleigh) (Con): If the scheme was so poorly targeted, why were the Government about to roll it out to employees in the Department of Trade and Industry and the Department for Work and Pensions on the very day the Treasury scrapped it? The CBI and the TUC have endorsed the scheme, which has helped 500,000 people and their families to improve their computer skills. When the Chancellor announced the original concession, he told the
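For replacing only the first match, base R's `sub()` is the usual counterpart to `gsub()`: it stops after the first occurrence. A minimal sketch, using a shortened, made-up string in place of the full speech and `|` as an arbitrary replacement marker:

```r
# Hypothetical shortened example; the second colon should survive
speech <- "Mr. Mark Francois (Rayleigh) (Con): If the scheme was poor: why roll it out?"

# sub() replaces only the first match; gsub() would replace both colons
marked <- sub(":", "|", speech, fixed = TRUE)
marked
# [1] "Mr. Mark Francois (Rayleigh) (Con)| If the scheme was poor: why roll it out?"
```

Because `sub()` is vectorised, the same call works unchanged on a character vector of speeches.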

R remove multiple text strings in data frame

◇◆丶佛笑我妖孽 submitted on 2019-11-28 12:48:05
New to R. I am looking to remove certain words from a data frame. Since there are multiple words, I would like to define this list of words as a string, and use gsub to remove them. Then convert back to a data frame and maintain the same structure.

    wordstoremove <- c("ai", "computing", "ulitzer", "ibm", "privacy", "cognitive")

    a
    id  text             time  username
    1   "ai and x"       10    "me"
    2   "and computing"  5     "you"
    3   "nothing"        15    "everyone"
    4   "ibm privacy"    0     "know"

I was thinking something like:

    a2 <- apply(a, 1, gsub(wordstoremove, "", a)

but clearly this doesn't work, before converting back to a data frame.

wordstoremove
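One way to approach this without `apply()` is to collapse the word list into a single alternation pattern and run `gsub()` on the text column only, which leaves the data frame structure untouched. This is a sketch under the assumption that only the `text` column needs cleaning and that whole words should be matched; the `pattern` variable and the whitespace tidy-up step are additions for illustration.

```r
wordstoremove <- c("ai", "computing", "ulitzer", "ibm", "privacy", "cognitive")

a <- data.frame(
  id = 1:4,
  text = c("ai and x", "and computing", "nothing", "ibm privacy"),
  time = c(10, 5, 15, 0),
  username = c("me", "you", "everyone", "know"),
  stringsAsFactors = FALSE
)

# Build one pattern such as \b(ai|computing|...)\b so only whole words match
pattern <- paste0("\\b(", paste(wordstoremove, collapse = "|"), ")\\b")

# Remove the words, then squeeze and trim the leftover spaces
a$text <- gsub(pattern, "", a$text, perl = TRUE)
a$text <- trimws(gsub(" +", " ", a$text))
a$text
# [1] "and x"   "and"     "nothing" ""
```

The word boundaries matter: without `\\b`, the pattern `ai` would also be stripped from inside longer words.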

Extract part of string (till the first semicolon) in R

主宰稳场 submitted on 2019-11-28 10:17:06
I have a column containing values of 3 strings separated by semicolons. I need to extract just the first part of the string.

    Type <- c("SNSR_RMIN_PSX150Y_CSH;SP_12;I0.00V50HX0HY3000")

What I want is: get the first part of the string (up to the first semicolon).

    Output : SNSR_RMIN_PSX150Y_CSH

I tried gsub but was not able to understand it. Kindly let me know how we can do this efficiently in R.

You could try sub

    sub(';.*$','', Type)
    #[1] "SNSR_RMIN_PSX150Y_CSH"

It will match the pattern, i.e. from the first occurrence of ; to the end of the string, and replace it with ''

Or use

    library(stringi)
    stri_extract(Type,

R: rename subset of variables in data frame

笑着哭i submitted on 2019-11-28 09:23:31
Question

I'm renaming the majority of the variables in a data frame and I'm not really impressed with my method. Therefore, does anyone on SO have a smarter or faster way than the one presented below, using only base?

    data(mtcars)
    # head(mtcars)
    temp.mtcars <- mtcars
    names(temp.mtcars) <- c((x <- c("mpg", "cyl", "disp")), gsub('^', "baR.", setdiff(names (mtcars),x)))
    str(temp.mtcars)
    'data.frame': 32 obs. of 11 variables:
     $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
     $ cyl : num 6 6 4 6
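One base-only alternative, offered as a sketch rather than the definitive answer, is to index the names with a logical vector and prefix only the selected ones. Unlike the `setdiff()` version, this does not depend on the kept columns coming first; the names `keep` and `rename` are illustrative.

```r
temp.mtcars <- mtcars

# Columns whose names should stay as they are
keep <- c("mpg", "cyl", "disp")

# TRUE for every column that is to be renamed
rename <- !(names(temp.mtcars) %in% keep)

# Prefix only the selected names; positions are preserved
names(temp.mtcars)[rename] <- paste0("baR.", names(temp.mtcars)[rename])
names(temp.mtcars)
```

`paste0()` also states the intent more directly than `gsub('^', "baR.", ...)`, which relies on replacing the empty match at the start of each string.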

Escaping Apostrophes Using Gsub

北战南征 submitted on 2019-11-28 08:30:49
Question

I'm working in Ruby and I'm trying to escape ' characters to \' so that I can use them in SQL. I'm trying to use gsub, but it doesn't seem to be working.

    "this doesn't work".gsub /'/, '\\''      #=> "this doesnt workt work"
    "this doesn't work".gsub /'/, '\\\''     #=> "this doesnt workt work"
    "this doesn't work".gsub /'/, '\\\\''    #=> "this doesn\\'t work"
    "this doesn't work".gsub /'/, '\\\\\''   #=> "this doesn\\'t work"

I don't know if gsub is even the right method to be using, so I'm willing to try

Removing Whitespace From a Whole Data Frame in R

不打扰是莪最后的温柔 submitted on 2019-11-28 07:44:00
I've been trying to remove the white space that I have in a data frame (using R). The data frame is large (>1gb) and has multiple columns that contain white space in every data entry. Is there a quick way to remove the white space from the whole data frame? I've been trying to do this on a subset of the first 10 rows of data using:

    gsub( " ", "", mydata)

This didn't seem to work, although R returned an output which I have been unable to interpret.

    str_replace( " ", "", mydata)

R returned 47 warnings and did not remove the white space.

    erase_all(mydata, " ")

R returned an error saying 'Error:
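The puzzling output from `gsub(" ", "", mydata)` is probably because `gsub()` coerces its input with `as.character()`, which turns a data frame into one deparsed string per column. A common base R pattern is to apply `gsub()` column by column and assign back into `mydata[]` so the data frame structure is kept. The sketch below uses a small made-up `mydata` and assumes only character columns need cleaning:

```r
# Hypothetical stand-in for the real data
mydata <- data.frame(
  name = c(" a b ", "c d"),
  n = c(1, 2),
  stringsAsFactors = FALSE
)

# Clean character columns only; other column types pass through untouched
mydata[] <- lapply(mydata, function(col) {
  if (is.character(col)) gsub(" ", "", col, fixed = TRUE) else col
})
mydata$name
# [1] "ab" "cd"
```

`fixed = TRUE` skips the regex engine, which tends to help on large inputs; to strip tabs and other whitespace as well, a pattern such as `"[[:space:]]+"` without `fixed = TRUE` would be needed.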