gsub

r - remove last but one character from every field

那年仲夏 submitted on 2019-12-25 07:17:37
Question: The substr() function in R can isolate any character by position, e.g. substr(df$10,2,3). Using nchar(), it is also possible to work backwards from the end of the field to isolate a character in a position such as last but one: substr(df$10,nchar(df$10)-2,nchar(df$10)-1). However, I would like to know how to simply remove the last-but-one character of every field in a column of a data frame. I am having difficulty doing this. Any help would be great!
Answer 1: You can use a regular expression
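Since the answer above is cut off, here is a minimal sketch of one regular expression that would do this. It is an illustration, not necessarily the answerer's solution. The data frame `df`, the column name `field`, and the helper `remove_penultimate` are hypothetical stand-ins for the asker's data.

```r
# Hypothetical data; `field` stands in for the asker's column.
df <- data.frame(field = c("abcd", "xyz", "a"), stringsAsFactors = FALSE)

# Match the last two characters of each string and keep only the final one.
# Strings shorter than two characters do not match and are left unchanged.
remove_penultimate <- function(x) sub(".(.)$", "\\1", x)

df$field <- remove_penultimate(df$field)
df$field
# "abd" "xz" "a"
```

Because sub() is vectorised, the same call works on a whole column without an explicit loop.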

r - remove last but one character from every field

北城以北 submitted on 2019-12-25 07:17:16
Question: The substr() function in R can isolate any character by position, e.g. substr(df$10,2,3). Using nchar(), it is also possible to work backwards from the end of the field to isolate a character in a position such as last but one: substr(df$10,nchar(df$10)-2,nchar(df$10)-1). However, I would like to know how to simply remove the last-but-one character of every field in a column of a data frame. I am having difficulty doing this. Any help would be great!
Answer 1: You can use a regular expression

Regex gsub R differentiate between ellipsis and periods

扶醉桌前 submitted on 2019-12-25 03:35:34
Question: text="stack overflow... is a popular website." I want to separate punctuation marks from words. The output should be: "stack overflow ... is a popular website . " Of course, the command gsub("\\.", " \\. ", text, fixed = FALSE) returns: "stack overflow . . . is a popular website . " because it does not differentiate between periods and an ellipsis (suspension points). In short, when three periods are found together in the text, R should treat them as a single punctuation mark.
Answer 1: I think a
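The answer is truncated, so the following is only a sketch of one way to treat three consecutive periods as a single token: put the ellipsis first in an alternation so it is tried before the single period, and absorb any surrounding whitespace so the padding does not double up. The pattern is an assumption for illustration, not the answerer's code.

```r
text <- "stack overflow... is a popular website."

# Try "..." before a lone "."; \\s* swallows existing spaces around the mark
# so that the replacement leaves exactly one space on each side.
spaced <- gsub("\\s*(\\.{3}|\\.)\\s*", " \\1 ", text, perl = TRUE)
spaced
# "stack overflow ... is a popular website . "
```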

Ruby regex for stripping BBCode

久未见 submitted on 2019-12-25 02:57:45
Question: I'm trying to remove BBCode from a given string (just using gsub with some regex). Here's an example string: The [b]quick[/b] brown [url=http://example.com]fox[/url] jumps over the lazy dog [img=http://example.com/lazy_dog.png] And what I need that to output is: The quick brown fox jumps over the lazy dog So what's a way to do that? I've found various examples of doing this, but none have worked for my use case. One that I've tried: /\[(\w+)[^w]*?](.*?)\[\/\1]/ But that wouldn't catch the

R: gsub/replace only those occurrences following a keyword occurrence

删除回忆录丶 submitted on 2019-12-24 21:52:02
Question: I only want to replace string occurrences that follow a particular keyword/pattern, not those before it. In other words, do nothing until the first occurrence of the keyword pattern, and then start to gsub to the right of that keyword pattern. See below: gsub("\\[|\\]", "", "ab[ cd] ef keyword [ gh ]keyword ij ") Actual results: "ab cd ef keyword gh keyword ij " Desired results: "ab[ cd] [][asfg] ]] ef keyword gh keyword ij " [Edited to fix the results. I don't want to remove 'keyword'] [Edited to
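One way to sketch the behaviour described above is to split the string at the end of the first keyword occurrence, leave the left part untouched, and run gsub() only on the right part. The helper name `gsub_after_keyword` and its arguments are hypothetical, and the sketch handles a single string rather than a vector.

```r
# Apply gsub() only to the text that follows the first occurrence of `keyword`.
# Works on one string at a time; the keyword is matched literally.
gsub_after_keyword <- function(x, keyword, pattern, replacement = "") {
  pos <- regexpr(keyword, x, fixed = TRUE)
  if (pos == -1) return(x)                      # keyword absent: change nothing
  end <- pos + attr(pos, "match.length") - 1    # last character of the keyword
  before <- substr(x, 1, end)
  after  <- substr(x, end + 1, nchar(x))
  paste0(before, gsub(pattern, replacement, after))
}

res <- gsub_after_keyword("ab[ cd] ef keyword [ gh ]keyword ij ",
                          "keyword", "\\[|\\]")
res
```

The brackets in "ab[ cd]" survive because they sit before the first "keyword"; those after it are removed, and the keyword itself is kept.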

Changing row names in a data_frame from letters to numbers in R

我的梦境 submitted on 2019-12-24 21:14:31
Question: I have a group of datasets, from a survey applied to many different countries, which I want to combine to create a single merged data.frame. Unfortunately, for one of them, the variable names are different from the others, but they follow a pattern: whereas in the others the variables are named "VAR1", "VAR2", etc., in this one they are named "VAR_a", "VAR_b", etc. The code I've used so far to solve this problem is something like: names(df) <- gsub("_a", "01", names(df)) names(df) <-
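Rather than one gsub() call per letter, the letter suffix can be looked up in R's built-in `letters` vector and converted to its position in a single step. This is a sketch under assumptions: the helper `rename_letter_suffix` is hypothetical, names are assumed to have exactly the form "VAR_" plus one lowercase letter, and the zero-padded two-digit output ("VAR01") follows the asker's gsub("_a", "01", ...) call.

```r
# Convert names such as "VAR_a", "VAR_b" to "VAR01", "VAR02";
# names that do not match the pattern are returned unchanged.
rename_letter_suffix <- function(nms) {
  hit <- grepl("^VAR_[a-z]$", nms)
  idx <- match(sub("^VAR_", "", nms[hit]), letters)  # "a" -> 1, "b" -> 2, ...
  nms[hit] <- sprintf("VAR%02d", idx)
  nms
}

new_names <- rename_letter_suffix(c("id", "VAR_a", "VAR_b", "VAR_j"))
new_names
# "id" "VAR01" "VAR02" "VAR10"
```

On a data frame this would be applied as names(df) <- rename_letter_suffix(names(df)).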

R: gsub of exact full string with fixed = T

╄→尐↘猪︶ㄣ submitted on 2019-12-24 17:53:09
Question: I am trying to gsub an exact FULL string. I know I need to use ^ and $. The problem is that I have special characters in the strings (could be [ or .), so I need to use fixed=T. This overrides the ^ and $. Any solution is appreciated. I need to replace the 1st and 2nd elements in exact_orig with the 1st and 2nd elements from exact_change, but only if the full string is matched from beginning to end. exact_orig = c("oz","32 oz") exact_change = c("20 oz","32 ct") gsub_FixedTrue <- function(i) { for(k in seq_along
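When the whole string must match literally, a regular expression is not strictly needed: exact equality already treats characters such as [ and . as plain text. The sketch below uses match() for the lookup. The helper name `replace_exact` and the sample vector `x` are hypothetical; `exact_orig` and `exact_change` are taken from the question.

```r
exact_orig   <- c("oz", "32 oz")
exact_change <- c("20 oz", "32 ct")

# Replace an element only when it equals one of `from` in full;
# everything else is returned unchanged.
replace_exact <- function(x, from, to) {
  idx <- match(x, from)            # NA where there is no exact match
  ifelse(is.na(idx), x, to[idx])
}

x <- c("oz", "32 oz", "5 oz", "o.z")   # hypothetical input
out <- replace_exact(x, exact_orig, exact_change)
out
# "20 oz" "32 ct" "5 oz" "o.z"
```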

can't remove blank lines in txt file with R

眉间皱痕 submitted on 2019-12-24 07:16:34
Question: I am doing a text analysis with R and needed to convert the first letters of the sentences into lowercase while keeping the other capitalized words the way they are. So I used the command x <- gsub("(\\..*?[A-Z])", '\\L\\1', x, perl=TRUE) which worked, but only partially. The problem is that for the text analysis I had to convert the pdf files into txt format and now the txt files contain a lot of empty lines (page breaks, returns possibly), and therefore the command I used does not convert the
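The question is cut off before it shows how the file is read, so the following is only a sketch of one common way to drop blank lines. It assumes the text has been read into a character vector with one element per line (for example with readLines()); the vector `txt` is made-up sample data.

```r
# Hypothetical lines as they might come back from readLines().
txt <- c("First line.", "", "   ", "Second line.", "\t", "Third.")

# trimws() strips leading and trailing whitespace; nzchar() is FALSE for the
# lines that are empty afterwards, so those are filtered out.
non_blank <- txt[nzchar(trimws(txt))]
non_blank
# "First line." "Second line." "Third."
```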

Removing a pattern With gsub in r

元气小坏坏 submitted on 2019-12-24 03:48:36
Question: I have a string Project Change Request (PCR) - HONDA DIGITAL PLATEFORM saved in supp_matches, and supp_matches1 contains the string Project Change Request (PCR) - . supp_matches2 <- gsub("^.*[supp_matches1]","",supp_matches) supp_matches2 # [1] " (PCR) - HONDA DIGITAL PLATEFORM" This is not correct; it should come out as supp_matches2 # [1] "HONDA DIGITAL PLATEFORM" Why does it not come out the way it should?
Answer 1: As I say in my comment, in your expression gsub("^.*[supp_matches1]"
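Inside the quoted pattern, [supp_matches1] is a bracket expression that matches any one of the literal characters s, u, p, _, m, a, t, c, h, e, 1. It is not a reference to the variable, which is why the greedy ^.* stops at the "t" of "Request". A minimal sketch of one fix is to pass the variable itself as the pattern with fixed = TRUE, so that the parentheses are treated literally. The trailing space in `supp_matches1` is an assumption about the asker's data.

```r
supp_matches  <- "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"
supp_matches1 <- "Project Change Request (PCR) - "  # assumed to end with a space

# Original attempt: the bracket expression matches single characters only.
wrong <- gsub("^.*[supp_matches1]", "", supp_matches)

# Pass the variable as the pattern and match it as a literal string.
supp_matches2 <- sub(supp_matches1, "", supp_matches, fixed = TRUE)
supp_matches2
# "HONDA DIGITAL PLATEFORM"
```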

Regular expression for the “opposite” result

☆樱花仙子☆ submitted on 2019-12-24 01:47:15
Question: Take the following character vector x x <- c("1 Date in the form", "2 Number of game", "3 Day of week", "4-5 Visiting team and league") My desired result is the following vector, with the first capitalized word from each string and, if the string contains a - , also the last word. [1] "Date" "Number" "Day" "Visiting" "league" So instead of doing unlist(sapply(strsplit(x, "[[:blank:]]+|, "), function(y){ if(grepl("[-]", y[1])) c(y[2], tail(y,1)) else y[2] })) to get the result, I figured I
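The question breaks off before the asker's own attempt, so this is not the "opposite" regular expression being asked about. As an alternative sketch, the wanted words can be extracted directly with regmatches() instead of deleting everything else. The helper `pick_words` is hypothetical.

```r
x <- c("1 Date in the form", "2 Number of game", "3 Day of week",
       "4-5 Visiting team and league")

# Return the first capitalized word of a string and, when the string
# contains a hyphen, its last word as well.
pick_words <- function(s) {
  first <- regmatches(s, regexpr("[[:upper:]][[:lower:]]+", s))
  if (grepl("-", s, fixed = TRUE)) {
    c(first, sub(".*[[:blank:]]", "", s))  # greedy match leaves the last word
  } else {
    first
  }
}

words <- unlist(lapply(x, pick_words))
words
# "Date" "Number" "Day" "Visiting" "league"
```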