strsplit

splitting string expression at multiple delimiters in R

我与影子孤独终老i posted on 2019-12-02 07:59:11
I am trying to parse some math expressions in R, so I would like to split them at multiple delimiters (+, -, *, /, -(, +(, ), )+ and so on) to get the list of symbolic variables contained in the expression. For example, I would like 2*(x1+x2-3*x3) to return "x1", "x2", "x3". Is there a good way of doing this? Thanks.

There's probably a cleaner way of doing this, but does this cover your use case(s)?

    eqn = "3 + 2*(x1+x2-3*x3 - x1/x3) - 5"
    vars = unlist(strsplit(eqn, split="[-+*/)( ]|[^x][0-9]+|^[0-9]+"))
    vars = vars[nchar(vars)>0] # To remove empty strings
    vars
    [1] "x1" "x2" "x3" "x1" "x3"

If
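As a possible alternative to splitting on delimiters, base R's parser can list the variable names directly. The sketch below uses all.vars() on the parsed text; it is not the approach from the answer above, it assumes every expression is syntactically valid R, and it returns each name only once.

```r
# Sketch: let R's own parser find the symbols instead of splitting on operators.
# Assumption: the expression is valid R syntax.
eqn <- "3 + 2*(x1+x2-3*x3 - x1/x3) - 5"

# parse() turns the text into an expression; all.vars() lists the names used in it.
vars <- all.vars(parse(text = eqn))
print(vars)
```

Because the parser handles operators, parentheses and numeric literals itself, names such as x10 need no special handling in a regular expression.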

strsplit by spaces greater than one in R

﹥>﹥吖頭↗ posted on 2019-12-02 04:55:11
Given a string,

    mystr = "Average student score    88"

I wish to split it wherever there is more than one space, to obtain:

    "Average student score" "88"

I found that "\s+" splits on any number of spaces:

    strsplit(mystr, "\\s+")

But this is not what I want. Is there an option within strsplit to split strings based on a certain number of spaces (say, exactly k) or a rule on spaces (say, more than 1)?

Avinash Raj: You can specify this with a repetition quantifier.

    strsplit(mystr, "\\s{2,}")

The regex \\s{2,} matches two or more whitespace characters.

Source: https://stackoverflow.com/questions
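A small sketch of the difference between the two patterns. The gap before "88" is assumed to be three spaces, since the original spacing is not visible in the post; the string is built with strrep() so the spacing is explicit.

```r
# Assumption: three spaces separate the text from the number.
mystr <- paste0("Average student score", strrep(" ", 3), "88")

# "\\s+" splits on every run of whitespace, including single spaces.
every_gap <- strsplit(mystr, "\\s+")[[1]]

# "\\s{2,}" splits only where two or more whitespace characters occur together.
wide_gaps <- strsplit(mystr, "\\s{2,}")[[1]]

print(every_gap)
print(wide_gaps)
```

For exactly k spaces the quantifier would be {k}, for example " {3}", though that also matches inside longer runs.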

strsplit and lapply

给你一囗甜甜゛ posted on 2019-12-02 04:54:32
I have a string in some text of the form "12,34,77", including the quotation marks. I need to get each of those numbers into a list. I tried using lapply and strsplit:

    control2=lapply(strsplit(data$values,","),as.numeric)

but I get the error: non character argument. What am I doing wrong?

1) strapply

1a) scalar. Here is a one-liner using strapply from the gsubfn package:

    library(gsubfn)
    x <- '"12,34,567"'
    strapply(x, "\\d+", as.numeric, simplify = c)
    ## [1] 12 34 567

1b) vectorized. A vectorized version is even simpler; just remove the simplify = c like this:

    v <- c('"1,2,3"', '"8
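The "non character argument" error usually means strsplit received something other than a character vector. A common cause, and only an assumption here since the post does not show the data, is that data$values is a factor. A base-R sketch under that assumption, with a hypothetical stand-in for the column:

```r
# Hypothetical stand-in for data$values, assumed to be a factor.
values <- factor('"12,34,77"')

# Convert to character first, then drop the literal quotation marks.
clean <- gsub('"', "", as.character(values), fixed = TRUE)

# Split on commas and convert each piece to a number; one list element per input string.
nums <- lapply(strsplit(clean, ",", fixed = TRUE), as.numeric)
print(nums)
```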

How to extract numbers from text? [duplicate]

岁酱吖の posted on 2019-12-01 14:50:17
This question already has an answer here: Extracting unique numbers from string in R (7 answers)

I have the following text string:

    string <- "['CBOE SHORT-TERM VIX FUTURE DEC 2016', 81.64],\n\n ['CBOE SHORT-TERM VIX FUTURE JAN 2017', 18.36]"

Is there a simple way of extracting the numerical elements from the text without having to use

    string_table <- strsplit(string, " ")

and then selecting the n-th element and continuing to strsplit until I have what I need? The result should be:

    result <- c(2016, 81, 64, 2017, 18, 36)

Thank you.

We can use str_extract_all by specifying the pattern as one or more digits ( [0-9]+
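The same extract-all-matches idea can be sketched in base R with gregexpr() and regmatches(). This is an alternative to the stringr call mentioned above, not a reconstruction of it.

```r
string <- "['CBOE SHORT-TERM VIX FUTURE DEC 2016', 81.64],\n\n ['CBOE SHORT-TERM VIX FUTURE JAN 2017', 18.36]"

# gregexpr() finds every run of digits; regmatches() pulls out the matched text.
digits <- regmatches(string, gregexpr("[0-9]+", string))[[1]]

# The matches are character strings, so convert them to numbers.
result <- as.numeric(digits)
print(result)
```

Extracting what you want tends to be simpler than splitting on everything you do not want, because only one pattern has to be right.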

How to split a string on first number only

自作多情 posted on 2019-12-01 03:59:49
I have a dataset with street addresses that are formatted very differently. For example:

    d <- c("street1234", "Street 423", "Long Street 12-14", "Road 18A", "Road 12 - 15", "Road 1/2")

From this I want to create two columns: 1. X, with the street address, and 2. Y, with the number plus everything that follows. Like this:

    X            Y
    Street       1234
    Street       423
    Long Street  12-14
    Road         18A
    Road         12 - 15
    Road         1/2

So far I have tried strsplit and followed some similar questions here, for example: strsplit(d, split = "(?<=[a-zA-Z])(?=[0-9])", perl = TRUE). I just can't seem to find the correct regular
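One way to sketch this without strsplit is to locate the first digit with regexpr() and cut the string there. It assumes every address contains at least one digit (regexpr returns -1 otherwise); the names pos and out are only for illustration.

```r
d <- c("street1234", "Street 423", "Long Street 12-14",
       "Road 18A", "Road 12 - 15", "Road 1/2")

# Position of the first digit in each string.
# Assumption: each string has one; regexpr() gives -1 when there is no match.
pos <- as.integer(regexpr("[0-9]", d))

# Everything before the first digit, with the trailing space trimmed.
X <- trimws(substr(d, 1, pos - 1))

# The first digit and everything after it.
Y <- substr(d, pos, nchar(d))

out <- data.frame(X = X, Y = Y, stringsAsFactors = FALSE)
print(out)
```

Note that this keeps the original capitalisation, so "street1234" yields "street" rather than "Street".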

strsplit with vertical bar (pipe)

陌路散爱 posted on 2019-12-01 03:50:28
Here,

    > r<-c("AAandBB", "BBandCC")
    > strsplit(as.character(r),'and')
    [[1]]
    [1] "AA" "BB"

    [[2]]
    [1] "BB" "CC"

this works well, but

    > r<-c("AA|andBB", "BB|andCC")
    > strsplit(as.character(r),'|and')
    [[1]]
    [1] "A" "A" "|" "" "B" "B"

    [[2]]
    [1] "B" "B" "|" "" "C" "C"

here the result is not what I want. How can I get "AA" and "BB" when I use '|and'? Thanks in advance.

As you can read in ?strsplit, the split argument of strsplit is a regular expression. Hence you either need to escape the vertical bar (it is a special character)

    strsplit(r,split='\\|and')

or you can choose fixed=TRUE to indicate
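Both options from the answer, side by side, as a short sketch. In a regular expression an unescaped | means alternation, so '|and' reads as "the empty string or and", which is why the first attempt split between characters.

```r
r <- c("AA|andBB", "BB|andCC")

# Option 1: escape the vertical bar so the regex engine treats it literally.
escaped <- strsplit(r, "\\|and")

# Option 2: skip regular expressions and match the delimiter as a fixed string.
literal <- strsplit(r, "|and", fixed = TRUE)

print(escaped)
print(identical(escaped, literal))
```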

Extracting nth element from a nested list following strsplit - R

故事扮演 posted on 2019-11-30 14:01:27
I've been trying to understand how to deal with the output of strsplit a bit better. I often have data such as this that I wish to split:

    mydata <- c("144/4/5", "154/2", "146/3/5", "142", "143/4", "DNB", "90")
    #[1] "144/4/5" "154/2" "146/3/5" "142" "143/4" "DNB" "90"

After splitting, the results are as follows:

    strsplit(mydata, "/")
    #[[1]]
    #[1] "144" "4" "5"
    #[[2]]
    #[1] "154" "2"
    #[[3]]
    #[1] "146" "3" "5"
    #[[4]]
    #[1] "142"
    #[[5]]
    #[1] "143" "4"
    #[[6]]
    #[1] "DNB"
    #[[7]]
    #[1] "90"

I know from the strsplit help page that final empty strings are not produced. Therefore, there will be 1, 2 or
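A common idiom for pulling the nth piece out of each element is to apply the `[` function over the list. The sketch below is one possible approach rather than the answer from the original thread; elements with fewer than n pieces come back as NA.

```r
mydata <- c("144/4/5", "154/2", "146/3/5", "142", "143/4", "DNB", "90")
parts <- strsplit(mydata, "/")

# How many pieces each element produced.
n_pieces <- lengths(parts)

# Take the 2nd piece of every element; indexing past the end of a vector yields NA.
second <- sapply(parts, `[`, 2)

print(n_pieces)
print(second)
```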

apply strsplit rowwise

徘徊边缘 posted on 2019-11-29 21:54:00
I'm trying to split a string on "." and create additional columns with the two strings before and after the ".".

    tes<-c("1.abc","2.di","3.lik")
    dat<-c(5,3,2)
    h<-data.frame(tes,dat)
    h$num<-substr(h$tes,1,1)
    h$prim<-unlist(strsplit(as.character(h$tes),"\\."))[2]
    h$prim<-sapply(h$tes,unlist(strsplit(as.character(h$tes),"\\."))[2])

I'd like h$prim to contain "abc", "di", "lik". However, I'm not able to figure it out. I guess strsplit is not vectorized, but then I thought the sapply version should have worked. I assume it should be easy :-) Regards, //M

This should do the trick:

    R> sapply(strsplit(as
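strsplit is in fact vectorized: it returns a list with one element per input string. The attempts above fail because unlist() flattens that list before [2] is applied, so only a single value is selected. A sketch that keeps the list and picks one piece per row (the name parts is only for illustration):

```r
tes <- c("1.abc", "2.di", "3.lik")
dat <- c(5, 3, 2)
h <- data.frame(tes, dat, stringsAsFactors = FALSE)

# One list element per row; fixed = TRUE treats "." as a literal dot, not a regex wildcard.
parts <- strsplit(as.character(h$tes), ".", fixed = TRUE)

# Pick the 1st and 2nd piece of each row.
h$num <- sapply(parts, `[`, 1)
h$prim <- sapply(parts, `[`, 2)
print(h)
```

Unlike substr(h$tes, 1, 1), taking the first piece this way also works when the number before the dot has more than one digit.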
