Extract numbers between brackets within a string [duplicate]

眼角桃花

2 Answers

暗喜

2020-12-10 20:39
Here is a gsubfn solution:
```
library(gsubfn)

# x is the character vector of input strings from the question
strapplyc(x, "[(](\\d+)[)]", simplify = TRUE)
```
[(] matches an open paren, (\\d+) matches a string of digits creating a back-reference owing to the parens around it and finally [)] matches a close paren. The back-reference is returned.
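If you would rather not load a package, roughly the same idea can be sketched in base R. This is not the gsubfn approach itself, just an assumed equivalent: `gregexpr` locates every "(digits)" occurrence, `regmatches` pulls them out, and the parentheses are stripped afterwards because base R returns the whole match rather than the back-reference. The sample vector `x` is the one used elsewhere in this thread.

```r
x <- c("East Kootenay C (5901035) RDA 01011",
       "Thompson-Nicola J (Copper Desert Country) (5933039) RDA 02020")

# Find every "(digits)" occurrence in each string
hits <- regmatches(x, gregexpr("[(]\\d+[)]", x))

# Strip the surrounding parentheses from each match
ids <- lapply(hits, function(h) gsub("[()]", "", h))

unlist(ids)
```

Note that "(Copper Desert Country)" is skipped because it contains no digits, so each string contributes exactly one match here.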
慢半拍i

2020-12-10 20:53
There are many possible regular expressions to do this. Here is one:
```
x=c("East Kootenay C (5901035) RDA 01011","Thompson-Nicola J (Copper Desert Country) (5933039) RDA 02020")

gsub('.+\\(([0-9]+)\\).+?$', '\\1', x)
# [1] "5901035" "5933039"
```
Let's break down the syntax of that first expression '.+\\(([0-9]+)\\).+?$'
- .+ one or more of anything
- \\( parentheses are special characters in a regular expression, so to represent a literal ( I need to escape it with a \. I have to escape the backslash again for R (hence the two \s). The same goes for \\) and the closing paren.
- ([0-9]+) I mentioned special characters; here I use two. The first is the parentheses, which mark a group I want to keep. The second is the square brackets [ and ], which enclose a set of characters to match (here any digit from 0 to 9); the + means one or more of them. See ?regex for more information.
- .+?$ The final piece matches the rest of the string up to the end ($). Together with the greedy .+ at the start, this ensures that I am grabbing the LAST set of numbers in parens, as noted in the comments.
I could also use * instead of +, which would mean zero or more rather than one or more, in case your paren string comes at the beginning or end of a string.
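To make the difference between + and * concrete, here is a small sketch. The vector `y` is a hypothetical input made up for illustration, with the bracketed number at the very start of one string and at the very end of the other.

```r
# Hypothetical inputs: the bracketed number sits at the very start or end
y <- c("(5901035) RDA 01011", "East Kootenay C (5901035)")

# With .+ on both sides, at least one character is required before and
# after the parens, so neither string matches and both come back unchanged
res_plus <- gsub('.+\\(([0-9]+)\\).+?$', '\\1', y)

# With .* on both sides, zero surrounding characters are allowed
res_star <- gsub('.*\\(([0-9]+)\\).*$', '\\1', y)

res_plus
res_star
```

When a pattern does not match, `gsub` returns the input untouched rather than an error or NA, so a silent non-match like this is worth checking for in real data.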

The second piece of the gsub is what I am replacing the matched portion with. I used: \\1. This says use group 1 (the stuff inside the ( ) from above). Again the backslash has to be doubled: \1 is the back-reference, and R needs the backslash itself escaped inside a string.
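If the double escaping feels opaque, it may help to look at what the strings actually contain. The snippet below is only an illustration; the variable names and the "id: " prefix are made up for the example.

```r
# The R string literal '\\(' holds two characters: a backslash and a paren.
# That two-character text is what the regex engine sees.
pattern_piece <- '\\('
nchar(pattern_piece)

# Likewise '\\1' is a backslash followed by the digit 1
replacement <- '\\1'
nchar(replacement)

# The back-reference can be mixed with literal text in the replacement
labelled <- sub('.*\\(([0-9]+)\\).*', 'id: \\1',
                "East Kootenay C (5901035) RDA 01011")
labelled
```

Running `cat(pattern_piece)` prints the string without R's own escaping, which is a quick way to check what pattern you have really written.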

Clear as mud to be sure! Enjoy your data munging project!