Question
I want to extract decimal numbers (at least one digit on each side of the decimal point), but not those followed by a percent sign. I therefore believe I need a negative lookahead that checks whether the number is followed by a percent sign.
For clarity, I want to extract "123.123", but I do not want to extract "123.123%".
I have tried a dozen syntax arrangements but cannot find one that works. This successfully extracts the decimal pattern:
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+")
But I want to adapt it to return the second item only (since the first contains a percent sign).
I have tried various combinations of the following:
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+(!?=%)")
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+[!?%]")
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+!?%")
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+!?\\%")
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+(!?=\\%)")
# etc
Answer 1:
You may use
"\\d+\\.\\d++(?!%)"
The \d++(?!%) part matches one or more digits possessively, and the (?!%) negative lookahead is executed once, after all those digits are matched. It fails the match if there is a % after them.
The same can be written without a possessive quantifier as "\\d+\\.\\d+(?![%\\d])", where the (?![%\\d]) lookahead will also fail the match if there is a digit immediately to the right of the current location.
R demo:
> library(stringr)
> c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d++(?!%)")
[[1]]
character(0)
[[2]]
[1] "123.123"
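To see why the possessive quantifier (or the extra \\d in the lookahead) matters, it may help to compare against the plain greedy form. The following is a minimal sketch, assuming stringr is installed; the variable names greedy, possessive and no_digit are only illustrative.

```r
library(stringr)

x <- c("123.123%", "123.123")

# Greedy \d+ can backtrack: it gives up the last digit so that (?!%)
# is checked in front of "3" instead of "%", and a shortened match succeeds.
greedy <- str_extract_all(x, "\\d+\\.\\d+(?!%)")

# Possessive \d++ never gives digits back, so the lookahead is only
# checked after the full run of digits.
possessive <- str_extract_all(x, "\\d+\\.\\d++(?!%)")

# Alternative: also forbid a digit right after the match, which rules out
# the shortened match produced by backtracking.
no_digit <- str_extract_all(x, "\\d+\\.\\d+(?![%\\d])")

print(greedy[[1]])      # "123.12"
print(possessive[[1]])  # character(0)
print(no_digit[[1]])    # character(0)
```

The greedy version returns the truncated "123.12" for the first element, which is likely why a lookahead alone appears not to work.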
Answer 2:
If nothing else can follow the number, we may be able to simply anchor the pattern at the end of the string with $:
c("123.123%", "123.123") %>% str_extract_all(., "\\d+\\.\\d+$")
[[1]]
character(0)
[[2]]
[1] "123.123"
Answer 3:
We can fix this by adding the ^ and $ anchors at the beginning and end of the pattern, so that the whole string must be a decimal number:
c("123.123%", "123.123") %>%
str_extract_all(., "^[0-9]+\\.[0-9]+$")
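The anchor-based patterns above depend on each string containing nothing but the number. As a rough sketch of the difference, assuming stringr is installed and using a made-up input string s (not from the question), the anchored patterns find nothing when the numbers are embedded in longer text, while the lookahead pattern still does:

```r
library(stringr)

# Hypothetical input: numbers embedded in a longer sentence.
s <- "rate 5.5% and value 3.14 here"

# Anchored patterns require the number to sit at the end of the string
# (or to be the whole string), so neither matches here.
end_anchor  <- str_extract_all(s, "\\d+\\.\\d+$")[[1]]
both_anchor <- str_extract_all(s, "^[0-9]+\\.[0-9]+$")[[1]]

# The possessive pattern with a negative lookahead skips "5.5%"
# and extracts the other decimal.
lookahead <- str_extract_all(s, "\\d+\\.\\d++(?!%)")[[1]]

print(end_anchor)   # character(0)
print(both_anchor)  # character(0)
print(lookahead)    # "3.14"
```

For vectors where every element is just a single number, as in the question, all three approaches give the same result.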
Source: https://stackoverflow.com/questions/54552393/negative-lookahead-in-regex-to-exclude-percentage-in-r