Question
I'm trying to extract the characters before and after "/" with no success. The sentences look like this:
XXXX YYY ZZZ - AV HAHEHRS, 3061 - SDDW ASDA DDSF - SAO JOSE DOS CAMPOS / SP - CEP: 00000-000
Output should be
SAO JOSE DOS CAMPOS / SP
I'm trying str_extract(str, "- [a-zA-Z]{1,} / [a-zA-Z]{1,}") but it only returns
CAMPOS / SP
Answer 1:
Your character classes are missing the space, so they cannot match a multi-word name such as SAO JOSE DOS CAMPOS. Try:
str_extract(str, "- [a-zA-Z ]+ / [a-zA-Z ]+")
Note the space in the character class. Also, {1,} is the long form of +.
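The match will be "- SAO JOSE DOS CAMPOS / SP " (with the leading "- " and a trailing space, because the second character class also consumes the space before the next "-"). You must get rid of the "- " in a second step, or use a zero-width look-behind: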
str_extract(str, "(?<=- )[a-zA-Z ]+ / [a-zA-Z ]+")
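Look-behinds are supported by stringr's regex engine (ICU); in base R, gregexpr supports them only with perl = TRUE. The result still ends with a trailing space, which str_trim() removes.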
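For readers without stringr, here is a minimal base R sketch of the same look-behind pattern. It assumes the sample sentence from the question; the variable names x, m and city_state are illustrative only.

```r
# Sample input from the question.
x <- "XXXX YYY ZZZ - AV HAHEHRS, 3061 - SDDW ASDA DDSF - SAO JOSE DOS CAMPOS / SP - CEP: 00000-000"

# perl = TRUE selects the PCRE engine, which supports look-behinds.
m <- regexpr("(?<=- )[a-zA-Z ]+ / [a-zA-Z ]+", x, perl = TRUE)

# regmatches() pulls out the matched text; trimws() drops the trailing
# space that the second character class picks up.
city_state <- trimws(regmatches(x, m))
print(city_state)  # "SAO JOSE DOS CAMPOS / SP"
```

regexpr() returns only the first match per string; gregexpr() would be the choice if a sentence could contain several such pairs.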
For the sake of completeness, you could do this without regex: Split the input by '-', find the part that contains '/', trim. This might be faster than regex, too.
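The regex-free approach could be sketched roughly as follows in base R. Note one deliberate deviation: this sketch splits on " - " (with surrounding spaces) rather than a bare "-", on the assumption that fields are always separated that way, so the hyphen inside the CEP does not create an extra piece. Variable names are illustrative.

```r
# Sample input from the question.
x <- "XXXX YYY ZZZ - AV HAHEHRS, 3061 - SDDW ASDA DDSF - SAO JOSE DOS CAMPOS / SP - CEP: 00000-000"

# Split on the literal " - " separator; fixed = TRUE disables regex parsing.
parts <- strsplit(x, " - ", fixed = TRUE)[[1]]

# Keep the piece that contains a slash, then trim stray whitespace.
city_state <- trimws(parts[grepl("/", parts, fixed = TRUE)])
print(city_state)  # "SAO JOSE DOS CAMPOS / SP"
```

If more than one field can contain a slash, city_state will have several elements, so the caller would need to pick the right one.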
Source: https://stackoverflow.com/questions/48088845/extract-character-before-and-after