Matching an apostrophe only within a word or string

Question

I'm looking for a Python regex that, given 'didn't', returns only the character immediately preceded by an inner apostrophe, i.e. 't, and not the 'd or t' at the beginning and end.

I have tried (?=.*\w)^(\w|')+$ but it only matches the apostrophe at the beginning.

Some more examples:

'I'm' should only match 'm and not 'I

'Erick's' should only return 's and not 'E

The text will always start and end with an apostrophe and can include apostrophes within the text.

Answer 1:

To match an apostrophe inside a whole string, match it anywhere except at the start or end of the string:

(?!^)'(?!$)

See the regex demo.
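
A quick way to check this pattern from Python is sketched below. The variable names and the sample string are only illustrative; the pattern itself is the one above.

```python
import re

# An apostrophe that is neither the first nor the last character of the string
inner_apostrophe = re.compile(r"(?!^)'(?!$)")

s = "'didn't'"
# Offsets of every inner apostrophe; the wrapping quotes at offsets 0 and 7 are skipped
positions = [m.start() for m in inner_apostrophe.finditer(s)]
print(positions)  # [5]
```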

Often, the apostrophe is searched for only inside a word (in fact, between a pair of words where the second one is shortened). In that case, you may use

\b'\b

See this regex demo. Here, the ' is preceded and followed by a word boundary, so it must have a word char (a letter, digit or _) on each side. Yes, digits and the _ char are allowed on both sides.
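
To see which apostrophes \b'\b accepts, here is a small sketch with a few made-up sample strings (the samples and the results dict are assumptions for illustration):

```python
import re

# Apostrophe with a word char (letter, digit or _) directly on both sides
word_apostrophe = re.compile(r"\b'\b")

samples = ["'didn't'", "rock'n'roll", "5'9", "a_'_b", "'quoted'"]
# Map each sample to the offsets of the apostrophes that matched
results = {text: [m.start() for m in word_apostrophe.finditer(text)] for text in samples}
for text, offsets in results.items():
    print(text, offsets)
```

The digit and underscore samples match too, while the outer quotes of 'quoted' do not, since they have no word char on one side.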

If you need to match a ' only between two letters, use

(?<=[A-Za-z])'(?=[A-Za-z])    # ASCII only
(?<=[^\W\d_])'(?=[^\W\d_])    # Any Unicode letters

See this regex demo.
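
The difference between the two letter-only variants can be sketched as follows. The sample strings are assumptions chosen to show a digit, an underscore and a non-ASCII letter next to the apostrophe:

```python
import re

ascii_letters = re.compile(r"(?<=[A-Za-z])'(?=[A-Za-z])")   # ASCII only
any_letters = re.compile(r"(?<=[^\W\d_])'(?=[^\W\d_])")     # any Unicode letters

# "l'\u00e9t\u00e9" is "l'été" written with escapes to keep the source ASCII
samples = ["didn't", "5'9", "a_'_b", "l'\u00e9t\u00e9"]
results = {
    text: (bool(ascii_letters.search(text)), bool(any_letters.search(text)))
    for text in samples
}
for text, (ascii_hit, unicode_hit) in results.items():
    print(text, ascii_hit, unicode_hit)
```

Only the last sample differs: the ASCII variant rejects it because the char after the apostrophe is not in A-Za-z.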

As for the current question, here are several possible solutions:

import re

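s = "'didn't'"
# Plain string methods: strip the outer quotes, then take the char after the first inner '
print(s.strip("'")[s.strip("'").find("'")+1])  # t
# Any word char after a ' that follows a word char
print(re.search(r'\b\'(\w)', s).group(1))  # t
# Letters only (Unicode aware)
print(re.search(r'\b\'([^\W\d_])', s).group(1))  # t
# a-z letters only, case-insensitive
print(re.search(r'\b\'([a-z])', s, flags=re.I).group(1))  # t
# All matches in a longer string
print(re.findall(r'\b\'([a-z])', "'didn't know I'm a student'", flags=re.I))  # ['t', 'm']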

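The s.strip("'")[s.strip("'").find("'")+1] gets the character after the first ' after stripping the leading/trailing apostrophes. Note that if there is no inner apostrophe, find returns -1 and the expression yields the first character of the stripped string instead.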

The re.search(r'\b\'(\w)', s).group(1) solution gets the word char (\w: letters, digits and _; in Python 3 this covers Unicode word chars unless re.ASCII is passed; can be adjusted from here) that follows a ' preceded by a word char (due to the \b word boundary).

The re.search(r'\b\'([^\W\d_])', s).group(1) is almost identical to the above solution, but it only fetches a letter character, as [^\W\d_] matches any char other than a non-word char, a digit and _.

Note that the re.search(r'\b\'([a-z])', s, flags=re.I).group(1) solution is nearly identical to the above one, but [a-z] stays limited to the a-z range: you cannot make it Unicode aware with re.UNICODE.
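
To compare the three capture classes side by side, here is a sketch under Python 3 semantics. The first_char helper and the sample strings are hypothetical, added only for illustration; the helper returns None instead of raising when nothing matches.

```python
import re

patterns = {
    "word char": re.compile(r"\b'(\w)"),
    "any letter": re.compile(r"\b'([^\W\d_])"),
    "a-z letter": re.compile(r"\b'([a-z])", flags=re.I),
}

def first_char(pattern, text):
    """Return the captured char of the first match, or None if there is no match."""
    m = pattern.search(text)
    return m.group(1) if m else None

# A plain word, a digit case, and "'l'été'" written with escapes
samples = ["'didn't'", "'5'9'", "'l'\u00e9t\u00e9'"]
table = {name: [first_char(p, s) for s in samples] for name, p in patterns.items()}
for name, row in table.items():
    print(name, row)
```

Only \w captures the digit, and only the first two classes capture the accented letter.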

The last re.findall(r'\b\'([a-z])', "'didn't know I'm a student'", flags=re.I) just shows how to fetch multiple letter chars from a string input.
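
If the input may contain no inner apostrophe at all, calling .group(1) on the result of re.search raises AttributeError, because re.search returns None. A minimal sketch of a wrapper that avoids this by using re.findall is shown below; the function name chars_after_inner_apostrophes is hypothetical, and it assumes letters (not digits or _) are wanted.

```python
import re

# Letter that directly follows an apostrophe which itself follows a word char
_AFTER_APOSTROPHE = re.compile(r"\b'([^\W\d_])")

def chars_after_inner_apostrophes(text):
    """Return every letter that follows an in-word apostrophe (empty list if none)."""
    return _AFTER_APOSTROPHE.findall(text)

for sample in ["'didn't'", "'I'm'", "'Erick's'", "'hello'"]:
    print(sample, chars_after_inner_apostrophes(sample))
```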

Source: https://stackoverflow.com/questions/38758873/matching-an-apostrophe-only-within-a-word-or-string

Tags

python

regex