How to get all overlapping matches in Python regex that may start at the same location in a string?

栀梦 2020-12-20 00:06

How do I get all possible overlapping matches in a string in Python, with multiple starting and ending points?

I've tried using the regex module, instead of the default re m

2 Answers
  •  孤城傲影
    2020-12-20 00:51

    With simple patterns like yours, you can generate every contiguous slice of the string and test each one against the regex for a full match:

    import re
    
    def findall_overlapped(r, s):
        res = []                          # Resulting list
        reg = re.compile(r)               # Compile once, reuse for every slice
        for q in range(len(s)):           # Iterate over all start positions
            for w in range(q, len(s)):    # Iterate over all end positions at or after q
                cur = s[q:w+1]            # Currently tested slice
                if reg.fullmatch(cur):    # The regex must match the whole slice
                    res.append(cur)       # Append it to the resulting list
        return res
    
    rex = r'a\w+b'
    print(findall_overlapped(rex, 'axaybzb'))
    # => ['axayb', 'axaybzb', 'ayb', 'aybzb']
    


    WARNING: This won't work if the pattern checks left- or right-hand context with a lookahead or lookbehind at either end, since that context is lost when each slice is tested in isolation.
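
    If the pattern only needs left-hand context, one possible workaround is to match against the original string and pass `pos`/`endpos` to a compiled pattern's `fullmatch`, instead of slicing. This is a minimal sketch, not part of the answer above; the name `findall_overlapped_ctx` is made up for illustration. It relies on the fact that `pos` does not hide the text before it from a lookbehind. `endpos`, however, makes the string behave as if it ended there, so a lookahead at the end of the pattern still loses its context.

```python
import re

def findall_overlapped_ctx(pattern, s):
    """Return every substring s[q:w] that the pattern fully matches.

    Matching runs on the original string via pos/endpos, so a lookbehind
    at the start of the pattern can still see the text before q.
    (Hypothetical helper, shown only as a sketch.)
    """
    rx = re.compile(pattern)
    res = []
    for q in range(len(s)):                 # Every start position
        for w in range(q + 1, len(s) + 1):  # Every end position after q
            if rx.fullmatch(s, q, w):       # Full match of s[q:w], in place
                res.append(s[q:w])
    return res

# Only the 'a' preceded by 'x' qualifies as a starting point here.
print(findall_overlapped_ctx(r'(?<=x)a\w+b', 'axaybzb'))
# => ['ayb', 'aybzb']
```

    With the slicing version, the same lookbehind pattern would return an empty list, because each slice is matched without the `x` that precedes it.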
