How do I find the shortest overlapping match using regular expressions?

無奈伤痛 2020-12-15 08:03

I'm still relatively new to regex. I'm trying to find the shortest string of text that matches a particular pattern, but am having trouble if the shortest pattern is a sub

9 answers
  •  一个人的身影
    2020-12-15 08:22

    A Python loop that looks for the shortest match by brute force: it tries a match starting at each position of the string, from left to right, and keeps the shortest result:

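    # assumes my_regex is a compiled pattern (re.compile(...)) and string is the text to search
    shortest = None
    for i in range(len(string)):
        # match() anchors at the start of the slice, so this tries a match beginning at index i
        m = my_regex.match(string[i:])
        if m:
            mstr = m.group()
            if shortest is None or len(mstr) < len(shortest):
                shortest = mstr

    print(shortest)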
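
As a rough, self-contained sketch of the loop above: the function name `shortest_match`, the sample pattern `A.*?B.*?C`, and the sample text are all made up for illustration and are not taken from the question.

```python
import re

def shortest_match(regex, text):
    """Return the shortest match found by trying every start position, or None."""
    shortest = None
    for i in range(len(text)):
        # match() only succeeds if the pattern matches right at the start of text[i:]
        m = regex.match(text[i:])
        if m:
            mstr = m.group()
            if shortest is None or len(mstr) < len(shortest):
                shortest = mstr
    return shortest

# Hypothetical sample data (an assumption, not from the original post)
pattern = re.compile(r"A.*?B.*?C")
text = "A|B|A|B|C|D"

print(pattern.search(text).group())   # leftmost match: A|B|A|B|C
print(shortest_match(pattern, text))  # shortest match: A|B|C
```

A plain `search` stops at the leftmost match even with lazy quantifiers, which is why the loop has to retry from every later start position.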
    

    Another loop, this time letting re.findall do the hard work of finding all non-overlapping matches, then brute force testing each match right-to-left, looking for a shorter substring that still matches:

    # find all matches using findall
    # (findall returns whole matches only if the pattern has no capture groups)
    matches = my_regex.findall(string)

    # for each match, try to match right-hand substrings, shortest first
    shortest = None
    for m in matches:
        for i in range(-1, -len(m), -1):
            mstr = m[i:]
            if my_regex.match(mstr):
                break
        else:
            # no right-hand substring matched, so keep the whole match
            mstr = m

        if shortest is None or len(mstr) < len(shortest):
            shortest = mstr

    print(shortest)
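
A possible runnable variant of this second loop, under a few stated assumptions: the name `shortest_by_trimming` and the sample data are hypothetical, `finditer` is swapped in for `findall` so the code also works when the pattern contains capture groups, and the matched text of the suffix is kept rather than the whole suffix.

```python
import re

def shortest_by_trimming(regex, text):
    """Shrink each non-overlapping match from the left; return the shortest result or None."""
    shortest = None
    for found in regex.finditer(text):
        candidate = found.group()
        # try right-hand substrings of the match, shortest first
        for start in range(len(candidate) - 1, 0, -1):
            inner = regex.match(candidate[start:])
            if inner:
                # keep only the matched text, in case it ends before the suffix does
                candidate = inner.group()
                break
        if shortest is None or len(candidate) < len(shortest):
            shortest = candidate
    return shortest

# Hypothetical sample data (an assumption, not from the original post)
sample_regex = re.compile(r"A.*?B.*?C")
print(shortest_by_trimming(sample_regex, "A|B|A|B|C|D"))  # A|B|C
```

This version only inspects text inside the non-overlapping matches that `finditer` returns, so it does less work than retrying every start position of the full string.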
    
