How to do re.compile() with a list in Python

日久生厌 2020-12-04 20:08

I have a list of strings that I want to filter for strings that contain keywords.

I want to do something like:

fruit = re.compile('apple', '


        
5 Answers
  • 2020-12-04 20:43

    You can create a single regular expression that matches when any of the terms is found:

    >>> s, t = "A kiwi, please.", "Strawberry anyone?"
    >>> import re
    >>> pattern = re.compile('apple|banana|peach|plum|pineapple|kiwi', re.IGNORECASE)
    >>> pattern.search(s)
    <_sre.SRE_Match object at 0x10046d4a8>
    >>> pattern.search(t) # won't find anything
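
A minimal sketch of how the same idea applies to the original goal of filtering a list of strings. The names `keywords`, `lines` and `matching`, and the sample sentences, are assumptions made for illustration; they are not from the question.

```python
import re

# Hypothetical sample data, not taken from the original question.
keywords = ['apple', 'banana', 'kiwi']
lines = ['A kiwi, please.', 'Strawberry anyone?', 'Banana bread']

# One compiled pattern that matches if any keyword occurs, ignoring case.
pattern = re.compile('|'.join(keywords), re.IGNORECASE)

# Keep only the strings in which the pattern is found somewhere.
matching = [line for line in lines if pattern.search(line)]
print(matching)  # ['A kiwi, please.', 'Banana bread']
```

`search` returns a match object (truthy) or `None`, so it can be used directly as the filter condition.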
    
  • 2020-12-04 20:59

    Python 3.x update:

    import re

    fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']
    # Join the keywords into one alternation and wrap it in word boundaries.
    fruit = re.compile(r'\b(?:{0})\b'.format('|'.join(fruit_list)))
    
  • 2020-12-04 21:01

    You need to turn your fruit list into the string apple|banana|peach|plum|pineapple|kiwi so that it is a valid regex. The following should do this for you:

    fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']
    fruit = re.compile('|'.join(fruit_list))
    

    Edit: As ridgerunner pointed out in the comments, you will probably want to add word boundaries to the regex. Otherwise it will also match words like plump, which contain a fruit name as a substring.

    fruit = re.compile(r'\b(?:%s)\b' % '|'.join(fruit_list))
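
A small sketch of the word-boundary version in action. It also passes each keyword through `re.escape`, which is an addition not in the answer above: it only matters if a keyword contains regex metacharacters, which none of these do. The test sentences and the name `alternation` are made up for illustration.

```python
import re

fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']

# re.escape guards against keywords containing regex metacharacters
# (an assumption about possible inputs; these keywords have none).
alternation = '|'.join(re.escape(f) for f in fruit_list)
fruit = re.compile(r'\b(?:%s)\b' % alternation)

print(bool(fruit.search('a ripe plum')))     # True
print(bool(fruit.search('a plump pillow')))  # False
```

Without the `\b` anchors, the second sentence would match because `plum` is a substring of `plump`.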
    
  • 2020-12-04 21:01

    As you want exact matches, no real need for regex imo...

    fruits = ['apple', 'cherry']
    sentences = ['green apple', 'yellow car', 'red cherry']
    for s in sentences:
        if any(f in s for f in fruits):
            print(s, 'contains a fruit!')
    # green apple contains a fruit!
    # red cherry contains a fruit!
    

    EDIT: If you need access to the strings that matched:

    from itertools import compress
    
    fruits = ['apple', 'banana', 'cherry']
    s = 'green apple and red cherry'
    
    list(compress(fruits, (f in s for f in fruits)))
    # ['apple', 'cherry']
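
One caveat: `f in s` on strings is a substring test, so `'apple' in 'pineapple'` is `True`. If whole-word matches are wanted without a regex, a rough sketch is to compare sets of words instead. The punctuation stripped here and the sample sentence are assumptions made for illustration.

```python
fruits = {'apple', 'banana', 'cherry'}
s = 'green apple and red cherry, no pineapples'

# Split on whitespace and strip common punctuation from each token.
words = {w.strip('.,!?;:').lower() for w in s.split()}

# Set intersection keeps only keywords that appear as whole words.
found = sorted(fruits & words)
print(found)  # ['apple', 'cherry']
```

Here `pineapples` does not count as a match for `apple`, because whole tokens are compared.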
    
  • 2020-12-04 21:01

    Code:

    import re

    fruits = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']
    # One compiled pattern per fruit; fruit_test is True if any of them matches.
    fruit_re = [re.compile(fruit) for fruit in fruits]
    fruit_test = lambda x: any([pattern.search(x) for pattern in fruit_re])
    

    Example usage:

    fruits_veggies = ['this is an apple', 'this is a tomato']
    [fruit_test(s) for s in fruits_veggies]
    # [True, False]
    

    Edit: I realized Andrew's solution is better. You could improve fruit_test by using Andrew's regular expression:

    fruit_test = lambda x: andrew_re.search(x) is not None
    