How to concisely cascade through multiple regex statements in Python

天涯浪人 2020-12-08 05:51

My dilemma: I'm passing my function a string that I need to then perform numerous regex manipulations on. The logic is if there's a match in the first regex, do one thing

7 answers
  • 2020-12-08 05:59

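    Hmm... you could build something on the with statement. Note that a with block cannot skip its own body, so each block still needs an if on the match:

    import re
    
    class rewrapper:
        def __init__(self, pattern, target):
            # re.match is an assumption; use re.search to match anywhere
            self.match = re.match(pattern, target)
    
        def __enter__(self):
            return self.match  # None when the pattern does not match
    
        def __exit__(self, exc_type, exc_value, traceback):
            return False  # do not swallow exceptions
    
    
    with rewrapper("regex1", string) as match:
        if match:
            pass  # handle the regex1 case
    
    with rewrapper("regex2", string) as match:
        if match:
            pass  # and so forth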
    
  • 2020-12-08 06:01

    Similar question from back in September: How do you translate this regular-expression idiom from Perl into Python?

    Using global variables in a module may not be the best way to do it, but you can convert the idea into a class:

    import re
    
    class Re(object):
      def __init__(self):
        self.last_match = None
      def match(self,pattern,text):
        self.last_match = re.match(pattern,text)
        return self.last_match
      def search(self,pattern,text):
        self.last_match = re.search(pattern,text)
        return self.last_match
    
    gre = Re()
    # text is the string being examined
    if gre.match(r'foo',text):
      pass  # do something with gre.last_match
    elif gre.match(r'bar',text):
      pass  # do something with gre.last_match
    else:
      pass  # do something else
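
On Python 3.8 or later, an assignment expression can bind the match inside the condition itself, which gives the same if/elif cascade without a helper object. This is a different technique from the class above, shown only as a minimal sketch; the patterns and the function name classify are hypothetical.

```python
import re

def classify(text):
    """Return a label for the first pattern that matches text (hypothetical helper)."""
    # := binds the match object and tests it in one expression (Python 3.8+).
    if m := re.match(r'foo(\d+)', text):
        return 'foo with number ' + m.group(1)
    elif m := re.match(r'bar(\w+)', text):
        return 'bar with word ' + m.group(1)
    else:
        return 'no match'

print(classify('foo42'))
print(classify('barbaz'))
print(classify('other'))
```

Each branch only evaluates its regex if the earlier ones failed, so the cascade stops at the first hit.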
    
  • 2020-12-08 06:06

    Are the manipulations for each regex similar? If so, try this:

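    import re
    
    def first_match(string):  # hypothetical wrapper so that return is valid
        for regex in ('regex1', 'regex2', 'regex3', 'regex4'):
            match = re.match(regex, string)
            if match:
                result = match.group(0)  # placeholder: manipulate match.group(n) here
                return result
        return None  # no regex matched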
    
  • 2020-12-08 06:12
    import operator as op
    import re
    
    class RegexStore(object):
       _searches = None
    
       def __init__(self, pat_list):
          # build RegEx searches
          self._searches = [(name,re.compile(pat, re.VERBOSE)) for
                            name,pat in pat_list]
    
       def match( self, text ):
          # lazily pair each name with its match result (None on failure)
          match_all = ((x,y.match(text)) for x,y in self._searches)
          try:
             # first (name, match) pair whose match is not None
             return next(filter(op.itemgetter(1), match_all))
          except StopIteration:
             # instead of 'name', in first arg, return bad 'text' line
             return (text,None)
    

    You can use this class like so:

    rs = RegexStore( (('pat1', r'.*STRING1.*'),
                      ('pat2', r'.*STRING2.*')) )
    name,match = rs.match( "MY SAMPLE STRING1" )
    
    if name == 'pat1':
       print('found pat1')
    elif name == 'pat2':
       print('found pat2')
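
A related idea is to join the named patterns into one alternation and read the name of the winning alternative from the match object's lastgroup attribute. This is a minimal sketch, not the class above; it assumes each name is a valid Python identifier, and PATTERNS, COMBINED and which_pattern are hypothetical names.

```python
import re

# Hypothetical (name, pattern) pairs; each becomes a named group.
PATTERNS = [
    ('pat1', r'.*STRING1.*'),
    ('pat2', r'.*STRING2.*'),
]

# Builds (?P<pat1>.*STRING1.*)|(?P<pat2>.*STRING2.*)
COMBINED = re.compile(
    '|'.join('(?P<%s>%s)' % (name, pat) for name, pat in PATTERNS)
)

def which_pattern(text):
    """Return (name, match) for the alternative that matched, else (None, None)."""
    m = COMBINED.match(text)
    if m is None:
        return (None, None)
    # lastgroup holds the name of the named group that matched
    return (m.lastgroup, m)

name, match = which_pattern("MY SAMPLE STRING1")
print(name)
```

Alternatives are tried left to right, so the order of the pairs decides which name is reported when several could match.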
    
  • 2020-12-08 06:13

    Generally speaking, in these sorts of situations, you want to make the code "data driven". That is, put the important information in a container, and loop through it.

    In your case, the important information is (regex string, function) pairs.

    import re
    
    def fun1():
        print('fun1')
    
    def fun2():
        print('fun2')
    
    def fun3():
        print('fun3')
    
    regex_handlers = [
        (r'regex1', fun1),
        (r'regex2', fun2),
        (r'regex3', fun3)
        ]
    
    def example(string):
        for regex, fun in regex_handlers:
            if re.match(regex, string):
                fun()  # call the function
                break
    
    example('regex2')
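
If the handlers need the captured groups, the same table can pass the match object along. A minimal sketch under assumed patterns; handle_number, handle_word, HANDLERS and dispatch are hypothetical names used only for illustration.

```python
import re

def handle_number(match):
    # Hypothetical handler: double the captured integer.
    return int(match.group(1)) * 2

def handle_word(match):
    # Hypothetical handler: upper-case the captured word.
    return match.group(1).upper()

# Patterns are compiled once; list order decides which handler wins.
HANDLERS = [
    (re.compile(r'num:(\d+)'), handle_number),
    (re.compile(r'word:(\w+)'), handle_word),
]

def dispatch(string):
    """Call the handler of the first matching pattern, or return None."""
    for pattern, handler in HANDLERS:
        match = pattern.match(string)
        if match:
            return handler(match)
    return None

print(dispatch('num:21'))
print(dispatch('word:abc'))
```

Adding a new case then means adding one pair to the list, with no change to the loop.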
    
  • 2020-12-08 06:13

    I had the same problem as yours. Here's my solution:

    import re
    
    regexp = {
        'key1': re.compile(r'regexp1'),
        'key2': re.compile(r'regexp2'),
        'key3': re.compile(r'regexp3'),
        # ...
    }
    
    def test_all_regexp(string):
        for key, pattern in regexp.items():
            m = pattern.match(string)
            if m:
                # do what you want
                break
    

    It's a slightly modified version of the solution from an answer to "Extracting info from large structured text files".
