Regular expression: Match everything after a particular word

孤街浪徒 2020-12-03 06:14

I am using Python and would like to match all the words after test till a period (full-stop) or space is encountered.

text = "test : match this

4 Answers
  •  清歌不尽
    2020-12-03 07:02

    In the general case, as the title says, you can capture everything after a given pattern with (.*), which matches any 0 or more chars other than a newline:

    import re
    p = re.compile(r'test\s*:\s*(.*)')
    s = "test : match this."
    m = p.search(s)           # Run a regex search anywhere inside a string
    if m:                     # If there is a match
        print(m.group(1))     # Print Group 1 value
    

    If you want . to match across multiple lines, compile the regex with re.DOTALL or re.S flag (or add (?s) before the pattern):

    p = re.compile(r'test\s*:\s*(.*)', re.DOTALL)
    p = re.compile(r'(?s)test\s*:\s*(.*)')
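
    To make the effect of the flag concrete, here is a minimal sketch. The two-line input string is a made-up example, not taken from the question:

```python
import re

# Hypothetical two-line input, chosen only to show the effect of the flag.
s = "test : match this.\nand this too."

# Without re.DOTALL, . does not match a newline, so the capture ends at the line break.
single = re.search(r'test\s*:\s*(.*)', s)
print(single.group(1))

# With re.DOTALL, . also matches newlines, so the capture runs to the end of the string.
multi = re.search(r'test\s*:\s*(.*)', s, re.DOTALL)
print(multi.group(1))
```

    The first call captures only the rest of the first line, while the second captures both lines.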
    

    However, for the sample string "test : match this." the capturing group returns "match this." with the trailing period included.

    You can add a \. after (.*) to make the regex engine stop before the last . on that line:

    test\s*:\s*(.*)\.
    

    Watch out for re.match(), since it only looks for a match at the beginning of the string (Avinash already pointed that out, but it is a very important note!).
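
    A short sketch of the difference, assuming a made-up input in which test is not the first word:

```python
import re

pattern = re.compile(r'test\s*:\s*(.*)\.')

# Hypothetical input: "test" does not sit at the very start of the string.
s = "a test : match this."

at_start = pattern.match(s)    # only tries to match at position 0
anywhere = pattern.search(s)   # scans the string for the first match

print(at_start)                # None, because the string starts with "a "
print(anywhere.group(1))       # match this
```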

    A sample Python code snippet:

    import re
    p = re.compile(r'test\s*:\s*(.*)\.')
    s = "test : match this."
    m = p.search(s)           # Run a regex search anywhere inside a string
    if m:                     # If there is a match
        print(m.group(1))     # Print Group 1 value
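
    Because (.*) is greedy, "the last ." matters once a line contains more than one period. The sketch below uses a made-up input with two periods. The [^.]* variant is not part of the answer above; it is shown only as one possible way to stop at the first period instead:

```python
import re

# Hypothetical input with two periods on the same line.
s = "test : match this. And this."

# Greedy (.*) followed by \. backtracks only as far as the LAST period.
greedy = re.search(r'test\s*:\s*(.*)\.', s)
print(greedy.group(1))         # match this. And this

# Assumption, not from the answer: a negated class stops at the FIRST period.
first = re.search(r'test\s*:\s*([^.]*)', s)
print(first.group(1))          # match this
```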
    

    If you want to make sure test is matched as a whole word, add \b before it: r'\btest\s*:\s*(.*)\.'. Do not remove the r prefix from the string literal, or '\b' will match a BACKSPACE char!
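
    A minimal sketch of both points, the word boundary and the raw-string prefix. The input containing "contest" is a made-up example:

```python
import re

word = re.compile(r'\btest\s*:\s*(.*)\.')

hit = word.search("test : match this.")
miss = word.search("contest : match this.")   # hypothetical input: "test" is inside "contest"

print(hit.group(1))    # match this
print(miss)            # None, there is no word boundary between "n" and "t"

# Without the r prefix, Python turns '\b' into a single backspace character
# before the regex engine ever sees it.
print('\b' == '\x08')  # True
print(len(r'\b'))      # 2, a backslash followed by the letter b
```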
