Regex for managing escaped characters for items like string literals

别跟我提以往 2020-12-17 22:21

I would like to be able to match a string literal with the option of escaped quotations. For instance, I'd like to be able to search "this is a 'test with escaped\' val

6 answers
  •  半阙折子戏
    2020-12-17 22:53

    I think this will work:

    import re

    # Group 1 captures the body of the first single-quoted literal.
    regexc = re.compile(r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'")
    
    def check(test, base, target):
        # Search base for a literal and compare its body with target.
        match = regexc.search(base)
        assert match is not None, test+": regex didn't match for "+base
        assert match.group(1) == target, test+": "+target+" not found in "+base
        print("test %s passed" % test)
    
    check("Empty","''","")
    check("single escape1", r""" Example: 'Foo \' Bar'  End. """,r"Foo \' Bar")
    check("single escape2", r"""'\''""",r"\'")
    check("double escape",r""" Example2: 'Foo \\' End. """,r"Foo \\")
    check("First quote escaped",r"not matched\''a'","a")
    check("First quote escaped beginning",r"\''a'","a")
    

    The regular expression r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'" scans forward and accepts only the things we want inside the string:

    1. Characters that are neither a backslash nor a quote.
    2. An escaped quote (\').
    3. An escaped backslash (\\).
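
    To make the three alternatives easier to see, here is a sketch of the same pattern written with re.VERBOSE. The names STRING_LITERAL and literal_body are made up for this illustration, and the inner group is made non-capturing, which does not change what group 1 returns.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE so each part
# can carry a comment. STRING_LITERAL is a hypothetical name.
STRING_LITERAL = re.compile(r"""
    (?:^|[^\\])      # start of input, or any character that is not a backslash
    '                # opening quote
    (                # group 1: the body of the literal
      (?:
          [^\\']     # 1. a character that is neither a backslash nor a quote
        | \\'        # 2. an escaped quote
        | \\\\       # 3. an escaped backslash
      )*
    )
    '                # closing quote
""", re.VERBOSE)


def literal_body(text):
    """Return the body of the first single-quoted literal in text, or None."""
    match = STRING_LITERAL.search(text)
    return match.group(1) if match else None


print(literal_body(r"Example: 'Foo \' Bar' End."))  # Foo \' Bar
```

    Because only those three alternatives are allowed, a literal containing any other escape, such as r"'a\nb'", is not matched at all. If such escapes need to be supported, the alternation would have to be widened.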

    EDIT:

    Added a prefix, (?:^|[^\\]), to the front of the regex so that an escaped quote is not taken as the opening quote.
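
    A small sketch of what the prefix changes, using the input from the "First quote escaped beginning" test. The names BODY, without_prefix, with_prefix and with_lookbehind are made up here. The lookbehind variant is not part of the answer above; it is shown only as one possible alternative that avoids consuming the character before the opening quote.

```python
import re

# Body of the literal, shared by all three variants (hypothetical names).
BODY = r"((?:[^\\']|\\'|\\\\)*)"

without_prefix = re.compile("'" + BODY + "'")
with_prefix = re.compile(r"(?:^|[^\\])'" + BODY + "'")
# Alternative (an assumption, not from the answer): a negative lookbehind
# checks the preceding character without making it part of the match.
with_lookbehind = re.compile(r"(?<!\\)'" + BODY + "'")

sample = r"\''a'"  # an escaped quote followed by the literal 'a'

# Without the prefix, the escaped quote is taken as the opening quote.
print(repr(without_prefix.search(sample).group(1)))   # ''
# With the prefix, the escaped quote is skipped and the real literal is found.
print(repr(with_prefix.search(sample).group(1)))      # 'a'
# The prefix consumes one character, so group(0) holds more than the literal.
print(repr(with_prefix.search(sample).group(0)))      # "''a'"
print(repr(with_lookbehind.search(sample).group(0)))  # "'a'"
```

    With the prefix version, use group(1) rather than group(0), since the whole match may include the character in front of the opening quote.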
