Replace patterns that are inside delimiters using a regular expression call

轮回少年 2021-01-07 02:13

I need to clip out all the occurrences of the pattern '--' that are inside single quotes in a long string (leaving intact the ones that are outside single quotes).

5 answers
  •  没有蜡笔的小新
    2021-01-07 02:26

    If bending the rules a little is allowed, this could work:

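    import re

    # Group 1 captures the text leading up to a run of two or more dashes; group 2 is the run itself.
    p = re.compile(r"((?:^[^']*')?[^']*?(?:'[^']*'[^']*?)*?)(-{2,})")
    txt = "xxxx rt / $ 'dfdf--fggh-dfgdfg' ghgh- ffffdd -- 'dfdf' ghh-g '--ggh--' vcbcvb"
    # Put group 1 back and collapse the dash run to a single dash.
    print(p.sub(r'\1-', txt))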
    

    Output:

    xxxx rt / $ 'dfdf-fggh-dfgdfg' ghgh- ffffdd -- 'dfdf' ghh-g '-ggh-' vcbcvb
    

    The regex:

    (               # Group 1
      (?:^[^']*')?  # Start of string, up till the first single quote
      [^']*?        # Inside the single quotes, as few characters as possible
      (?:
        '[^']*'     # No double dashes inside these single quotes, jump to the next.
        [^']*?
      )*?           # as few as possible
    )
    (-{2,})         # The dashes themselves (Group 2)
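
The breakdown above can also be kept inside the code itself. The following is a minimal sketch, assuming Python 3, that compiles the same pattern with `re.VERBOSE`; the names `QUOTED_DASHES` and `collapse_quoted_dashes` are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as the one-liner, written with re.VERBOSE so that
# whitespace is ignored and comments can live inside the pattern.
QUOTED_DASHES = re.compile(
    r"""
    (                   # group 1: text leading up to the dashes
      (?:^[^']*')?      # start of string, up to the first single quote
      [^']*?            # inside the quotes, as few characters as possible
      (?:
        '[^']*'         # no double dashes here, jump past this quote pair
        [^']*?
      )*?               # as few repetitions as possible
    )
    (-{2,})             # group 2: the run of two or more dashes
    """,
    re.VERBOSE,
)


def collapse_quoted_dashes(text):
    """Keep group 1 and replace the matched dash run with a single dash."""
    return QUOTED_DASHES.sub(r"\1-", text)


txt = "xxxx rt / $ 'dfdf--fggh-dfgdfg' ghgh- ffffdd -- 'dfdf' ghh-g '--ggh--' vcbcvb"
print(collapse_quoted_dashes(txt))
```

On the sample string this should print the same line as the compact version.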
    

    If there were different delimiters for the start and end (for example ' to open and ` to close), you could use something like this:

    -{2,}(?=[^'`]*`)
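
A small sketch of how that lookahead behaves, assuming `'` opens a section and `` ` `` closes it (the sample string is invented for illustration). The lookahead only succeeds when the next delimiter after the dashes is a closing one, which is what marks the dashes as being inside a section.

```python
import re

# Assumption: ' opens a delimited section and ` closes it.
p = re.compile(r"-{2,}(?=[^'`]*`)")

sample = "a -- 'b--c` d -- 'e--` f"
result = p.sub("-", sample)
print(result)  # a -- 'b-c` d -- 'e-` f
```

Dashes outside a section see an opening `'` (or the end of the string) first, so the lookahead fails and they are left alone.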
    

    Edit: I realized that if the string does not contain any quotes, it will match all double dashes in the string. One way of fixing it would be to change

    (?:^[^']*')?
    

    in the beginning to

    (?:^[^']*'|(?!^))
    

    Updated regex:

    ((?:^[^']*'|(?!^))[^']*?(?:'[^']*'[^']*?)*?)(-{2,})
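
It may be worth testing the updated pattern on your own data. As far as I can tell, in Python's `re` the `(?!^)` branch only prevents a match from starting at index 0, so a quote-free string can still be matched from index 1 onward. The sketch below checks that, and shows a different technique (not the regex above): match each `'...'` pair and fix the dashes in a callback. It assumes quotes are balanced and paired left to right; `collapse_in_quotes` is a hypothetical helper name.

```python
import re

# The updated pattern from the answer, for comparison.
updated = re.compile(r"((?:^[^']*'|(?!^))[^']*?(?:'[^']*'[^']*?)*?)(-{2,})")


def collapse_in_quotes(text):
    """Collapse runs of 2+ dashes to one dash, only inside '...' pairs."""
    def fix(match):
        # match.group(0) is one complete quoted section, quotes included.
        return re.sub(r"-{2,}", "-", match.group(0))
    return re.sub(r"'[^']*'", fix, text)


txt = "xxxx rt / $ 'dfdf--fggh-dfgdfg' ghgh- ffffdd -- 'dfdf' ghh-g '--ggh--' vcbcvb"
no_quotes = "no quotes -- here -- at all"

print(collapse_in_quotes(txt))
print(collapse_in_quotes(no_quotes))      # unchanged
print(collapse_in_quotes("'a--b' -- c"))  # only the quoted run changes
print(updated.sub(r"\1-", no_quotes))     # compare with the line above
```

The callback version trades the single-regex approach for two simpler patterns; an unmatched trailing quote is simply left untouched.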
    
