Regex to match a C-style multiline comment

忘了有多久 2020-11-22 10:04

I have a string, e.g.

String src = "How are things today /* this is comment **/ and is your code  /** this is another comment */ working?";

7 Answers
  •  迷失自我
    2020-11-22 10:47

    The best multiline comment regex is an unrolled version of (?s)/\*.*?\*/ that looks like

    String pat = "/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/";
    

    See the regex demo and explanation at regex101.com.

    In short,

    • /\* - matches the comment start, /*
    • [^*]*\*+ - matches 0+ characters other than *, followed by 1+ literal *
    • (?:[^/*][^*]*\*+)* - 0+ sequences of:
      • [^/*][^*]*\*+ - a character other than / or * (matched with [^/*]), followed by 0+ non-asterisk characters ([^*]*), followed by 1+ asterisks (\*+)
    • / - the closing /
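
As a minimal sketch of how the pattern could be used to remove comments from a string: the class name CommentStripper and the helper strip are hypothetical names chosen for illustration, and the sample string assumes the question's comments are meant to read /* ... **/ and /** ... */.

```java
import java.util.regex.Pattern;

class CommentStripper {
    // Unrolled pattern from the answer: matches /* ... */ without lazy dot matching.
    static final Pattern COMMENT =
            Pattern.compile("/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/");

    // Hypothetical helper: removes every C-style comment found in src.
    static String strip(String src) {
        return COMMENT.matcher(src).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "How are things today /* this is comment **/ and is your code"
                + "  /** this is another comment */ working?";
        // Prints the sentence with both comments removed; the spaces that
        // surrounded each comment are left in place.
        System.out.println(strip(src));
    }
}
```

Because no flags are needed (the negated character classes already match line breaks), the same pattern works on comments that span several lines.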

    David's regex needs 26 steps to find the match in my example string, while my regex needs just 12. With huge inputs, David's regex is likely to fail with a stack overflow or a similar error, because lazy dot matching with .*? is inefficient: the engine expands the lazy pattern one character at a time and checks for the closing */ at every position, while my pattern consumes linear chunks of text in one go.
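
To check that the unrolled pattern is a drop-in replacement for the lazy-dot form, the sketch below collects the matches of both on the same input and compares them. The class name CommentPatternCheck, the helper findAll, and the sample input are assumptions made for illustration. It checks only that the results agree and does not measure step counts or timing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentPatternCheck {
    // Lazy-dot form mentioned in the answer; (?s) lets . match line breaks.
    static final Pattern LAZY = Pattern.compile("(?s)/\\*.*?\\*/");
    // Unrolled form from the answer.
    static final Pattern UNROLLED =
            Pattern.compile("/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/");

    // Hypothetical helper: returns every match of p in input, in order.
    static List<String> findAll(Pattern p, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Two complete comments (one multiline) and one unterminated comment.
        String input = "a /* one */ b /** two\n * lines **/ c /* unterminated";
        List<String> lazy = findAll(LAZY, input);
        List<String> unrolled = findAll(UNROLLED, input);
        System.out.println("same matches: " + lazy.equals(unrolled));
    }
}
```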
