Delete all comments in a file using sed

情书的邮戳 2020-12-17 00:14

How would you delete all comments (marked with #) from a file using sed, while leaving alone any '#' that appears inside a string?

This helped out a lot except for the string portion.

7 answers
  •  没有蜡笔的小新
    2020-12-17 01:08

    If # always means comment, and can appear anywhere on a line (like after some code):

    sed 's:#.*$::g' FILE
    

    If you want to change the file in place, add the -i switch (FILE stands for the path to your file):

    sed -i 's:#.*$::g' FILE
    

    This will delete from any # to the end of the line, ignoring any context. If you use # anywhere where it's not a comment (like in a string), it will delete that too.
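
    To make that caveat concrete, here is a minimal sketch run on a made-up two-line sample. The `sample` text and the variable names are invented for illustration and do not come from the question. The second line loses half of its string:

```shell
# Hypothetical input: one real inline comment, one '#' inside a string.
sample='x = 1  # set x
msg = "issue #42"'

# Blanket deletion: cut from the first '#' on each line to the end of the line.
stripped=$(printf '%s\n' "$sample" | sed 's:#.*$::')

printf '%s\n' "$stripped"
# First line keeps only:  x = 1   (plus the spaces that preceded the '#')
# Second line is cut to:  msg = "issue   (the string is broken)
```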

    If comments can only start at the beginning of a line, do something like this:

    sed 's:^#.*$::g' FILE
    

    If they may be preceded by whitespace, but nothing else, do this (\s is a GNU sed extension; [[:space:]] is the portable spelling):

    sed 's:^\s*#.*$::g' FILE
    

    These two are a little safer because they are unlikely to delete a valid use of # in your code, such as one inside a string. The trade-off is that a comment following code on the same line is left in place.
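
    A small sketch of the whitespace-anchored form, again on an invented sample (all names here are assumptions for illustration). It also shows a variant that deletes the comment lines outright rather than leaving them empty, using sed's `d` command:

```shell
# Hypothetical input covering full-line, indented, and inline '#'.
sample='# full-line comment
   # indented comment
x = 1  # inline comment
msg = "issue #42"'

# Blank out lines that start with optional whitespace followed by '#'.
# [[:space:]] is the POSIX bracket class; \s does the same job in GNU sed.
blanked=$(printf '%s\n' "$sample" | sed 's:^[[:space:]]*#.*$::')

# Variant: delete those lines entirely instead of leaving them empty.
deleted=$(printf '%s\n' "$sample" | sed '/^[[:space:]]*#/d')

printf '%s\n' "$deleted"
# x = 1  # inline comment
# msg = "issue #42"
```

    The inline comment on the third line survives in both variants, and the # inside the string is never touched.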

    Edit:

    There's not really a nice way of detecting whether something is in a string. I'd use the last two if that would satisfy the constraints of your language.

    The problem with detecting whether you're in a string is that regular expressions can't do everything. There are a few problems:

    • Strings can likely span lines
    • A regular expression can't tell the difference between apostrophes and single quotes
    • A regular expression can't match nested quotes (these cases will confuse the regex):

      # "hello there"
      # hello there"
      "# hello there"
      

    If double quotes are the only way strings are defined, double quotes will never appear in a comment, and strings cannot span multiple lines, try something like this:

    sed 's:#[^"]*$::g' FILE
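
    A quick check of this pattern on an invented sample (the text and variable names are illustrative assumptions). The third line deliberately breaks the "no double quotes in a comment" pre-condition to show what happens when it does not hold:

```shell
# Hypothetical input: '#' inside strings, plus a comment that contains quotes.
sample='msg = "issue #42"  # trailing comment
path = "a#b"
note = 1  # say "hi"'

# A '#' only matches if no double quote follows it before the end of the line.
result=$(printf '%s\n' "$sample" | sed 's:#[^"]*$::')

printf '%s\n' "$result"
# msg = "issue #42"      <- comment removed, string intact
# path = "a#b"           <- untouched
# note = 1  # say "hi"   <- comment kept, because it contains a quote
```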
    

    That's a lot of pre-conditions, but if they all hold, you're in business. Otherwise, I'm afraid you're SOL, and you'd be better off writing it in something like Python, where you can do more advanced logic.
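
    If only the "no double quotes in a comment" condition is the problem, one possible middle ground before reaching for another language is a pattern that steps over whole double-quoted strings and removes the first # found outside them. This is a different technique from the commands above, not something the answer itself proposes. It is sketched under the assumptions that strings use double quotes only, never span lines, and contain no escaped quotes; the sample input is invented for illustration:

```shell
# Hypothetical input: '#' inside strings, plus a comment that contains quotes.
sample='msg = "issue #42"  # trailing comment
path = "a#b"
note = 1  # say "hi"'

# -E enables extended regular expressions.
# The captured prefix is a run of ordinary characters (neither " nor #)
# and complete "..." strings; the first '#' after that prefix starts the comment.
result=$(printf '%s\n' "$sample" | sed -E 's/^(([^"#]|"[^"]*")*)#.*$/\1/')

printf '%s\n' "$result"
# msg = "issue #42"
# path = "a#b"
# note = 1
```

    Whitespace that preceded a removed comment is left behind. Escaped quotes, single-quoted strings, and multi-line strings would still confuse this pattern, which is where a small script with real parsing logic becomes the better tool.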
