Remove XML comments using Regex in bash

清酒与你 2021-01-06 19:40

I want to remove XML comments in bash using regex-based tools (awk, sed, grep, ...). I have looked at other questions about this, but they are all missing something. Here's my XML code:

<
4 answers
  •  温柔的废话
    2021-01-06 20:10

    In the end, you're going to have to recommend to your client/friend/instructor that they need to install some kind of XML processor. xmlstarlet is a good command line tool, but there are any number (or at least some number greater than 2) of implementations of XSLT which can be compiled for any standard Unix, and in most cases also for Windows. You really cannot do much XML processing with regex-based tools, and whatever you do will be hard to read, harder to maintain, and likely to fail on corner cases, sometimes with disastrous consequences.
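
As a rough illustration of the XML-processor route, here is a minimal sketch. It assumes xmlstarlet is installed and on PATH, and relies on its `ed` subcommand with `-d`, which deletes the nodes matched by an XPath expression. The function name `strip_comments_xmlstarlet` and the single file argument are made up for this example.

```shell
# Hypothetical wrapper (the name is an assumption, not from the answer above).
# Deletes every comment node from the XML file given as $1 and writes the
# result to stdout. Requires xmlstarlet on PATH.
strip_comments_xmlstarlet() {
  xmlstarlet ed -d '//comment()' "$1"
}
```

It would be called as `strip_comments_xmlstarlet input.xml > output.xml`. Because the document is actually parsed, comment-like text inside CDATA sections or attribute values is left alone, which a regex cannot promise.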

    I haven't spent a lot of time polishing or reviewing the following little awk program. I think it will remove comments from compliant XML documents. Note that the following comment is not compliant, because "--" may not appear inside a comment:

    <!-- a comment containing -- in its body -->

    and it will not be treated correctly by my script.

    The following is also illegal (a comment must end with exactly "-->"), but since I've seen it in the wild and it wasn't hard to deal with, I did so:

    <!-- a comment ending with extra hyphens --->

    Here it is. No guarantees. I know that it's hard to read, and I wouldn't want to maintain it. It may well fail on arbitrary corner cases.

    awk 'in_comment&&/-->/{sub(/([^-]|-[^-])*--+>/,"");in_comment=0}
         in_comment{next}
         {gsub(/
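
For readers who want something they can run, here is a minimal sketch of the same line-oriented idea. It is not the author's original script: the function name `strip_xml_comments` and the exact regular expressions are assumptions chosen for illustration. It reads XML on stdin, removes comments that open and close on one line, and uses an `in_comment` flag to drop the lines of a comment that spans several lines. Like the program above, it accepts a terminator with extra hyphens and does not cope with `--` inside a comment body.

```shell
# Hypothetical helper (not from the original answer): strip XML comments
# from stdin and write the result to stdout.
strip_xml_comments() {
  awk '
    # Inside a multi-line comment: look for the terminator on this line.
    in_comment {
      # No terminator yet, so drop the whole line.
      if (!sub(/^([^-]|-[^-])*--+>/, "")) next
      in_comment = 0
    }
    {
      # Remove comments that open and close on this line.
      gsub(/<!--([^-]|-[^-])*--+>/, "")
      # An opener that is still present starts a multi-line comment:
      # delete the rest of the line and remember the state.
      in_comment = sub(/<!--.*/, "")
      print
    }
  '
}
```

Usage would look like `strip_xml_comments < input.xml > output.xml`. Being purely textual, it also strips comment-like text inside CDATA sections, and a line that held only a comment is printed as an empty line.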
         
     