Python regex to remove comments

野趣味 2020-12-06 15:27

How would I write a regex that removes all comments that start with the # and stop at the end of the line -- but at the same time exclude the first two lines which say

3 Answers
  •  独厮守ぢ
    2020-12-06 16:03

    Rather than using a regex, you can remove comments by parsing the Python code with tokenize.generate_tokens. The following is a slightly modified version of an example from the tokenize docs:

    import tokenize
    import io
    import sys
    if sys.version_info[0] == 3:
        StringIO = io.StringIO
    else:
        StringIO = io.BytesIO
    
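    def nocomment(s):
        """Return the source string s with every # comment removed."""
        result = []
        g = tokenize.generate_tokens(StringIO(s).readline)
        for toknum, tokval, _, _, _ in g:
            # Docstrings are STRING tokens, so only real comments are dropped.
            if toknum != tokenize.COMMENT:
                result.append((toknum, tokval))
        # With 2-tuples, untokenize rebuilds the code and may change spacing.
        return tokenize.untokenize(result)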
    
    with open('script.py', 'r') as f:
        content = f.read()
    
    print(nocomment(content))
    

    For example:

    If script.py contains

    def foo(): # Remove this comment
        ''' But do not remove this #1 docstring 
        '''
        # Another comment
        pass
    

    then the output of nocomment is

    def foo ():
        ''' But do not remove this #1 docstring 
        '''
    
        pass 
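
The question also asks to leave the first two lines alone, which nocomment above does not do. A minimal sketch of one way to handle that, assuming Python 3: use the tokenizer only to locate comments, then cut them out of the original lines so the remaining spacing stays as written. The name strip_comments and the keep_first parameter are hypothetical, and because the question text is cut off, the two header lines in the sample are only stand-ins.

```python
import io
import tokenize


def strip_comments(source, keep_first=2):
    """Remove # comments, leaving the first `keep_first` lines untouched.

    Hypothetical helper: the name and the keep_first parameter are
    illustrative assumptions, not part of the original answer.
    """
    # Split with the same reader the tokenizer uses, so row numbers match.
    lines = io.StringIO(source).readlines()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        row, col = tok.start  # row is 1-based, col is 0-based
        if tok.type == tokenize.COMMENT and row > keep_first:
            line = lines[row - 1]
            # A comment always runs to the end of its line, so cut at col,
            # drop the whitespace before it, and keep the line ending.
            ending = line[len(line.rstrip("\r\n")):]
            lines[row - 1] = line[:col].rstrip() + ending
    return "".join(lines)


# Stand-in input: the two header lines are an assumption for illustration.
sample = (
    "#!/usr/bin/env python\n"
    "# -*- coding: utf-8 -*-\n"
    "def foo(): # Remove this comment\n"
    "    ''' But do not remove this #1 docstring\n"
    "    '''\n"
    "    # Another comment\n"
    "    pass\n"
)
print(strip_comments(sample))
```

Unlike the untokenize approach, this keeps `def foo():` spelled exactly as in the input, and a line that held only a comment becomes an empty line.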
    
