Finding VBA Comments using RegEx

Submitted by 怎甘沉沦 on 2019-12-02 07:33:24

Maybe something like

^(?:[^"'\n]*("(?:[^"\n]|"")*"))*[^"]*'(.*)$

It handles multiple quoted strings on a line, as well as strings containing doubled double quotes (""), which I believe is how VBA escapes a quote inside a string.

(I guarantee it will fail in some cases, but probably will work in most ;)

Check it out here at regex101.
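A quick way to poke at the pattern outside regex101 is a few lines of Python. This is only a sketch: the helper name comment_text and the sample lines are made up for illustration, and the pattern is applied one line at a time. The last sample shows one of the failure cases mentioned above: the greedy [^"]* backtracks to the last apostrophe on the line, so a comment that itself contains an apostrophe gets cut short.

```python
import re

# First pattern from the answer, copied verbatim.
# Group 1 is the last string literal on the line,
# group 2 is the text after the apostrophe.
FIRST_PATTERN = re.compile(r'''^(?:[^"'\n]*("(?:[^"\n]|"")*"))*[^"]*'(.*)$''')


def comment_text(line):
    """Return the captured comment text, or None if the pattern does not match."""
    match = FIRST_PATTERN.match(line)
    return match.group(2) if match else None


# Illustrative sample lines (not taken from the original question).
samples = [
    'x = "He said ""hi""" \'greeting',  # doubled quotes inside a string
    'Debug.Print "It\'s"',              # apostrophe inside a string, no comment
    "x = 1 'it's a comment",            # apostrophe inside the comment itself
]

for line in samples:
    print(repr(line), "->", repr(comment_text(line)))
```

The first two samples come out as expected ('greeting' and None). The third returns only 's a comment', because the match starts at the second apostrophe.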

Edit

Added some of Comintern's examples and adjusted the regex. It still can't handle the bracketed identifiers though (I don't even know what those are :S see the last line). But it now handles his continued-line comments.

^(?:[^"'\n]*(?:"(?:[^"\n]|"")*"))*[^']*('(?:_\n|.)*)

Check it out here at regex101.
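The same kind of check for the adjusted pattern, again as a rough Python sketch with made-up variable names. It runs against a whole block of text with re.MULTILINE, since the point of the change is to follow a comment across a line continuation. The second and third samples show where it still goes wrong: the bracketed identifier, and a trailing underscore with no space before it (which VBA does not treat as a continuation, but _\n in the pattern does).

```python
import re

# Second pattern from the answer, copied verbatim.
# Group 1 is the comment, apostrophe included.
SECOND_PATTERN = re.compile(
    r'''^(?:[^"'\n]*(?:"(?:[^"\n]|"")*"))*[^']*('(?:_\n|.)*)''',
    re.MULTILINE,
)

# Comment continued onto a line that looks like a string literal.
continued = (
    "Debug.Print 42 'The next line... _\n"
    '"...is not a string literal"\n'
    "Debug.Print 1"
)
# Bracketed identifier: there is no comment on this line at all.
bracketed = 'Debug.Print [Evil:""Comment"\'here]'
# Underscore without a preceding space: not a continuation in VBA.
no_space = "Debug.Print 'not a continuation_\nDebug.Print 42"

continued_match = SECOND_PATTERN.search(continued)
bracketed_match = SECOND_PATTERN.search(bracketed)
no_space_match = SECOND_PATTERN.search(no_space)

for match in (continued_match, bracketed_match, no_space_match):
    print(repr(match.group(1)))
```

The first result is the full two-line comment. The second reports 'here] as a comment, and the third swallows the Debug.Print 42 line, so both are false positives.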

You can't find all of the comments (let alone string literals) in VBA code with regular expressions - period. Trust me, I tried during work on the Smart Indenter module of Rubberduck (in case that wasn't explicit enough - full disclosure, I'm a contributor). You'll need to actually parse the code. The first issue that you'll run into is line continuations:

'Comment with a line _
continuation

Debug.Print 'End of line comment _
with line continuation.

Debug.Print 'Multiple line continuation operators _ _
still work.

Debug.Print 'This is actually *not* a line continuation_
Debug.Print 42

This makes it difficult to identify string literals, especially if you're using line-by-line processing:

Debug.Print 42 'The next line... _
"...is not a string literal"

You also have to handle the old Rem comment syntax...

Rem old school comment

...which also supports line continuations:

Rem old school comment with line _
continuation.

You might be thinking "that can't be so bad, Rem has to start a line". If you are, you forgot about the statement separator (:)...

Debug.Print 42: Rem statement separator comment.

...or its evil twin the statement separator combined with a line continuation:

Debug.Print 42: Rem this can be _
continued too.

You covered a couple of the issues with sorting out string literals and comments like these...

Debug.Print "Unmatched double quotes." 'Comment"
Debug.Print "Interleaved single 'n double quotes." 'Comment"

...but what about bracketed identifiers like this beast (courtesy of @ThunderFrame)?

'No comments or strings in the line below.
Debug.Print [Evil:""Comment"'here] 

Note that the syntax highlighter SO uses doesn't even catch all of these bizarre corner cases.
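To give an idea of what "actually parse the code" can look like, here is a minimal character-by-character scanner in Python. It is a sketch under loud assumptions, not how Rubberduck does it: find_comments and comment_end are hypothetical names, a line continuation is assumed to be a physical line ending in a space followed by an underscore, Rem is only recognised at the start of a statement (start of a line or after a colon), and things like line numbers, labels, date literals and conditional compilation are ignored entirely.

```python
def comment_end(source, start):
    """Index (exclusive) where the comment starting at `start` ends."""
    pos = start
    while True:
        newline = source.find("\n", pos)
        if newline == -1:
            return len(source)
        # Assumption: a physical line continues only if it ends in " _".
        if not source.endswith(" _", start, newline):
            return newline
        pos = newline + 1


def find_comments(source):
    """Return every comment (marker included) in VBA-like source text."""
    comments = []
    i, n = 0, len(source)
    at_statement_start = True  # start of a line, or right after ':'
    while i < n:
        ch = source[i]
        is_rem = (
            at_statement_start
            and source[i:i + 3].lower() == "rem"
            and (i + 3 == n or source[i + 3] in " \t\n")
        )
        if ch == "'" or is_rem:
            end = comment_end(source, i)
            comments.append(source[i:end])
            i = end
        elif ch == '"':
            # String literal: "" is an escaped quote; stop at end of line.
            i += 1
            while i < n and source[i] != "\n":
                if source[i] == '"':
                    if source[i + 1:i + 2] == '"':
                        i += 2
                        continue
                    i += 1
                    break
                i += 1
            at_statement_start = False
        elif ch == "[":
            # Bracketed identifier: skip everything up to the closing ']'.
            while i < n and source[i] not in "]\n":
                i += 1
            if i < n and source[i] == "]":
                i += 1
            at_statement_start = False
        elif ch == "\n":
            # A code line ending in " _" continues the current statement.
            at_statement_start = not source.endswith(" _", 0, i)
            i += 1
        elif ch == ":":
            at_statement_start = True
            i += 1
        elif ch in " \t":
            i += 1
        else:
            at_statement_start = False
            i += 1
    return comments


example = "\n".join([
    "'Comment with a line _",
    "continuation",
    'Debug.Print "Unmatched double quotes." \'Comment"',
    "Debug.Print 42: Rem this can be _",
    "continued too.",
    "Debug.Print 'This is actually *not* a line continuation_",
    "Debug.Print 42",
    'Debug.Print [Evil:""Comment"\'here]',
])
found = find_comments(example)
for comment in found:
    print(repr(comment))
```

On this input it reports four comments and nothing for the last two lines. The point of the design is that strings, bracketed identifiers and comments are each consumed whole before the next character is classified, which is exactly the context a single regex does not carry.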

This should work:

("[^"]+"\s)?'.+

Tested here: https://regex101.com/r/dd60QS/1
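For comparison with the cases raised above, here is a small Python check of this pattern (a sketch; the sample lines are invented for illustration). It finds a plain end-of-line comment, but because the quoted-string group is optional, an apostrophe inside a string literal is enough to trigger a match.

```python
import re

# Third pattern from the thread, copied verbatim.
THIRD_PATTERN = re.compile(r'''("[^"]+"\s)?'.+''')

plain = "Debug.Print 42 'answer"
in_string = 'Debug.Print "It\'s fine"'  # no comment on this line

plain_match = THIRD_PATTERN.search(plain)
in_string_match = THIRD_PATTERN.search(in_string)

print(repr(plain_match.group(0)))
print(repr(in_string_match.group(0)))
```

The second line has no comment, yet the pattern matches from the apostrophe in It's to the end of the line. Since .+ stops at a newline, continued comments are also cut off at the first line.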
