Vera++ Tcl rule: list all local variables [closed]

Submitted by 浪子不回头ぞ on 2019-12-01 22:41:26
πάντα ῥεῖ

Parsing local (or any other) variable definitions with vera++ rules is pretty complicated, but doable. The basic C++ parsing and tokenizing is done by vera++ itself.

The basic approach is to use vera++'s getTokens function together with a small state machine that detects completed C++ statements. Gather the tokens (and probably their values as well, since you'll need the variable names later to build the list) and concatenate them until you have a complete statement. Once a statement is complete, use a regular expression to check whether it is a variable definition and extract the variable name from a submatch. You also need to track whether you're inside a {} block to know whether it is a local variable definition.
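The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual rule: the proc name collectStatements and its result layout are hypothetical, and it assumes tokens shaped the way getTokens returns them ({value line column name}) with token names such as leftbrace, rightbrace, semicolon and intlit. The token list is passed in as an argument so the sketch can be tried outside vera++.

```tcl
# Hypothetical helper: group a vera++-style token list into statements.
# Each token is assumed to be a list {value line column name}.
# Returns a list of {braceDepth startLine statement} triples.
proc collectStatements {tokens} {
    set depth 0       ;# current {} nesting level
    set statement ""  ;# token names gathered so far
    set startLine 0
    set result {}
    foreach token $tokens {
        lassign $token value line column name
        switch -- $name {
            space - newline - ccomment - cppcomment {
                # Whitespace and comments never contribute to a statement.
                continue
            }
            leftbrace {
                incr depth
                set statement ""
            }
            rightbrace {
                incr depth -1
                set statement ""
            }
            default {
                if {$statement eq ""} { set startLine $line }
                if {$name eq "identifier"} {
                    # Keep the identifier's text so it can be extracted later.
                    append statement " identifier#$value#"
                } else {
                    append statement " $name"
                }
                if {$name eq "semicolon"} {
                    lappend result [list $depth $startLine $statement]
                    set statement ""
                }
            }
        }
    }
    return $result
}

# Hand-written tokens for:   int g;
#                            void f() { int count = 0; }
# Inside a rule they would come from something like
#   getTokens $fileName 1 0 -1 -1 {}
set tokens [list \
    [list int 1 0 int] [list " " 1 3 space] [list g 1 4 identifier] \
    [list ";" 1 5 semicolon] [list "\n" 1 6 newline] \
    [list void 2 0 void] [list f 2 5 identifier] \
    [list "(" 2 6 leftparen] [list ")" 2 7 rightparen] \
    [list "\{" 2 9 leftbrace] \
    [list int 2 11 int] [list count 2 15 identifier] \
    [list = 2 21 assign] [list 0 2 23 intlit] [list ";" 2 24 semicolon] \
    [list "\}" 2 26 rightbrace]]

foreach entry [collectStatements $tokens] {
    lassign $entry depth line statement
    puts "depth=$depth line=$line:$statement"
}
```

A depth greater than zero only says the statement sits inside some {} block, which also covers class and namespace bodies. Constructs such as for (...; ...; ...) headers would be split at their inner semicolons, so a real rule needs more states than this.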

As a starting point, vera++'s rule T019, which checks for complete curly-braced blocks of code, contains a sample of a simple state machine that gathers tokens into statements.

I've parsed variable definitions with vera++ (to check various naming conventions), but unfortunately I can't post the complete code since it's proprietary work for my employer. I can, however, give you a snippet showing the regular expressions I'm using to check for variable declarations:

set isVar false
if {[regexp {\s+((extern\s+)?(static\s+|mutable\s+|register\s+|volatile\s+)?(const\s+)?)?((identifier#[^#]+#\s+colon_colon\s+)*identifier#[^#]+#)\s+(star\s+|const\s+|and\s+|less.*greater\s+|greater\s+)*(identifier#[^#]+#\s+colon_colon\s+)*identifier#([^#]+)#(\s+leftbracket.*rightbracket)?(\s+assign)?.*semicolon$} $statement m s1 s2 s3 s4 s5 s6 s7 s8 s9 s10]} {
    # Type given as a (possibly qualified) identifier; submatch 9 is the variable name.
    set locVarname $s9
    set isVar true
    set currentMatch $m
} elseif {[regexp {\s+((extern\s+)?(static\s+|mutable\s+|register\s+|volatile\s+)?(const\s+)?)?(char\s+|int\s+|short\s+|long\s+|void\s+|bool\s+|double\s+|float\s+|unsigned\s+|and\s+|star\s+|unsigned\s+)+(identifier#[^#]+#\s+colon_colon)*\s+identifier#([^#]+)#(\s+leftbracket.*rightbracket)?(\s+assign)?.*semicolon$} $statement m s1 s2 s3 s4 s5 s6 s7 s8]} {
    # Built-in type keywords; submatch 7 is the variable name.
    set locVarname $s7
    set isVar true
    set currentMatch $m
}

$statement contains the complete statement as mentioned before. Note that I concatenate the token value to the identifier token in the form identifier#<value># and use a regex group to extract it.
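To illustrate the identifier#<value># encoding in isolation, here is a heavily reduced pattern applied to hand-written statement strings. It is not one of the expressions above: it only handles a single built-in type keyword with optional pointer or reference markers, and both the helper name declaredName and the single-space token separator are assumptions made for this sketch.

```tcl
# Hypothetical, simplified check: one built-in type keyword, optional
# star/and markers, then the declared name encoded as identifier#<name>#.
proc declaredName {statement} {
    set pattern {^\s*(?:char|int|short|long|bool|double|float)\s+(?:(?:star|and)\s+)*identifier#([^#]+)#\s+(?:assign\s+.*)?semicolon$}
    if {[regexp $pattern $statement -> name]} {
        return $name
    }
    return ""
}

# A declaration: the submatch yields the variable name.
puts [declaredName " int identifier#count# assign intlit semicolon"]   ;# prints count
# A plain assignment has no leading type keyword, so nothing is extracted.
puts [declaredName " identifier#count# assign intlit semicolon"]       ;# prints an empty line
```

Because # cannot occur inside a C++ identifier, [^#]+ captures exactly the name without needing to know anything else about the token stream.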

Unfortunately, I think you're grossly underestimating the complexity of the task. The problem is that you can't make any guesses (however educated) about the contents of a C++ file unless you have really parsed it as defined in the C++ standard, and doing that is abysmally hard.

By now it should be obvious that the choice of programming language for implementing such parsing is not really important. You surely can implement this in Tcl, but then the question is not specific enough, as properly answering it in its current form would amount to posting ready-made parser code. Hence I voted to close your question as not constructive; I hope you'll understand.
