ctags and R regex

自闭症网瘾萝莉.ら 提交于 2019-12-07 03:39:25

Those patterns don't appear to be correct, because a variable is only identified if it is assigned a value that has the same length as "function" but does not share any of its characters. E.g.:

x <- aaaaaaaa

The following ctags configuration should work properly:

--langdef=R
--langmap=R:.R.r
--regex-R=/^[ \t]*"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t]function[ \t]*\(/\1/f,Functions/
--regex-R=/^"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t][^\(]+$/\1/g,GlobalVars/
--regex-R=/[ \t]"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t][^\(]+$/\1/v,FunctionVariables/

The idea here is that a function must be followed by a parenthesis while the variable declarations do not. Given the limitations of a regex parser this parenthesis must appear in the same line as the keyword "function" for this configuration to work.

Update: R support will be included in Universal Ctags, so give it a try and report bugs and missing features.

The problem with the original regex is that it does not capture variable declarations / assignments that are shorter than 8 characters (since the regex requires 8 characters that are NOT f-u-n-c-t-i-o-n in that specific order).

The problem with the regex updated by Vitor is that, it does not capture variable declarations / assignments which incorporate parantheses, a situation which is quite common in R.

And a forgotton issue is that, neither of them captures superassigned objects with <<- (only local assignment with <- is covered).

As I checked the Universal Ctags repo, although pcre regex is planned to be supported as also raised in Issue 519 and a commented out pcre flag exists in the configuration file, there is not yet support for pcre type positive/negative lookahead or lookbehind expressions in ctags unfortunately. When that support starts, things will be much easier.

My solution, first of all, takes into account superassignment by "<{1,2}-" and also the fact that the right side of assignment can include:

  • either a sequence of 8 characters which are not f-u-n-c-t-i-o-n followed by any or no characters (.*)
  • OR a sequence of at most 7 any characters (.{1,7})

My proposed regex patterns are as such:

--langdef=R
--langmap=R:.R.r
--regex-R=/^[ \t]*"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t]function[ \t]*\(/\1/f,Functions/
--regex-R=/^"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<{1,2}-[ \t]([^f][^u][^n][^c][^t][^i][^o][^n].*|.{1,7}$)/\1/g,GlobalVars/ 
--regex-R=/[ \t]"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<{1,2}-[ \t]([^f][^u][^n][^c][^t][^i][^o][^n].*|.{1,7}$)/\1/v,FunctionVariables/

And my tests show that, it captures much more objects than the previous ones.

Edit: Even these do not capture the variables that are declared as arguments to functions. Since ctags regex makes a non-greedy search and non-capturing groups do not work, the result is still limited. But we can at least capture the first arguments to functions and define them as an additional type as a, Arguments:

--regex-R=/.+function[ ]*\([ \t]*(,*([^,= \(\)]+)[,= ]*)/\2/a,Arguments/
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!