gsub error turning upper to lower case in R

Submitted by 一曲冷凌霜 on 2019-12-10 13:47:52

Question


I would like to recode some identifiers, from upper case to lower case.

I am not sure what the issue is here.

n = c('AFD.434', 'BSD.23', 'F234.FF')
gsub(pattern = '[[:upper:]]', replacement = '[[:lower:]]', n)

[1] "[[:lower:]][[:lower:]][[:lower:]].434" "[[:lower:]][[:lower:]][[:lower:]].23"  "[[:lower:]]234.[[:lower:]][[:lower:]]"

Any advice?


Answer 1:


Your gsub call replaces each uppercase letter with the literal string "[[:lower:]]", because the replacement argument is not interpreted as a regular expression.

The simplest solution is to not use regular expressions; simply use tolower() (as already mentioned in the comments / other answers).

One possible approach with regular expressions is to enable Perl-compatible regular expressions (perl = TRUE) and use the \L modifier in the replacement to convert to lowercase:

gsub(pattern = '([[:upper:]])', perl = TRUE, replacement = '\\L\\1', n)

This approach

  • uses a capturing group (...) to "remember" the match
  • uses a backreference \1 to refer to the match in the replacement string
  • uses the \L modifier to convert the match to lowercase

See the online help for gsub for further details.
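As a quick check of the approach above, here is a minimal sketch that runs the Perl-style replacement on the vector from the question and compares it with tolower(). The variable names via_gsub and via_tolower are illustrative only and do not come from the original answer.

```r
# Identifiers from the question
n <- c('AFD.434', 'BSD.23', 'F234.FF')

# perl = TRUE enables the \L modifier in the replacement string;
# \\1 is the backreference to the captured uppercase letter
via_gsub <- gsub(pattern = '([[:upper:]])', replacement = '\\L\\1', n, perl = TRUE)

# Base R helper that needs no regular expression at all
via_tolower <- tolower(n)

print(via_gsub)
print(identical(via_gsub, via_tolower))
```

Both calls should give the same lowercase identifiers, which is why tolower() is usually the simpler choice here.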




Answer 2:


The gsub function takes a regular expression as its first argument and a replacement string as its second. The replacement is not a regular expression: special characters and character classes in it are treated as literals (only backreferences such as \1 are interpreted). Thus, you are replacing each uppercase letter with the literal string [[:lower:]], as the output in the question shows.
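To make the distinction concrete, here is a small sketch on a made-up input string 'aBc' (not from the question). It shows a character class being copied literally from the replacement, next to a backreference that is interpreted. The names literal and wrapped are illustrative.

```r
# A character class is only interpreted in the pattern, not in the replacement
literal <- gsub('[[:upper:]]', '[[:lower:]]', 'aBc')

# A backreference such as \\1 is one of the few things the replacement does interpret
wrapped <- gsub('([[:upper:]])', '<\\1>', 'aBc')

print(literal)
print(wrapped)
```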

To convert the values of your character vector to lower case, simply use tolower() (as has already been pointed out):

n = c('AFD.434', 'BSD.23', 'F234.FF')
tolower(n)

Output:

[1] "afd.434" "bsd.23"  "f234.ff"

Note that Frank's suggestion to use the Perl \L operator is handy when we need to change the case of specific parts of a string, but in your case it seems to be unnecessary overhead.
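For illustration, here is a hedged sketch of the kind of partial case conversion where \L and its uppercase counterpart \U can be useful. The task (lowercasing only the part before the first dot) and the 'hello world' input are assumptions made up for this example, not part of the original question.

```r
n <- c('AFD.434', 'BSD.23', 'F234.FF')

# Lowercase only the part before the first dot; the rest is left untouched
prefix_lower <- sub('^([^.]+)', '\\L\\1', n, perl = TRUE)

# Uppercase only the first letter of each word with \U
first_upper <- gsub('\\b([a-z])', '\\U\\1', 'hello world', perl = TRUE)

print(prefix_lower)
print(first_upper)
```

Here tolower() alone would not be enough, since it always converts the whole string.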



Source: https://stackoverflow.com/questions/30664444/gsub-error-turning-upper-to-lower-case-in-r
