Several months ago I asked something similar, but I was using JavaScript to check if a provided string is a "valid" R object name. Now I'd like to achieve the same by using
As Josh suggests, make.names is probably the best solution to this. Not only will it handle weird punctuation, it'll also flag reserved words:
make.names(".x") # ".x"
make.names("_x") # "X_x"
make.names("if") # "if."
make.names("function") # "function."
Edited 2013-1-9 to fix regular expression. Previous regular expression, lifted from page 456 of John Chambers' "Software for Data Analysis", was (subtly) incomplete. (h.t. Hadley Wickham)
There are a couple of issues here. A simple regular expression can be used to identify all syntactically valid names, but some of those names (like if and while) are 'reserved' and cannot be assigned to.
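To see what 'reserved' means in practice, here is a minimal sketch that asks the parser whether `name <- 1` is accepted. The helper name parsesAsAssignment is made up for illustration. It only demonstrates the point for keywords like if and while and is not a general validity test: constants such as TRUE parse fine and only fail when the assignment is evaluated.

```r
# Hypothetical helper (not part of base R): TRUE if "<name> <- 1" parses.
parsesAsAssignment <- function(name) {
  tryCatch({
    parse(text = paste(name, "<- 1"))  # parse only, nothing is evaluated
    TRUE
  }, error = function(e) FALSE)
}

parsesAsAssignment("mean")   # an ordinary name: the assignment parses
parsesAsAssignment("if")     # a keyword: the parser rejects it
parsesAsAssignment("while")  # same for while
```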
Identifying syntactically valid names:
?make.names explains that a syntactically valid name:
[...] consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as '".2way"' are not valid [...]
Here is the corresponding regular expression:
"^((([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*)|[.])$"
Identifying unreserved syntactically valid names
To identify unreserved names, you can take advantage of the base function make.names(), which constructs syntactically valid names from arbitrary character strings.
isValidAndUnreserved <- function(string) {
    make.names(string) == string
}
isValidAndUnreserved(".jjj")
# [1] TRUE
isValidAndUnreserved(" jjj")
# [1] FALSE
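Since make.names() is vectorized, the same check works on a whole character vector at once, which is handy for filtering a batch of candidate names. A small sketch follows; the candidates vector is invented for the example.

```r
isValidAndUnreserved <- function(string) {
    make.names(string) == string
}

# Made-up candidates: two usable names, then a leading digit,
# a reserved word, and a name containing a space.
candidates <- c("x", "my.var", "2fast", "for", "a b")

ok <- isValidAndUnreserved(candidates)
candidates[ok]
# [1] "x"      "my.var"
```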
Putting it all together
isValidName <- function(string) {
    grepl("^((([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*)|[.])$", string)
}
isValidAndUnreservedName <- function(string) {
    make.names(string) == string
}
testValidity <- function(string) {
    valid <- isValidName(string)
    unreserved <- isValidAndUnreservedName(string)
    reserved <- (valid & !unreserved)
    list("Valid"=valid,
         "Unreserved"=unreserved,
         "Reserved"=reserved)
}
testNames <- c("mean", ".j_j", ".", "...", "if", "while", "TRUE", "NULL",
"_jj", " j", ".2way")
t(sapply(testNames, testValidity))
      Valid Unreserved Reserved
mean  TRUE  TRUE       FALSE
.j_j  TRUE  TRUE       FALSE
.     TRUE  TRUE       FALSE
...   TRUE  TRUE       FALSE
if    TRUE  FALSE      TRUE
while TRUE  FALSE      TRUE
TRUE  TRUE  FALSE      TRUE
NULL  TRUE  FALSE      TRUE
_jj   FALSE FALSE      FALSE
 j    FALSE FALSE      FALSE    # Note: these tests are for " j", not "j"
.2way FALSE FALSE      FALSE
For more discussion of these issues, see the r-devel thread linked to by @Hadley in the comments below.