Question
Several months ago I asked a similar question, but at the time I was using JavaScript to check whether a given string is a "valid" R object name. Now I'd like to do the same using nothing but R. I suspect there is a neat (and not too esoteric) R function for this, so I regard regular expressions as the last line of defence. Any ideas?
Oh, yeah, using back-ticks and stuff is considered cheating. =)
Answer 1:
Edited 2013-1-9 to fix regular expression. Previous regular expression, lifted from page 456 of John Chambers' "Software for Data Analysis", was (subtly) incomplete. (h.t. Hadley Wickham)
There are a couple of issues here. A simple regular expression can be used to identify all syntactically valid names, but some of those names (like if and while) are 'reserved' and cannot be assigned to.
- Identifying syntactically valid names:
?make.names explains that a syntactically valid name:
[...] consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as '".2way"' are not valid [...]
Here is the corresponding regular expression:
"^([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*$"
- Identifying unreserved syntactically valid names
To identify unreserved names, you can take advantage of the base function make.names(), which constructs syntactically valid names from arbitrary character strings. If a string comes back unchanged, it was already a valid, unreserved name.
isValidAndUnreserved <- function(string) {
    make.names(string) == string
}
isValidAndUnreserved(".jjj")
# [1] TRUE
isValidAndUnreserved(" jjj")
# [1] FALSE
Putting it all together
isValidName <- function(string) {
    grepl("^([[:alpha:]]|[.][._[:alpha:]])[._[:alnum:]]*$", string)
}

isValidAndUnreservedName <- function(string) {
    make.names(string) == string
}

testValidity <- function(string) {
    valid <- isValidName(string)
    unreserved <- isValidAndUnreservedName(string)
    reserved <- (valid & ! unreserved)
    list("Valid"=valid,
         "Unreserved"=unreserved,
         "Reserved"=reserved)
}

testNames <- c("mean", ".j_j", "...", "if", "while", "TRUE", "NULL",
               "_jj", "  j", ".2way")
t(sapply(testNames, testValidity))
#       Valid Unreserved Reserved
# mean  TRUE  TRUE       FALSE
# .j_j  TRUE  TRUE       FALSE
# ...   TRUE  TRUE       FALSE
# if    TRUE  FALSE      TRUE
# while TRUE  FALSE      TRUE
# TRUE  TRUE  FALSE      TRUE
# NULL  TRUE  FALSE      TRUE
# _jj   FALSE FALSE      FALSE
#   j   FALSE FALSE      FALSE    # Note: these tests are for "  j", not "j"
# .2way FALSE FALSE      FALSE
For more discussion of these issues, see the r-devel thread linked to by @Hadley in the comments below.
Answer 2:
As Josh suggests, make.names is probably the best solution to this. Not only will it handle weird punctuation, it'll also flag reserved words:
make.names(".x") # ".x"
make.names("_x") # "X_x"
make.names("if") # "if."
make.names("function") # "function."
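Because make.names() is vectorised, the same idea extends to a whole character vector. The sketch below is only an illustration: flagChangedNames is a hypothetical helper, not something from either answer, and it simply reports which inputs make.names() would alter.

```r
# Hypothetical helper (not from the answers above): list each candidate
# next to what make.names() turns it into, and whether it survived unchanged.
flagChangedNames <- function(strings) {
  fixed <- make.names(strings)
  data.frame(input = strings,
             fixed = fixed,
             ok = fixed == strings,  # TRUE only for valid, unreserved names
             stringsAsFactors = FALSE)
}

res <- flagChangedNames(c(".x", "_x", "if", "function"))
res
#      input     fixed    ok
# 1       .x        .x  TRUE
# 2       _x       X_x FALSE
# 3       if       if. FALSE
# 4 function function. FALSE
```

An unchanged string is a usable name; a changed one was either syntactically invalid or reserved, and the fixed column shows the repair make.names() would apply.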
Source: https://stackoverflow.com/questions/8396577/check-if-character-value-is-a-valid-r-object-name