Why are names(x)<-y and "names<-"(x,y) not equivalent?

Posted by 有些话、适合烂在心里 on 2021-01-27 21:30:00

Question


Consider the following:

y<-c("A","B","C")  
x<-z<-c(1,2,3)  
names(x)<-y
"names<-"(z,y)

If you run this code, you will discover that names(x)<-y is not identical to "names<-"(z,y). In particular, one sees that names(x)<-y actually changes the names of x whereas "names<-"(z,y) returns z with its names changed.

Why is this? I was under the impression that the difference between writing a function normally and writing it as an infix operator was only one of syntax, rather than something that actually changes the output. Where in the documentation is this difference discussed?


Answer 1:


Short answer: names(x)<-y is actually sugar for x<-"names<-"(x,y) and not just "names<-"(x,y). See the R-lang manual, pages 18-19 (pages 23-24 of the PDF), which works through essentially the same example.

For example, names(x) <- c("a","b") is equivalent to:

`*tmp*`<-x
x <- "names<-"(`*tmp*`, value=c("a","b"))
rm(`*tmp*`)
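As a quick check of the expansion above, here is a minimal sketch that reuses x, y and z from the question (renamed is a name introduced here purely for illustration). It shows that the bare function call leaves its argument alone, while writing the assignment out by hand reproduces what the sugar does:

```r
y <- c("A", "B", "C")
x <- z <- c(1, 2, 3)

# Replacement syntax: x is rebound to the renamed vector.
names(x) <- y

# Plain function call: returns a renamed copy; z itself is untouched.
renamed <- `names<-`(z, y)

print(names(x))        # the names are now set on x
print(names(z))        # NULL, because z was never reassigned
print(names(renamed))  # the copy carries the names

# Writing the assignment out by hand gives the same result as the sugar.
z <- `names<-`(z, value = y)
print(identical(x, z))
```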

If you are more familiar with getters and setters, think of it this way: if somefunction is a getter, somefunction<- is the corresponding setter. In R, where objects are (with few exceptions) immutable, it is more accurate to call the setter a replacement function: it creates a new object identical to the old one but with an attribute added, modified, or removed, and the old object is then replaced by this new one.

In the example above, for instance, the names attribute is not simply added to x; rather, a new object with the same values as x plus the names is created and bound to the symbol x.
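One way to observe this rebinding from user code is to keep a second symbol pointing at the original value. This is only a sketch of the visible semantics (snapshot is a hypothetical name, and R may avoid physical copies internally where it is safe to do so):

```r
x <- c(1, 2, 3)
snapshot <- x                  # second symbol bound to the same values

names(x) <- c("a", "b", "c")   # x now refers to a vector with names

print(names(x))
print(names(snapshot))         # NULL: the original value was not altered
```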

Since there are still some doubts about why this is discussed in the language documentation rather than directly in ?names, here is a short recap of this property of the R language.

  • You can give a function any name you wish (within some restrictions, of course), and the name has no effect on how the function behaves when it is called "normally".
  • However, if you give a function a name ending in <-, it becomes a replacement function, and R applies it through the mechanism described at the beginning of this answer whenever it is invoked with the syntax foo(x)<-value. Note that you never call foo<- explicitly here; the slightly different syntax produces an object replacement because of the function's name.
  • Although there are no formal restrictions, it's common to define getter/setter pairs in R under the same name (for instance names and names<-). In that case, the function with the <- suffix is the replacement function for the corresponding function without the suffix.
  • As stated at the beginning, this behaviour is general and a property of the language, so it doesn't need to be discussed in the documentation of each replacement function.
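To make the recap concrete, here is a minimal sketch of a user-defined getter/setter pair. The names second and second<- are hypothetical and not part of base R; the only hard requirement used here is that the last argument of the replacement function is called value:

```r
# Hypothetical getter: returns the second element.
second <- function(x) x[[2]]

# Hypothetical replacement function: the last argument must be named `value`.
`second<-` <- function(x, value) {
  x[[2]] <- value
  x
}

v <- c(10, 20, 30)

# Called like any other function: returns a modified copy, v is unchanged.
w <- `second<-`(v, 99)
print(v)

# Called with the replacement syntax: v is rebound to the result.
second(v) <- 99
print(v)
print(identical(v, w))
```

The function body is the same in both calls; only the foo(x) <- value syntax triggers the reassignment.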



Answer 2:


In particular, one sees that names(x)<-y actually changes the names of x whereas "names<-"(z,y) returns z with its names changed.

That’s because `names<-` [1] is a regular function, albeit with an odd name [2]. It performs no assignment; it returns a new object with the names attribute set. In fact, `names<-` is a primitive function in R, but it could be implemented as follows (there are shorter, better ways of writing this in R, but I want the separate steps to be explicit):

`names<-` = function (x, value) {
    new = x
    attr(new, 'names') = value
    new
}

That is, it

  • … creates a new object that’s a copy of x,
  • … sets the names attribute on that newly created object, and
  • … returns the new object.

Since virtually all objects in R are immutable, this fits naturally into R’s semantics. In fact, a better name for this exact function would be with_names [3]. But the creators of R found it convenient to be able to write such an assignment without repeating the name of the object. So instead of writing

x = with_names(x, c('foo', 'bar'))

or

x = `names<-`(x, c('foo', 'bar'))

R allows us to write

names(x) = c('foo', 'bar')

R handles this syntax specially by internally converting it to another expression, documented in the Subset assignment section of the R language definition, as explained in the answer by Nicola.

But the gist is that names(x) = y and `names<-`(x, y) are different because … they just are. The former is a special syntactic form that R recognises and transforms when the assignment is evaluated. The latter is a regular function call, and the weird function name is a red herring: it doesn’t affect the execution whatsoever. The function behaves exactly as it would under any other name, and you can confirm this by assigning it a different name:

with_names = `names<-`
`another weird(!) name` = `names<-`

# These are all identical:

`names<-`(x, y)
with_names(x, y)
`another weird(!) name`(x, y)
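A small self-contained check of that claim, as a sketch: x and y are sample values chosen here, and r1, r2 and r3 are hypothetical names for the three results.

```r
x <- c(1, 2, 3)
y <- c("A", "B", "C")

with_names <- `names<-`
`another weird(!) name` <- `names<-`

r1 <- `names<-`(x, y)
r2 <- with_names(x, y)
r3 <- `another weird(!) name`(x, y)

# All three calls return the same named vector.
print(identical(r1, r2) && identical(r2, r3))
```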

[1] I strongly encourage using backtick quotes (`) instead of straight quotes (' or ") to quote R variable names. While both are allowed in some circumstances, the latter invites confusion with strings, and is conceptually bonkers. These are not strings. Consider:

"a" = "b"
"c" = "a"

Rather than copy the value of a into c, what this code actually does is set c to literal "a", because quotes now mean different things on the left- and right-hand side of assignment.

The R documentation confirms that

The preferred quote [for variable names] is the backtick (`)
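The pitfall from this footnote can be reproduced directly. The sketch below reuses the footnote's variable names a and c and contrasts them with the backtick form:

```r
"a" = "b"     # creates a variable a holding the string "b"
"c" = "a"     # creates a variable c holding the string "a", not the value of a
print(c)

`c` = `a`     # backticks on both sides refer to variables
print(c)      # now c holds the value of a
```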

[2] Regular variable names (aka “identifiers” or just “names”) in R can only contain letters, digits, underscore and the dot, must start with a letter, or with a dot not followed by a digit, and can’t be reserved words. But R allows using pretty much arbitrary characters (including punctuation and even spaces!) in variable names, provided the name is backtick-quoted.

[3] In fact, R has an almost-alias for this function, called setNames. That isn’t a great name, since set… implies mutating the object, but of course it doesn’t do that.
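For illustration, a short sketch comparing setNames (from the stats package, which is attached by default) with a direct call to `names<-`. Here x and y are sample values and renamed is a hypothetical name:

```r
x <- c(1, 2, 3)
y <- c("A", "B", "C")

renamed <- setNames(x, y)

print(is.null(names(x)))                      # TRUE: setNames did not modify x
print(identical(renamed, `names<-`(x, y)))    # both produce the same named vector
```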



Source: https://stackoverflow.com/questions/65395373/why-are-namesx-y-and-names-x-y-not-equivalent
