While looking at an answer posted recently on SO, I noticed an unfamiliar assignment statement. Instead of the usual form of myVar <- myValue, it used the fo
This is an intentional and documented feature. As joran mentioned, the documentation page "Extract" includes this in the "Atomic Vectors" section:
An empty index selects all values: this is most often used to replace all the entries but keep the attributes.
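To make the quoted sentence concrete, here is a minimal sketch on a named atomic vector. The names v, v_kept and v_lost are made up for illustration and don't come from the original question.

```r
# Named numeric vector: `names` is an attribute of the vector.
v <- c(a = 1, b = 2, c = 3)

# Empty index: every value is replaced (the 0 is recycled), attributes are kept.
v_kept <- v
v_kept[] <- 0
names(v_kept)
# [1] "a" "b" "c"

# Plain assignment: the name is simply rebound to a new object, so the names are gone.
v_lost <- v
v_lost <- 0
names(v_lost)
# NULL
```

Note that v_kept still has length 3, while v_lost is now a length-one vector.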
However, in the case of recursive objects (data.frames or lists, for example), the attributes are only kept for the subsetted object. Its parts don't get such protection.
Here's an example:
animals <- factor(c('cat', 'dog', 'fish'))
df_factor <- data.frame(x = animals)
rownames(df_factor) <- c('meow', 'bark', 'blub')
str(df_factor)
# 'data.frame': 3 obs. of 1 variable:
# $ x: Factor w/ 3 levels "cat","dog","fish": 1 2 3
df_factor[] <- 'cat'
str(df_factor)
# 'data.frame': 3 obs. of 1 variable:
# $ x: chr "cat" "cat" "cat"
rownames(df_factor)
# [1] "meow" "bark" "blub"
df_factor kept its row names, but the x column is now just the character vector used in the assignment (recycled to the column length) instead of a factor. We can keep the class and levels of x by specifically replacing its values:
df_factor <- data.frame(x = animals)
df_factor$x[] <- 'cat'
str(df_factor)
# 'data.frame': 3 obs. of 1 variable:
# $ x: Factor w/ 3 levels "cat","dog","fish": 1 1 1
So replacement with empty subsetting is very safe for vectors, matrices, and arrays, because their elements can't have their own attributes. But it requires some care when dealing with list-like objects.
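As a rough illustration of both halves of that summary, the sketch below uses a matrix and a small data.frame. The objects m, df_all and the anonymous function are hypothetical. The df[] <- lapply(df, ...) form is one common way to apply the per-column replacement to every column at once; it is not something the documentation quote prescribes.

```r
# Matrix: `dim` and `dimnames` are attributes of the whole object, so they survive.
m <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), c("a", "b", "c")))
m[] <- 0L
dim(m)
# [1] 2 3

# Data frame: do the empty-index replacement inside each column,
# then put the columns back with an empty index on the data frame.
df_all <- data.frame(x = factor(c("cat", "dog", "fish")))
df_all[] <- lapply(df_all, function(col) {
  col[] <- "cat"  # replaces the values, keeps the column's class and levels
  col
})
class(df_all$x)
# [1] "factor"
```

This only works as shown because "cat" is already one of the factor's levels.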