R - ggplot2 - difference between ggplot(data, aes(x=variable…)) and ggplot(data, aes(x=data$variable…)) [duplicate]

终归单人心

1 answer

迷失自我

2020-12-20 10:58
Using aes(data$variable) is never a good idea, is never recommended, and should be avoided. It sometimes still works, but aes(variable) always works, so you should always use aes(variable).

More explanation:

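ggplot uses nonstandard evaluation. A function that uses standard evaluation can only see objects that exist as variables in the environment it is called from (at the top level, that is the global environment). If I have a data frame named mydata with a column named col1, and I call mean(col1), I get an error: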
```
mydata = data.frame(col1 = 1:3)
mean(col1)
# Error in mean(col1) : object 'col1' not found
```
This error happens because col1 isn't in the global environment. It's just a column name of the mydata data frame.
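For comparison, here is a small sketch of how you would normally tell a standard-evaluating function like mean() where the column lives, outside of ggplot. It reuses the mydata example from above; the names m1 and m2 are only for illustration.

```r
mydata <- data.frame(col1 = 1:3)

# Extract the column explicitly, so mean() receives a plain vector.
m1 <- mean(mydata$col1)

# Or evaluate the call with the data frame's columns in scope.
m2 <- with(mydata, mean(col1))

# Both approaches compute the mean of the same three values.
identical(m1, m2)
```

This is fine for ordinary functions. The advice against data$ applies specifically inside aes().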

The aes function does extra work behind the scenes, and knows to look at the columns of the layer's data, in addition to checking the global environment.
```
library(ggplot2)
ggplot(mydata, aes(x = col1)) + geom_bar()
# no error
```
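To give a rough feel for the "look in the data first, then in the environment" idea, here is a minimal base R sketch. The helper eval_in_data is hypothetical and written only for illustration. It is not how ggplot2 is implemented (the real mechanism is more involved), but the lookup order is the same in spirit.

```r
# Hypothetical helper, for illustration only: evaluate an unquoted expression,
# looking in the data frame's columns first and the caller's environment second.
eval_in_data <- function(data, expr) {
  expr <- substitute(expr)          # capture the expression without evaluating it
  eval(expr, data, parent.frame())  # columns of `data` are searched before the caller's variables
}

mydata <- data.frame(col1 = 1:3)
offset <- 10  # an ordinary variable, not a column

from_column <- eval_in_data(mydata, col1)           # col1 is found as a column
mixed       <- eval_in_data(mydata, col1 + offset)  # col1 from the data, offset from the environment
```

Because the expression is captured rather than evaluated right away, col1 never has to exist as a variable of its own.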
You don't have to use a bare column name inside aes, though. For flexibility, you can use a function of a column, or even some other vector that you define on the spot (if it has the right length):
```
# these work fine too
ggplot(mydata, aes(x = log(col1))) + geom_bar()
ggplot(mydata, aes(x = c(1, 8, 11))) + geom_bar()
```
So what's the difference between col1 and mydata$col1? Well, col1 is the name of a column, and mydata$col1 is the actual values: a plain vector holding the full column. Given col1, ggplot looks for a column with that name in your data and uses it.

The difference matters because ggplot often does data manipulation. Whenever there are facets or aggregate functions, ggplot splits your data into pieces and works on each piece. To do this effectively, it needs to be able to identify the data and the column names. When you give it mydata$col1, you're not giving it a column name, you're just giving it a vector of values (whatever happens to be in that column). ggplot can no longer tie those values to the rows it is working with, and things can break.
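One way to see the problem is a case where the data passed to ggplot is not in the same row order as the data frame named in data$column. The sketch below is an assumed scenario, not taken from the original question: reordered stands in for any sorting or filtering step, and layer_data() is used to inspect what ggplot actually computed.

```r
library(ggplot2)

mydata <- data.frame(col1 = c(1, 2, 3, 4))

# Same rows, but in reverse order (a stand-in for any filtering or sorting step).
reordered <- mydata[order(-mydata$col1), , drop = FALSE]

# Column name: x and y both come from the rows of `reordered`, so they match.
p_name <- ggplot(reordered, aes(x = col1, y = col1)) + geom_point()

# data$column: x is the original vector, y comes from `reordered`.
# The lengths agree, so there is no error, but the pairs no longer match.
p_dollar <- ggplot(reordered, aes(x = mydata$col1, y = col1)) + geom_point()

name_xy   <- layer_data(p_name)[, c("x", "y")]
dollar_xy <- layer_data(p_dollar)[, c("x", "y")]

name_on_diagonal   <- all(name_xy$x == name_xy$y)      # does every point have x equal to y?
dollar_on_diagonal <- any(dollar_xy$x == dollar_xy$y)  # does any point have x equal to y?
```

With the column name, x and y stay paired row by row. With mydata$col1, the plot is drawn without an error even though x and y now come from different row orders, which is the kind of silent mismatch that makes data$ risky inside aes().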

So, just use unquoted column names in aes() without data$ and everything will work as expected.