Synchronous X-Axis For Multiple Years of Sales with ggplot

Posted by 和自甴很熟 on 2019-12-20 05:45:36

Question


I have 1417 days of sales data from 2012-01-01 to the present (2015-11-20). I can't figure out how to get a single-year axis (Jan 1 - Dec 31) with each year's sales drawn in that same one-year window, even when using ggplot's color = as.factor(Year) option.

Total.Sales is of type int

head(df$Total.Sales)
[1] 495 699 911 846 824 949

and I have used the lubridate package to pull Year out of the original Day variable.

library(lubridate)
# origin "1899-12-30" is day zero for Excel-style serial dates
df$Day <- as.Date(as.numeric(df$Day), origin = "1899-12-30")
df$Year <- year(df$Day)

But because Day contains the year information

sample(df$Day, 1)
[1] "2012-05-05"

ggplot still plots the years end to end instead of overlaying them on the same period of time (one full year):

g <- ggplot(df, aes(x = Day, y = Total.Sales, color = as.factor(Year))) +
        geom_line()


Answer 1:


I create some sample data as follows:

set.seed(1234)
dates <- seq(as.Date("2012-01-01"), as.Date("2015-11-20"), by = "1 day")
values <- sample(1:6000, size = length(dates))
data <- data.frame(date = dates, value = values)

Providing something of the sort is, by the way, what is meant by a reproducible example.

Then I prepare some additional columns:

library(lubridate)
data$year <- year(data$date)
data$day_of_year <- as.Date(paste("2012",
                    month(data$date),mday(data$date), sep = "-"))

The last line is almost certainly what Roland meant in his comment. And he was right to choose the leap year, because it contains all possible dates. A normal year would miss February 29th.
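To illustrate why the dummy year should be a leap year, here is a minimal base R sketch. The variable names are made up for this example, and the explicit format argument is my addition so that an impossible date comes back as NA rather than raising an error.

```r
# Take a real Feb 29 and rebuild it inside two different dummy years.
feb29 <- as.Date("2012-02-29")
m <- as.integer(format(feb29, "%m"))  # month number, 2
d <- as.integer(format(feb29, "%d"))  # day of month, 29

# 2012 is a leap year, so the rebuilt date is valid.
in_leap_year <- as.Date(paste("2012", m, d, sep = "-"), format = "%Y-%m-%d")

# 2013 has no Feb 29, so parsing fails and the result is NA.
in_normal_year <- as.Date(paste("2013", m, d, sep = "-"), format = "%Y-%m-%d")

print(in_leap_year)
print(in_normal_year)
```

With a non-leap dummy year, every real February 29th in the data would end up as a missing x value and drop out of the plot.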

Now the plot is generated by:

library(ggplot2)
library(scales)
g <- ggplot(data, aes(x = day_of_year, y = value, color = as.factor(year))) +
   geom_line() + scale_x_date(labels = date_format("%m/%d"))

I call scale_x_date to define x-axis labels without the year. This relies on the function date_format from the package scales. The string "%m/%d" defines the date format. If you want to know more about these format strings, use ?strptime.
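As a quick way to see what a format string does before handing it to the axis, base R's format() can be applied to a single Date. This is only an illustrative sketch using the sample date from the question; the variable names are made up.

```r
# One date from the question, formatted with and without the year.
d <- as.Date("2012-05-05")

with_year <- format(d, "%Y-%m-%d")  # full ISO date
month_day <- format(d, "%m/%d")     # the format used for the axis labels

print(with_year)
print(month_day)
```

As far as I know, recent ggplot2 versions also accept the format string directly, as in scale_x_date(date_labels = "%m/%d"), which avoids loading scales explicitly.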

The resulting figure (not reproduced here) shows one colored line per year over a shared January-to-December axis.

You can see immediately what the trouble with this representation might be: it is hard to distinguish anything in the plot. Of course, this is partly because my sample data varies wildly, and your data might look different. If it doesn't, consider using faceting (see ?facet_grid or ?facet_wrap).
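A faceted version could look roughly like the sketch below. It rebuilds the sample data with base R date formatting instead of lubridate so that it is self-contained, and it assumes one panel per year stacked vertically is acceptable; the object name p and the ncol = 1 layout are my choices, not part of the original answer.

```r
library(ggplot2)

# Same kind of random sample data as above.
set.seed(1234)
dates <- seq(as.Date("2012-01-01"), as.Date("2015-11-20"), by = "1 day")
data <- data.frame(date = dates,
                   value = sample(1:6000, size = length(dates)))

# Year for faceting, and every date mapped into the leap year 2012.
data$year <- as.integer(format(data$date, "%Y"))
data$day_of_year <- as.Date(paste("2012", format(data$date, "%m-%d"), sep = "-"))

# One panel per year; all panels share the same Jan-Dec x-axis.
p <- ggplot(data, aes(x = day_of_year, y = value)) +
  geom_line() +
  facet_wrap(~ year, ncol = 1) +
  scale_x_date(date_labels = "%m/%d")
```

Printing p draws the panels. Because the x-axis is shared, the same calendar day lines up vertically across years, which makes seasonal patterns easier to compare than overlapping lines.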



Source: https://stackoverflow.com/questions/33832776/synchronous-x-axis-for-multiple-years-of-sales-with-ggplot
