问题
I've been trying to transform a dataframe I put together into a Time Series, but for some reason it doesn't work. I am very new to R.
x<-Sales_AEMBG%>%
+ select(Ecriture.DatEcr, Crédit, Mapping)
> names(x)<-c("Dates","Revenue","Mapping")
> str(x)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 15167 obs. of 3 variables:
$ Dates : POSIXct, format: "2016-01-02" "2016-01-02" "2016-01-02" "2016-01-02" ...
$ Revenue: num 124065 214631 135810 225293 57804 ...
$ Mapping: chr "E.M 1.5 L" "E.M 1.5 L" "E.M 1.5 L" "E.M 1.5 L" ...'
When I try to look at the data, here's what I have
> head(x)
# A tibble: 6 x 3
Dates Revenue Mapping
<dttm> <dbl> <chr>
1 2016-01-02 00:00:00 124065. E.M 1.5 L
2 2016-01-02 00:00:00 214631. E.M 1.5 L
3 2016-01-02 00:00:00 135810. E.M 1.5 L
4 2016-01-02 00:00:00 225293. E.M 1.5 L
5 2016-01-02 00:00:00 57804. E.M 1.5 L
6 2016-01-02 00:00:00 124065. E.M 1.5 L
Of course, I tried the as.ts function
> x_xts <- as.ts(x)
Warning message:
In data.matrix(data) : NAs introduced by coercion
> is.ts(x)
[1] FALSE
But it keeps telling me that my dataframe is still not recognized as a TS.
What do you suggest ?
Thanks
回答1:
I've added a few more observations to your data.
# A tibble: 12 x 3
Dates Revenue Mapping
<dttm> <dbl> <chr>
1 2016-01-02 00:00:00 124065 E.M 1.5 L
2 2016-01-02 00:00:00 214631 E.M 1.5 L
3 2016-01-03 00:00:00 135810 E.M 1.5 L
4 2016-01-03 00:00:00 225293 E.M 1.5 L
5 2016-01-05 00:00:00 57804 E.M 1.5 L
6 2016-01-05 00:00:00 124065 E.M 1.5 L
7 2016-01-02 00:00:00 24065 E.M 1.5 M
8 2016-01-02 00:00:00 14631 E.M 1.5 M
9 2016-01-03 00:00:00 35810 E.M 1.5 M
10 2016-01-03 00:00:00 25293 E.M 1.5 M
11 2016-01-05 00:00:00 7804 E.M 1.5 M
12 2016-01-05 00:00:00 24065 E.M 1.5 M
First you need to sum the sales by day (Dates
) and product type (your Mapping
variable?), and pivot into a wider data format:
library(dplyr)
library(tidyr)
x.sum <- x %>%
group_by(Mapping, Dates) %>%
summarise(Revenue=sum(Revenue)) %>%
pivot_wider(id_cols=Dates, names_from=Mapping, values_from=Revenue)
# A tibble: 3 x 3
Dates `E,M 1.5 L` `E,M 1.5 M`
<dttm> <dbl> <dbl>
1 2016-01-02 00:00:00 338696 38696
2 2016-01-03 00:00:00 361103 61103
3 2016-01-05 00:00:00 181869 31869
Note that I've deliberately omitted Jan 4.
If your time series data has missing days, such as stock prices where financial markets are closed on the weekends, then using the as.ts
(or ts
) function won't work. If there are no missing days, then then correct way to convert the data into a time series object ("ts") is to specify the column(s) to convert (x.sum[,2:3]
) and the start (January 2, 2016) and frequency (daily) of the series.
x.ts <- ts(x.sum[,2:3], start=c(2016, 2), frequency=365)
Be careful with the start as the second argument depends on the specified frequency. Here, 365 means daily, so the "2" means day 2 of year 2016. If the frequency was monthly, the "2" would mean month 2 of year 2016.
But as I mentioned, ts
doesn't ignore any missing days. So for this make-up data, if you plotted the time series, then you will get the wrong information.
In this case, other packages such as xts and zoo can be used to simply the work.
library(xts)
x.xts <- xts(x.sum[,2:3], order.by=x.sum$Dates)
plot(x.xts) # Correct results.
Other answers about time series can be found here and here.
来源:https://stackoverflow.com/questions/60483201/transforming-a-dataframe-into-a-ts-in-r