I have some intermittent demand data that only includes lines where demand is present. I bring it in via read.csv, and my 2 columns are Date (as date) and Quantity (as integ
You can use merge to add the missing rows and then set their values to zero.
First, let's create some fake data:
# Vector of dates from Jan 1, 2015, to Mar 31, 2015
dates = seq(as.Date("2015-01-01"), as.Date("2015-03-31"), by="1 day")
# Create data for only some of these dates, leaving the rest out
set.seed(55)
dat = data.frame(dates=dates[sample(seq_along(dates), 70)],
                 quantity=sample(1:10, 70, replace=TRUE))
dat = dat[order(dat$dates),]
Now let's pretend dat is what you imported from a CSV file. We want to fill in quantity = 0 for the missing dates, so first we need to add rows for those dates. You can do this by creating a date vector containing every date from the first to the last date in your CSV file and then using the merge function. In this case, we've already created that date vector above.
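If you only have the imported data frame, the full date vector can be built from its own first and last dates. This is a minimal sketch, assuming a Date column named dates as in the fake data above; the small data frame called imported is a made-up stand-in for your CSV, and the names all_dates and missing_dates are only illustrative.

```r
# Hypothetical stand-in for the imported data (only days with demand)
imported <- data.frame(
  dates    = as.Date(c("2015-01-02", "2015-01-05", "2015-01-06")),
  quantity = c(4L, 7L, 1L)
)

# Every calendar day from the first to the last observed date
all_dates <- seq(min(imported$dates), max(imported$dates), by = "day")

# Dates that have no row in the imported data
missing_dates <- all_dates[!all_dates %in% imported$dates]
```

Here all_dates plays the role of the dates vector in the fake-data example, so it can be passed to merge in the same way.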
Now merge in rows for the missing dates. The new rows will have NA for quantity. We'll change those NAs to zero below.
dat = merge(data.frame(dates), dat, by="dates", all.x=TRUE)
# Set missing values to zero
dat$quantity[is.na(dat$quantity)] = 0
Now you can aggregate by month, convert to a zoo series, etc.
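As one possible way to do the monthly aggregation, here is a rough sketch in base R. The data frame called daily is a hypothetical stand-in for the zero-filled data, and the month column is just a helper introduced for this example. The zoo conversion at the end only runs if the zoo package is installed.

```r
# Hypothetical daily data with zeros already filled in
daily <- data.frame(
  dates    = seq(as.Date("2015-01-30"), as.Date("2015-02-02"), by = "day"),
  quantity = c(3, 0, 5, 2)
)

# Label each row with its year-month, then sum quantity within each month
daily$month <- format(daily$dates, "%Y-%m")
monthly <- aggregate(quantity ~ month, data = daily, FUN = sum)
monthly
#     month quantity
# 1 2015-01        3
# 2 2015-02        7

# Optional: convert the daily data to a zoo series indexed by date
if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::zoo(daily$quantity, order.by = daily$dates)
}
```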
Because "df" is a zoo object, you can use merge.zoo and its fill argument. The current data set is merged with an empty zoo object that contains all the desired dates.
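library(zoo)
# Regular monthly sequence from the first to the last date
tt <- seq(min(Date), max(Date), "month")
# Merge with an empty zoo object indexed by tt; missing months are filled with 0
merge(df, zoo(, tt), fill = 0)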
# 2013-01-01 2013-02-01 2013-03-01 2013-04-01 2013-05-01 2013-06-01 2013-07-01 2013-08-01 2013-09-01 2013-10-01 2013-11-01
#          2          0          0          0          0          0          0          0          0          3          6
For further examples, see ?merge.zoo ("extend an irregular series to a regular one").
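For anyone who wants to try the merge.zoo approach without the original data, here is a self-contained sketch. The series z and its values are made up for illustration (chosen to resemble the output shown above), and it uses start() and end() on the zoo object rather than a separate Date vector.

```r
library(zoo)

# Hypothetical irregular monthly series: only months with demand are present
obs_dates <- as.Date(c("2013-01-01", "2013-10-01", "2013-11-01"))
z <- zoo(c(2, 3, 6), order.by = obs_dates)

# Regular monthly grid from the first to the last observation
grid <- seq(start(z), end(z), by = "month")

# Merge with an empty zoo object on that grid; months without data become 0
z_full <- merge(z, zoo(, grid), fill = 0)
```

The result z_full has one entry per month between the first and last observation, with zeros wherever the original series had no row.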