Lately I have often had to handle time series data from multiple .csv sources in the same analysis. Let's assume for simplicity that all series are regular quarterly series (no
I do this in R all the time. You may find it easier to do in Excel but if your data change, you have to do the same process again. Using R makes it much easier to update and reproduce your results.
Dealing with monthly or quarterly frequencies is made significantly easier by zoo's yearmon and yearqtr index classes, respectively. Once you have your data in zoo objects with yearqtr indexes, all you have to do is merge all the objects.
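As a minimal sketch of that idea (the series a and b and their values are made up for illustration, not taken from the question), two quarterly zoo objects with different start dates can be merged like this:

```r
library(zoo)

# Two hypothetical quarterly series with different start dates.
a <- zoo(c(1, 2, 3), as.yearqtr("2000 Q1") + 0:2 / 4)
b <- zoo(c(10, 20), as.yearqtr(as.Date(c("2000-07-01", "2000-10-01"))))

# merge() aligns on the yearqtr index and pads with NA
# wherever one of the series has no observation.
m <- merge(a, b)
print(m)
```

The first index is built by adding quarter fractions to a starting yearqtr, the second by converting dates; both end up on the same quarterly scale, which is what lets merge line them up.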
Here's your sample data:
Lines1 <-
"27.05.11;5965.95
26.05.11;5947.06
25.05.11;5942.82
24.05.11;5939.98"
f1 <- read.csv2(con <- textConnection(Lines1), header=FALSE)
close(con)
Lines2 <-
"Germany;Switzerland;USA;OECDEurope
69,90974;61,8241;55,60966;64,96157
67,0394;62,18966;56,47361;64,15152
70,56651;63,6347;56,87237;65,43568"
f2 <- read.csv2(con <- textConnection(Lines2), header=TRUE)
close(con)
Lines3 <-
"1984-04-01,33.3238396624473
1984-07-01,63.579833082501
1984-10-01,35.8375401560349"
f3 <- read.csv(con <- textConnection(Lines3), header=FALSE)
close(con)
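One thing to watch with the first file: read.csv2 assumes a comma as the decimal mark by default, while that file uses a period. The check below is a small sketch using base R only (the names line, bad and good are just for illustration); if you need the values as numbers, passing dec="." looks like the simplest option.

```r
line <- "27.05.11;5965.95"

# Default read.csv2(): dec = ",", so "5965.95" is not parsed as a number.
bad <- read.csv2(text = line, header = FALSE)
is.numeric(bad$V2)

# Overriding the decimal mark yields a numeric column.
good <- read.csv2(text = line, header = FALSE, dec = ".")
is.numeric(good$V2)
```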
The first and third files carry their own dates (the third starts in 1984Q2). The second file has no dates, so the example below assumes it starts in 1984Q4. You can see that merge.zoo takes care of aligning all the dates for you. After everything is aligned in your zoo object, you can use the as.ts method to create an mts object.
library(xts) # also loads zoo; must be attached before zoo() is called
z1 <- zoo(f1[,-1], as.Date(f1[,1], "%d.%m.%y"))
# f2 has no dates: build one quarter per row of f2, starting at the assumed 1984Q4.
z2 <- zoo(f2, as.yearqtr("1984Q4")+(seq_len(NROW(f2))-1)/4)
z3 <- zoo(f3[,-1], as.yearqtr(as.Date(f3[,1])))
# Use xts::apply.quarterly to aggregate series with higher periodicity.
# Here I just take the last obs but you could use another function (e.g. mean).
z1 <- apply.quarterly(z1, last)
index(z1) <- as.yearqtr(index(z1)) # convert the index to yearqtr
(Z <- merge(z1,z2,z3))
#         z1      Germany  Switzerland USA      OECDEurope z3
# 1984 Q2 <NA>    <NA>     <NA>        <NA>     <NA>       33.32383
# 1984 Q3 <NA>    <NA>     <NA>        <NA>     <NA>       63.57983
# 1984 Q4 <NA>    69.90974 61.8241     55.60966 64.96157   35.83754
# 1985 Q1 <NA>    67.0394  62.18966    56.47361 64.15152   <NA>
# 1985 Q2 <NA>    70.56651 63.6347     56.87237 65.43568   <NA>
# 2011 Q2 5965.95 <NA>     <NA>        <NA>     <NA>       <NA>
# Note that ts will create an object with an observation for every period,
# even if all the columns are missing.
TS <- as.ts(Z)
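To see that gap-filling behaviour on a smaller scale, here is a sketch with a made-up series that has no observation for one quarter (z and tsz are illustrative names, not part of the example above):

```r
library(zoo)

# Hypothetical quarterly series with no observation for 2000 Q3.
z <- zoo(c(1, 2, 4), as.yearqtr(c("2000 Q1", "2000 Q2", "2000 Q4")))

# as.ts() returns a regular quarterly series, so the missing quarter becomes NA.
tsz <- as.ts(z)
print(tsz)

# window() restricts the ts object to the range you actually care about.
window(tsz, start = c(2000, 1), end = c(2000, 2))
```

With data like Z above, where one series sits decades away from the others, something like window() is probably worth applying before plotting or modelling, since most of the filled-in periods are entirely missing.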
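My strategy for problems of this type is: read and clean each source into its own data.frame, merge the separate data.frames, then convert the result to a ts object, plot it, etc.
Using your example data: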
v1 <- "27.05.11;5965.95
26.05.11;5947.06
25.05.11;5942.82
24.05.11;5939.98"
v2 <- "Germany;Switzerland;USA;OECDEurope
69,90974;61,8241;55,60966;64,96157
67,0394;62,18966;56,47361;64,15152
70,56651;63,6347;56,87237;65,43568"
v3 <- "1984-04-01,33.3238396624473
1984-07-01,63.579833082501
1984-10-01,35.8375401560349"
# Read and clean data
dat1 <- read.table(textConnection(v1), header=FALSE, sep=";", dec=".")
names(dat1) <- c("date", "V1")
dat1$date <- as.Date(dat1$date, format="%d.%m.%y")
dat1
dat2 <- read.table(textConnection(v2), header=TRUE, sep=";", dec=",")
# This file has no dates, so a quarterly sequence with an assumed start date is attached.
dat2$date <- seq(as.Date("2011/1/1"), by="3 months", length.out=3)
dat2
dat3 <- read.table(textConnection(v3), header=FALSE, sep=",", dec=".")
names(dat3) <- c("date", "V2")
dat3$date <- as.Date(dat3$date)
dat3
# Merge separate data.frames.
# I use join() in package plyr, you may wish to use merge(), rbind.fill, etc
library(plyr)
join(join(dat1, dat2, type="full"), dat3, type="full")
The results:
         date      V1  Germany Switzerland      USA OECDEurope       V2
1  2011-05-27 5965.95       NA          NA       NA         NA       NA
2  2011-05-26 5947.06       NA          NA       NA         NA       NA
3  2011-05-25 5942.82       NA          NA       NA         NA       NA
4  2011-05-24 5939.98       NA          NA       NA         NA       NA
5  2011-01-01      NA 69.90974    61.82410 55.60966   64.96157       NA
6  2011-04-01      NA 67.03940    62.18966 56.47361   64.15152       NA
7  2011-07-01      NA 70.56651    63.63470 56.87237   65.43568       NA
8  1984-04-01      NA       NA          NA       NA         NA 33.32384
9  1984-07-01      NA       NA          NA       NA         NA 63.57983
10 1984-10-01      NA       NA          NA       NA         NA 35.83754
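If you would rather avoid the plyr dependency, roughly the same full join can be sketched with base R's merge(). The data frames d1, d2 and d3 below are cut-down, hypothetical stand-ins for dat1, dat2 and dat3:

```r
# Hypothetical miniature versions of the three cleaned data.frames.
d1 <- data.frame(date = as.Date(c("2011-05-27", "2011-05-26")),
                 V1 = c(5965.95, 5947.06))
d2 <- data.frame(Germany = c(69.90974, 67.0394),
                 date = as.Date(c("2011-01-01", "2011-04-01")))
d3 <- data.frame(date = as.Date(c("1984-04-01", "1984-07-01")),
                 V2 = c(33.32384, 63.57983))

# Full outer join on the shared "date" column, applied pairwise.
full <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE),
               list(d1, d2, d3))
full
```

Unlike join(), merge() sorts the result by the key column by default, so the rows come back in date order rather than in the order the files were read.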