Jumping to the next available date when merging panels

谁说我不能喝 提交于 2019-12-13 09:32:28

问题


I asked a very similar question just before. The difference is, before it was a situation with only one company. Now it's a panel data structure. I want to merge two data.frames by date. Data contains stock data for each trading day of a stock. Events contains news about the company. Some news were published on non-trading days, so there are no stock data for this day. For example on 04.01.2000 some news for company "A" were published. I want to merge this article with the return of the NEXT trading day, in this case the return on the 06.01.2000. So how can I jump to the next trading day when merging in a panel situation?

date1 <- c("01.01.2000","02.01.2000","03.01.2000","06.01.2000","07.01.2000","09.01.2000","01.01.2000","02.01.2000","03.01.2000","06.01.2000","07.01.2000","09.01.2000")
ret1 <- c(-2.0,1.1,3,1.4,-0.2, 0.6, 0.1, -0.21, -1.2, 0.9, 0.3, -0.1)
company1 <- c("A","A","A","A","A","A","B","B","B","B","B","B")
df <- data.frame(date1, ret1, company1)
df

#         date1  ret1 company1
# 1  01.01.2000 -2.00        A
# 2  02.01.2000  1.10        A
# 3  03.01.2000  3.00        A
# 4  06.01.2000  1.40        A
# 5  07.01.2000 -0.20        A
# 6  09.01.2000  0.60        A
# 7  01.01.2000  0.10        B
# 8  02.01.2000 -0.21        B
# 9  03.01.2000 -1.20        B
# 10 06.01.2000  0.90        B
# 11 07.01.2000  0.30        B
# 12 09.01.2000 -0.10        B

date2 <- c("02.01.2000","03.01.2000","04.01.2000","08.01.2000","05.01.2000","08.01.2000","09.01.2000")
news2 <- c("blabla11", "blabla12","blabla13","blabla14","blabla21","blabla22","blabla23")
company2 <- c("A","A","A","A","B","B","B")

event <- data.frame(date2, news2, company2)
event 

#        date2    news2 company2
# 1 02.01.2000 blabla11        A
# 2 03.01.2000 blabla12        A
# 3 04.01.2000 blabla13        A
# 4 08.01.2000 blabla14        A
# 5 05.01.2000 blabla21        B
# 6 08.01.2000 blabla22        B
# 7 09.01.2000 blabla23        B

the output should look like this:

#        date2    news2 company2 date1        ret
# 1 02.01.2000 blabla11        A 02.01.2000  1.10
# 2 03.01.2000 blabla12        A 03.01.2000  3.00
# 3 04.01.2000 blabla13        A 06.01.2000  1.40
# 4 08.01.2000 blabla14        A 09.01.2000  0.60
# 5 05.01.2000 blabla21        B 06.01.2000  0.90
# 6 08.01.2000 blabla22        B 09.01.2000 -0.10
# 7 09.01.2000 blabla23        B 09.01.2000 -0.10

回答1:


It's more or less the same as your other question. The only change is that you'll have to set "company,date" as the key columns to perform the join on (note that the order is important - it'll first sort by company and then by date).

require(data.table) ## 1.9.2
setDT(df)
setDT(event)
setkey(df, company1, date1)
setkey(event, company2, date2)
df[, date := date1]
df[event, roll=-Inf]

   company1      date1 ret1       date    news2
1:        A 02.01.2000  1.1 02.01.2000 blabla11
2:        A 03.01.2000  3.0 03.01.2000 blabla12
3:        A 04.01.2000  1.4 06.01.2000 blabla13
4:        A 08.01.2000  0.6 09.01.2000 blabla14
5:        B 05.01.2000  0.9 06.01.2000 blabla21
6:        B 08.01.2000 -0.1 09.01.2000 blabla22
7:        B 09.01.2000 -0.1 09.01.2000 blabla23


来源:https://stackoverflow.com/questions/23309515/jumping-to-the-next-available-date-when-merging-panels

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!