How to combine R dataframes based on constraints on a time column

不问归期 submitted on 2019-12-02 01:46:34

Part 1 - Original Question

The first part of your question can be answered with the sqldf package.

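library(sqldf)

# left join df2 onto df1: same user, and the df2 time later than the df1 time;
# columns 1, 2 and 4 keep user, a.time and b.time (column 3 is the duplicate b.user)
df3 <- sqldf("SELECT * FROM df1 a 
             LEFT JOIN df2 b ON a.time < b.time 
             AND a.user = b.user")[,c(1:2, 4)]

# rename columns to match the OP's post
names(df3) <- c("user", "time_1", "time_2")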

> df3
  user              time_1              time_2
1    1 2016-12-01 08:53:20 2016-12-01 11:50:11
2    1 2016-12-01 12:45:47                <NA>
3    2 2016-12-01 15:34:54                <NA>
4    3 2016-12-01 00:49:50 2016-12-01 01:19:10
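
For readers without sqldf, the same left join can be sketched in base R. The input data is not reproduced in this answer, so the df1 and df2 below are hypothetical, chosen only to mirror the output shown above. The helper column id and the names pairs, later and base3 are also made up for illustration.

```r
# Hypothetical inputs, reconstructed from the printed result (an assumption)
df1 <- data.frame(
  user = c(1, 1, 2, 3),
  time = as.POSIXct(c("2016-12-01 08:53:20", "2016-12-01 12:45:47",
                      "2016-12-01 15:34:54", "2016-12-01 00:49:50"), tz = "UTC")
)
df2 <- data.frame(
  user = c(1, 3),
  time = as.POSIXct(c("2016-12-01 11:50:11", "2016-12-01 01:19:10"), tz = "UTC")
)

# row id so unmatched df1 rows can be restored after filtering
df1$id <- seq_len(nrow(df1))

# pair every df1 row with every df2 row of the same user
pairs <- merge(df1, df2, by = "user", suffixes = c("_1", "_2"))

# keep only the pairs where the df2 time is later than the df1 time
later <- pairs[pairs$time_1 < pairs$time_2, c("id", "time_2")]

# left join back onto df1 so rows without a match get NA
base3 <- merge(df1, later, by = "id", all.x = TRUE)
base3 <- base3[, c("user", "time", "time_2")]
names(base3) <- c("user", "time_1", "time_2")
base3
```

As with the SQL version, a df1 row that has several later df2 times for the same user would appear once per match.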

Part 2 - Time Window

If a match should only count when the df2 time falls more than a given window after the df1 time, you can subtract that many seconds from b.time within the SQL statement as follows:

df3 <- sqldf("SELECT * FROM df1 a 
             LEFT JOIN df2 b ON a.time < (b.time - 10000)
             AND a.user = b.user")[,c(1:2, 4)]
> df3
  user                time              time.1
1    1 2016-12-01 08:53:20 2016-12-01 11:50:11
2    1 2016-12-01 12:45:47                <NA>
3    2 2016-12-01 15:34:54                <NA>
4    3 2016-12-01 00:49:50                <NA>

Note that whatever you subtract from b.time is interpreted in seconds, so the 10000 above is a window of 2 hours, 46 minutes and 40 seconds.
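
To see why the fourth row loses its match under this window while the first row keeps it, the gaps can be checked directly in R. This is only a sketch using the timestamps printed above. The variable names are illustrative, and the "UTC" time zone is an assumption that does not affect the differences.

```r
# row 1: user 1, 08:53:20 vs 11:50:11
t1 <- as.POSIXct("2016-12-01 08:53:20", tz = "UTC")
t2 <- as.POSIXct("2016-12-01 11:50:11", tz = "UTC")
gap_row1 <- as.numeric(difftime(t2, t1, units = "secs"))  # 10611

# row 4: user 3, 00:49:50 vs 01:19:10
t3 <- as.POSIXct("2016-12-01 00:49:50", tz = "UTC")
t4 <- as.POSIXct("2016-12-01 01:19:10", tz = "UTC")
gap_row4 <- as.numeric(difftime(t4, t3, units = "secs"))  # 1760

# a.time < (b.time - 10000) is the same as (b.time - a.time) > 10000
window <- 10000
gap_row1 > window  # TRUE:  row 1 still matches
gap_row4 > window  # FALSE: row 4 becomes NA
```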

Here is a data.table solution.

# load data.table and convert the data.frames to data.tables by reference
library(data.table)
setDT(df1)
setDT(df2)

# add copies of the time columns, perform the non-equi join, then drop the time column used for joining
dfDone <- df2[, time2 := time][df1[, time1 := time],
              on=.(user, time > time)][, time:= NULL]

dfDone
   user               time2               time1
1:    1 2016-12-01 11:50:11 2016-12-01 08:53:20
2:    1                <NA> 2016-12-01 12:45:47
3:    2                <NA> 2016-12-01 15:34:54
4:    3 2016-12-01 01:19:10 2016-12-01 00:49:50

If you want to reorder the columns, you can use setcolorder:

setcolorder(dfDone, c("user", "time1", "time2"))

dfDone
   user               time1               time2
1:    1 2016-12-01 08:53:20 2016-12-01 11:50:11
2:    1 2016-12-01 12:45:47                <NA>
3:    2 2016-12-01 15:34:54                <NA>
4:    3 2016-12-01 00:49:50 2016-12-01 01:19:10
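
The time window from Part 2 can probably be carried over to data.table as well, by joining on a shifted copy of the df2 time. The following is a minimal sketch under assumptions: the input tables are hypothetical, rebuilt from the output above, and the column name cutoff and the result name res are made up for illustration.

```r
library(data.table)

# Hypothetical inputs mirroring the printed results (an assumption)
df1 <- data.table(
  user = c(1, 1, 2, 3),
  time = as.POSIXct(c("2016-12-01 08:53:20", "2016-12-01 12:45:47",
                      "2016-12-01 15:34:54", "2016-12-01 00:49:50"), tz = "UTC")
)
df2 <- data.table(
  user  = c(1, 3),
  time2 = as.POSIXct(c("2016-12-01 11:50:11", "2016-12-01 01:19:10"), tz = "UTC")
)

# keep a copy of the df1 time, because the join column is consumed by the join
df1[, time1 := time]

# shift the df2 time back by the window; POSIXct arithmetic is in seconds
df2[, cutoff := time2 - 10000]

# non-equi join: same user, and df2 time minus 10000 seconds later than df1 time
res <- df2[df1, on = .(user, cutoff > time)][, .(user, time1, time2)]
res
```

With these inputs only the first row keeps a time2 value, which agrees with the sqldf result in Part 2.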