Create new column based on condition that exists within a rolling date

℡╲_俬逩灬. Submitted on 2019-11-30 10:21:29

Assuming that I understood it right, here's a data.table way using the foverlaps() function.

Create dt and set key as shown below:

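library(data.table)

# p, g and d are assumed to be the player ids, games and dates from the question
dt <- data.table(player_id = p, games = g, date = d, end_date = d)
setkey(dt, player_id, date, end_date)

hybrid_index <- function(dt, roll_days) {
    # look-back window for each row: [date - roll_days, end_date]
    ivals <- copy(dt)[, date := date - roll_days]
    # by.x is given explicitly: modifying the key column `date` may drop or shorten ivals' key
    olaps <- foverlaps(ivals, dt, by.x = key(dt), type = "any", which = TRUE)
    # TRUE where a row inside the window has a different game than the window's own row
    olaps[, val := dt$games[xid] != dt$games[yid]]
    # row numbers of dt whose window contains more than one game
    olaps[, any(val), by = xid][(V1), xid]
}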

We create a dummy data.table ivals (for intervals) in which each row holds the start and end dates of its look-back window: the start is date - roll_days and the end is end_date. Note that because end_date is identical to dt$end_date, every row is guaranteed at least one match, namely itself (and this is deliberate) - this'll give you the non-NA version you ask for.
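To make the windows concrete, here's a minimal sketch that builds ivals for roll_days = 1L and prints the start and end of each window. The values of p, g and d are an assumption: I've reconstructed them from the output printed in this answer, since the originals are defined in the question.

```r
library(data.table)

# Sample data reconstructed from the printed output in this answer (an assumption)
p <- c(1, 1, 1, 2, 2, 2, 6, 6, 6)
g <- c("A", "B", "B", "A", "B", "A", "A", "B", "B")
d <- as.Date("2014-10-01") + 0:8

dt <- data.table(player_id = p, games = g, date = d, end_date = d)
setkey(dt, player_id, date, end_date)

roll_days <- 1L
# Each row of ivals is the look-back window [date - roll_days, end_date]
ivals <- copy(dt)[, date := date - roll_days]

# Every window contains the row's own date, so a self-match is guaranteed
self_match <- ivals$date <= dt$date & dt$date <= ivals$end_date
print(ivals[, .(player_id, start = date, end = end_date)])
print(all(self_match))
```

Only the start of the window moves; end_date stays where it was, which is what keeps each row inside its own window.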

[With some minor changes here, you can get the NA version, but I'll leave that to you (assuming this answer is right).]

With that, we simply find which ranges in ivals overlap with dt, for each player_id, and get the matching indices. From there it's straightforward: if the games inside a row's window are not all the same, hybrid_index returns that row's index in dt, and we replace type at those indices with "hybrid".
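If you want to see the matching indices for yourself, the sketch below runs the same steps outside the function for roll_days = 1L. Again, p, g and d are reconstructed from the output printed in this answer (an assumption), and the x_game and y_game columns are names I made up just for inspection.

```r
library(data.table)

# Sample data reconstructed from the printed output in this answer (an assumption)
p <- c(1, 1, 1, 2, 2, 2, 6, 6, 6)
g <- c("A", "B", "B", "A", "B", "A", "A", "B", "B")
d <- as.Date("2014-10-01") + 0:8

dt <- data.table(player_id = p, games = g, date = d, end_date = d)
setkey(dt, player_id, date, end_date)

ivals <- copy(dt)[, date := date - 1L]

# xid is the row of ivals (the window); yid is a row of dt that falls inside it
olaps <- foverlaps(ivals, dt, by.x = key(dt), type = "any", which = TRUE)

# Hypothetical helper columns, added only to make the comparison visible
olaps[, `:=`(x_game = dt$games[xid], y_game = dt$games[yid])]
olaps[, val := x_game != y_game]
print(olaps)

# Rows of dt whose window holds at least one different game
hybrid_rows <- olaps[, any(val), by = xid][(V1), xid]
print(hybrid_rows)
```

Matching is done per player_id because player_id is the first key column, so a window never picks up another player's rows.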

# roll days = 1L
dt[, type := games][hybrid_index(dt, 1L), type := "hybrid"]
#    player_id games       date   end_date   type
# 1:         1     A 2014-10-01 2014-10-01      A
# 2:         1     B 2014-10-02 2014-10-02 hybrid
# 3:         1     B 2014-10-03 2014-10-03      B
# 4:         2     A 2014-10-04 2014-10-04      A
# 5:         2     B 2014-10-05 2014-10-05 hybrid
# 6:         2     A 2014-10-06 2014-10-06 hybrid
# 7:         6     A 2014-10-07 2014-10-07      A
# 8:         6     B 2014-10-08 2014-10-08 hybrid
# 9:         6     B 2014-10-09 2014-10-09      B

# roll days = 2L
dt[, type := games][hybrid_index(dt, 2L), type := "hybrid"]
#    player_id games       date   end_date   type
# 1:         1     A 2014-10-01 2014-10-01      A
# 2:         1     B 2014-10-02 2014-10-02 hybrid
# 3:         1     B 2014-10-03 2014-10-03 hybrid
# 4:         2     A 2014-10-04 2014-10-04      A
# 5:         2     B 2014-10-05 2014-10-05 hybrid
# 6:         2     A 2014-10-06 2014-10-06 hybrid
# 7:         6     A 2014-10-07 2014-10-07      A
# 8:         6     B 2014-10-08 2014-10-08 hybrid
# 9:         6     B 2014-10-09 2014-10-09 hybrid

To illustrate the idea clearly, I've created a function and copied dt inside the function. But you can avoid the copy by adding the window start dates (the date column of ivals) directly to dt as a new column and making use of the by.x and by.y arguments in foverlaps(). Please look at ?foverlaps.
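Here's a rough sketch of what that copy-free variant could look like for a window of 1 day. It's one possible reading of the suggestion, not the only way to do it: start_date is a column name I've introduced for illustration, and p, g and d are reconstructed from the output printed above (an assumption).

```r
library(data.table)

# Sample data reconstructed from the printed output in this answer (an assumption)
p <- c(1, 1, 1, 2, 2, 2, 6, 6, 6)
g <- c("A", "B", "B", "A", "B", "A", "A", "B", "B")
d <- as.Date("2014-10-01") + 0:8

dt <- data.table(player_id = p, games = g, date = d, end_date = d)

# Store the window start on dt itself (hypothetical column name) instead of copying
dt[, start_date := date - 1L]
setkey(dt, player_id, date, end_date)

# x uses the window [start_date, end_date]; y uses the keyed [date, end_date]
olaps <- foverlaps(dt, dt,
                   by.x = c("player_id", "start_date", "end_date"),
                   by.y = c("player_id", "date", "end_date"),
                   type = "any", which = TRUE)

hybrid_rows <- olaps[, any(dt$games[xid] != dt$games[yid]), by = xid][(V1), xid]
dt[, type := games][hybrid_rows, type := "hybrid"]
print(dt)
```

Note that by.y has to line up with the key of the second table, which is why the key is still set on player_id, date, end_date.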
