Select rows around a marker R [duplicate]

Submitted by 自闭症网瘾萝莉.ら on 2021-01-29 21:30:52

Question


I'm trying to select the 100 rows before and after each marker in a relatively large data frame. The markers are sparse, and I haven't been able to work out a solution or find one. This doesn't seem like it should be hard, so I'm probably missing something obvious.

Here's a small, simplified example of what the data looks like:

timestamp talking_yn transition_yn
0.01      n          n
0.02      n          n
0.03      n          n
0.04      n          n
0.05      n          n
0.06      n          n
0.07      n          n
0.08      n          n
0.09      n          n
0.10      n          n
0.11      y          y
0.12      y          n
0.13      y          n
0.14      y          n
0.15      y          n
0.16      y          n
0.17      y          n
0.18      y          n

I've tried methods from a variety of answers (lag from zoo or dplyr), but they all focus on selecting a single row or on subsetting only the rows that carry the marker. For the dummy data above, how would I select the 5 rows before and after the row where transition_yn == 'y'?


Answer 1:


I have a quick function for that:

#' Lead/Lag a logical
#'
#' @param lgl logical vector
#' @param bef integer, number of elements to lead by
#' @param aft integer, number of elements to lag by
#' @return logical, same length as 'lgl'
#' @export
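leadlag <- function(lgl, bef = 1, aft = 1) {
  n <- length(lgl)
  # clamp both window sizes to the range [0, n]
  bef <- min(n, max(0, bef))
  aft <- min(n, max(0, aft))
  # one column per shift: lgl moved b positions earlier, padded with FALSE at the end
  befx <- if (bef > 0) sapply(seq_len(bef), function(b) c(tail(lgl, n = -b), rep(FALSE, b)))
  # one column per shift: lgl moved a positions later, padded with FALSE at the start
  aftx <- if (aft > 0) sapply(seq_len(aft), function(a) c(rep(FALSE, a), head(lgl, n = -a)))
  # a row is selected if it is TRUE in the original or in any shifted copy
  rowSums(cbind(befx, lgl, aftx), na.rm = TRUE) > 0
}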

# keep 2 rows before and 4 rows after each marker (use 5, 5 for the case in the question)
dat[leadlag(dat$transition_yn == 'y', 2, 4),]
#    timestamp talking_yn transition_yn
# 9       0.09          n             n
# 10      0.10          n             n
# 11      0.11          y             y
# 12      0.12          y             n
# 13      0.13          y             n
# 14      0.14          y             n
# 15      0.15          y             n
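
In case it helps to see why this works, here is the shifting idea on its own, as a minimal sketch with a made-up 7-element vector (the names x, lead1, lag1 and keep are only for illustration). Each shifted copy moves the TRUE one position earlier or later, and a row is kept if any copy is TRUE there.

```r
# Toy logical vector with a single marker at position 4 (illustrative data).
x <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)

# Shift left by 1: drop the first element, pad the end with FALSE.
# The TRUE now sits one position earlier, marking the row before the marker.
lead1 <- c(tail(x, n = -1), FALSE)

# Shift right by 1: pad the start with FALSE, drop the last element.
# The TRUE now sits one position later, marking the row after the marker.
lag1 <- c(FALSE, head(x, n = -1))

# Keep a position if it is TRUE in the original or in either shifted copy.
keep <- lead1 | x | lag1
which(keep)
```

leadlag() does the same thing with one shifted copy per step, for every step from 1 to bef and from 1 to aft.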

Data

dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
timestamp talking_yn transition_yn
0.01      n          n
0.02      n          n
0.03      n          n
0.04      n          n
0.05      n          n
0.06      n          n
0.07      n          n
0.08      n          n
0.09      n          n
0.10      n          n
0.11      y          y
0.12      y          n
0.13      y          n
0.14      y          n
0.15      y          n
0.16      y          n
0.17      y          n
0.18      y          n")
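
For a large data frame with sparse markers, building row indices directly may be lighter than creating one shifted copy per step. The following is only a sketch of that alternative, not the function above: rows_around() is a hypothetical helper name, and the data frame is rebuilt here so the block runs on its own. It assumes bef and aft are non-negative whole numbers.

```r
# Stand-in for the example data: 18 rows, marker on row 11.
dat <- data.frame(
  timestamp     = (1:18) / 100,
  talking_yn    = rep(c("n", "y"), times = c(10, 8)),
  transition_yn = replace(rep("n", 18), 11, "y"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: row indices within bef rows before and aft rows
# after every TRUE in 'marker'. Assumes bef >= 0 and aft >= 0.
rows_around <- function(marker, bef = 1, aft = 1) {
  n <- length(marker)
  hits <- which(marker)  # marker positions; which() ignores NA
  if (length(hits) == 0) return(integer(0))
  # one window of indices per marker
  idx <- unlist(lapply(hits, function(h) seq(h - bef, h + aft)))
  # drop indices that fall off either end, merge overlapping windows
  sort(unique(idx[idx >= 1 & idx <= n]))
}

sel <- rows_around(dat$transition_yn == "y", bef = 5, aft = 5)
dat[sel, ]
```

With the marker on row 11, this selects rows 6 through 16. For the real data, bef = 100 and aft = 100 would give the window asked about in the question.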


Source: https://stackoverflow.com/questions/58716917/select-rows-around-a-marker-r
