Subset observations that differ by at least 30 minutes in time

忘掉有多难 2020-12-15 21:16

I have a data.table (~30 million rows) consisting of a datetime column in POSIXct format, an id column and a few other co

3 Answers
  •  爱一瞬间的悲伤
    2020-12-15 21:50

    Using Rcpp:

    library(Rcpp)
    library(inline)
    cppFunction(
      'LogicalVector selecttimes(const NumericVector x) {
       const int n = x.length();
       LogicalVector res(n);
       res(0) = true;
       double testval = x(0);
       for (int i=1; i<n; i++) {
        if (x(i) - testval > 30 * 60) {
          testval = x(i);
          res(i) = true;
        }
       }
       return res;
      }')
    
    DT[, keep1 := selecttimes(datetime), by = id]
    
    DT[, all(keep == keep1)]
    #[1] TRUE
    

    Some additional testing should be done: the function needs input validation, and the time difference could be made a parameter.
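
A rough sketch of those two suggestions (a configurable gap and basic input checks), written as standalone C++ rather than Rcpp so it can be compiled and tested on its own. The name `select_times`, the `min_gap_seconds` parameter, and the particular checks are assumptions for illustration, not part of the answer above. It also assumes the timestamps for one id arrive sorted ascending as seconds since the epoch, which is how POSIXct values are stored.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Greedy selection: keep the first observation, then keep each later one
// only if it lies more than `min_gap_seconds` after the last kept one.
// `times` is assumed to hold the timestamps of ONE id, in seconds,
// sorted in ascending order.
std::vector<bool> select_times(const std::vector<double>& times,
                               double min_gap_seconds = 30.0 * 60.0) {
    // Written as a negated >= so that NaN is rejected too.
    if (!(min_gap_seconds >= 0.0)) {
        throw std::invalid_argument("min_gap_seconds must be non-negative");
    }

    std::vector<bool> keep(times.size(), false);
    if (times.empty()) {
        return keep;  // nothing to select; avoids touching element 0
    }
    if (std::isnan(times[0])) {
        throw std::invalid_argument("missing timestamp");
    }

    keep[0] = true;
    double last_kept = times[0];
    for (std::size_t i = 1; i < times.size(); ++i) {
        if (std::isnan(times[i])) {
            throw std::invalid_argument("missing timestamp");
        }
        if (times[i] < times[i - 1]) {
            throw std::invalid_argument("times must be sorted ascending");
        }
        // Same strict comparison as the loop in the answer.
        if (times[i] - last_kept > min_gap_seconds) {
            last_kept = times[i];
            keep[i] = true;
        }
    }
    return keep;
}
```

The comparison is strict, as in the answer, so an observation exactly 30 minutes after the last kept one is dropped; switching to `>=` would keep it, which may be closer to "at least 30 minutes". Porting this back to Rcpp should mostly be a matter of swapping `std::vector<double>` and `std::vector<bool>` for `NumericVector` and `LogicalVector`.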
