Subtract date from previous row by group (using R)

Submitted by 柔情痞子 on 2020-01-30 12:17:47

Question


My question is similar to this one (subtract value from previous row by group), but I want to subtract the previous date from the current date, by group ID, to get an estimated number of days. I tried adapting the previously suggested scripts by replacing "value" with "date". I have tried several of the suggested methods, but I keep getting this error message: "Error in mutate_impl(.data, dots) : Evaluation error: unable to find an inherited method for function first for signature "POSIXct"."

Data
id      date        
2380    10/30/12    
2380    10/31/12    
2380    11/1/12     
2380    11/2/12     
20100   10/30/12    
20100   10/31/12   
20100   11/1/12     
20100   11/2/12     
20103   10/30/12

I'd like to get this kind of table

Data
id      date        date_difference(in days)
2380    10/30/12    0
2380    10/31/12    1
2380    11/1/12     2
2380    11/2/12     3
20100   10/30/12    0
20100   10/31/12    1
20100   11/1/12     2
20100   11/2/12     3
20103   10/30/12    0
20103   10/31/12    1

Answer 1:


library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

df <- tribble(~id,      ~date,      
2380,    "10/30/12",    
2380,   "10/31/12",    
2380,  "11/1/12",  
2380,    "11/2/12",  
20100,   "10/30/12",    
20100,   "10/31/12",   
20100,   "11/1/12",   
20100,   "11/2/12",   
20103,   "10/30/12",
20103,   "10/31/12")

df %>% 
  mutate(date = mdy(date)) %>% 
  group_by(id) %>% 
  mutate(date_difference = as.numeric(date - first(date)))
#> # A tibble: 10 x 3
#> # Groups:   id [3]
#>       id date       date_difference
#>    <dbl> <date>               <dbl>
#>  1  2380 2012-10-30               0
#>  2  2380 2012-10-31               1
#>  3  2380 2012-11-01               2
#>  4  2380 2012-11-02               3
#>  5 20100 2012-10-30               0
#>  6 20100 2012-10-31               1
#>  7 20100 2012-11-01               2
#>  8 20100 2012-11-02               3
#>  9 20103 2012-10-30               0
#> 10 20103 2012-10-31               1

Created on 2018-11-29 by the reprex package (v0.2.1)
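
The title asks for the difference from the previous row, while the code above measures each date against the first date of its group. The two only coincide on the first two rows of a group, so here is a minimal sketch that computes both side by side. The sample data and the column names since_first and since_prev are made up for illustration. The error quoted in the question comes from S4 method dispatch, which suggests (an assumption, since the asker's session is not shown) that another attached package masks dplyr's first(); for that reason the sketch spells out dplyr::first and dplyr::lag explicitly.

```r
library(dplyr)

# Hypothetical sample data with a gap in the dates, so that the two
# kinds of difference give visibly different results
df <- tibble(
  id   = c(2380, 2380, 2380, 20100, 20100, 20103),
  date = as.Date(c("10/30/12", "10/31/12", "11/2/12",
                   "10/30/12", "11/1/12", "10/30/12"),
                 format = "%m/%d/%y")
)

result <- df %>%
  group_by(id) %>%
  # make sure rows are in date order within each id
  arrange(date, .by_group = TRUE) %>%
  mutate(
    # days elapsed since the first date of the group
    since_first = as.numeric(date - dplyr::first(date)),
    # days elapsed since the previous row; the first row of a group has no
    # predecessor, so lag() falls back to the group's first date and yields 0
    since_prev = as.numeric(date - dplyr::lag(date, default = dplyr::first(date)))
  ) %>%
  ungroup()

print(result)
```

If the real column is POSIXct rather than Date, subtracting two values can come back in hours or seconds, so converting with as.Date() first (or using difftime() with units = "days") keeps the result in days.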




Answer 2:


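First, create a function that computes the difference in days from a reference date:

day_diff <- function(day, start = as.Date("2012-10-30")) {
    # `units` must be passed by name: the third positional argument of
    # difftime() is `tz`, so a bare "days" would be read as a time zone
    as.numeric(difftime(day, start, units = "days"))
}

Then create a new column containing the day differences. This assumes df$date has already been converted to Date, for example with mdy() as in the first answer:

df$date_difference <- day_diff(df$date)

Note that this measures every row against the fixed date 2012-10-30, not against the first date of each id. It matches the expected output only because every group in the sample data starts on that day.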
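
For a base R version that works per group instead of against one fixed date, ave() is one option. The following is only a sketch on made-up sample data, and it assumes the rows are already sorted by date within each id, so that the first element of each group is its earliest date.

```r
# Hypothetical sample data; the second id deliberately starts a day later
df <- data.frame(
  id   = c(2380, 2380, 2380, 20100, 20100, 20103),
  date = as.Date(c("10/30/12", "10/31/12", "11/2/12",
                   "10/31/12", "11/1/12", "10/30/12"),
                 format = "%m/%d/%y")
)

# ave() applies FUN to the values belonging to each id and returns a vector
# aligned with the original rows; as.numeric() turns the Dates into day counts
df$date_difference <- ave(as.numeric(df$date), df$id,
                          FUN = function(x) x - x[1])

print(df)
```

Replacing `x - x[1]` with `c(0, diff(x))` would give the difference from the previous row instead of from the first row of the group.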



Source: https://stackoverflow.com/questions/53543747/substract-date-from-previous-row-by-group-using-r
