Get the (t-1) data within groups

挽巷 2020-12-19 13:58

Apologies if this has been asked before, but I couldn't find a question that answers this exactly. I have data like this:

Project        Date   price
          


        
1 answer
  •  借酒劲吻你
    2020-12-19 14:14

    Here's an option. I'd also recommend using NA instead of 0, because 0 could be an actual price.

    library(dplyr)
    df %>% 
      arrange(as.Date(Date, format = "%d/%m/%Y")) %>%
      group_by(Project) %>%
      mutate(lastPrice = lag(price))
    
    # Source: local data frame [5 x 4]
    # Groups: Project
    # 
    #   Project      Date price lastPrice
    # 1       B 22/2/2013  1642        NA
    # 2       B 19/3/2013  1567      1642
    # 3       A 30/3/2013  2082        NA
    # 4       C 12/4/2013  1575        NA
    # 5       C  5/6/2013  1582      1575
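
    If a 0-filled column really is required, or you would rather not reorder the rows, lag() can cover both through its default and order_by arguments. The following is only a minimal sketch: the sample data frame is reconstructed from the outputs shown in this answer, and the name res is just illustrative.

    ```r
    library(dplyr)

    # Sample data reconstructed from the outputs shown in this answer
    df <- data.frame(
      Project = c("A", "B", "B", "C", "C"),
      Date    = c("30/3/2013", "19/3/2013", "22/2/2013", "12/4/2013", "5/6/2013"),
      price   = c(2082, 1567, 1642, 1575, 1582),
      stringsAsFactors = FALSE
    )

    res <- df %>%
      mutate(Date = as.Date(Date, format = "%d/%m/%Y")) %>%
      group_by(Project) %>%
      # order_by makes lag() follow date order without sorting the rows;
      # default = 0 fills the first row of each group (omit it to get NA)
      mutate(lastPrice = lag(price, order_by = Date, default = 0)) %>%
      ungroup()
    ```

    Here the rows of res stay in their original order, and the earliest row of each project gets 0 rather than NA.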
    

    Another option is to use shift from data.table (v1.9.5 or later, which was the development version when this was written):

    library(data.table) ## v >= 1.9.5
    setDT(df)[order(as.Date(Date, format = "%d/%m/%Y")), 
                    lastPrice := shift(price), 
                    by = Project]
    
    #    Project      Date price lastPrice
    # 1:       A 30/3/2013  2082        NA
    # 2:       B 19/3/2013  1567      1642
    # 3:       B 22/2/2013  1642        NA
    # 4:       C 12/4/2013  1575        NA
    # 5:       C  5/6/2013  1582      1575
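
    Note that the result above is not sorted by date. As a self-contained sketch of why (the data is reconstructed from the outputs in this answer, and dt is just an illustrative name): the ordering in i only controls the order in which shift() sees the rows, while := writes the new column back by reference into the original rows.

    ```r
    library(data.table)

    # Sample data reconstructed from the outputs shown in this answer
    dt <- data.table(
      Project = c("A", "B", "B", "C", "C"),
      Date    = c("30/3/2013", "19/3/2013", "22/2/2013", "12/4/2013", "5/6/2013"),
      price   = c(2082, 1567, 1642, 1575, 1582)
    )

    # Rows are visited in date order within each Project, but dt itself
    # is not reordered; only the lastPrice column is added in place.
    dt[order(as.Date(Date, format = "%d/%m/%Y")),
       lastPrice := shift(price, n = 1L, type = "lag"),
       by = Project]
    ```

    If you also want the table sorted, one option is to call setorder() on a proper Date column afterwards.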
    

    Or with base R:

    df <- df[order(df$Project, as.Date(df$Date, format = "%d/%m/%Y")), ]
    within(df, lastPrice <- ave(price, Project, FUN = function(x) c(NA, x[-length(x)])))
    #   Project      Date price lastPrice
    # 1       A 30/3/2013  2082        NA
    # 3       B 22/2/2013  1642        NA
    # 2       B 19/3/2013  1567      1642
    # 4       C 12/4/2013  1575        NA
    # 5       C  5/6/2013  1582      1575
    

    As a side note, it is better to store your date column as class Date in the first place, so I'd recommend running df$Date <- as.Date(df$Date, format = "%d/%m/%Y") once, up front.
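
    To illustrate the side note, here is a minimal base R sketch that converts the column once and then reuses it. The sample data is reconstructed from the outputs shown in this answer, and head(x, -1) is simply an equivalent way of writing x[-length(x)].

    ```r
    # Sample data reconstructed from the outputs shown in this answer
    df <- data.frame(
      Project = c("A", "B", "B", "C", "C"),
      Date    = c("30/3/2013", "19/3/2013", "22/2/2013", "12/4/2013", "5/6/2013"),
      price   = c(2082, 1567, 1642, 1575, 1582),
      stringsAsFactors = FALSE
    )

    # Convert once; later steps can sort and compare the dates directly
    df$Date <- as.Date(df$Date, format = "%d/%m/%Y")

    # Sort by project, then by date, and take the previous price per project
    df <- df[order(df$Project, df$Date), ]
    df$lastPrice <- ave(df$price, df$Project,
                        FUN = function(x) c(NA, head(x, -1)))
    ```

    With a real Date column, order() no longer needs the as.Date() call repeated in every step.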
