Add value from previous row under conditions

Posted by 会有一股神秘感 on 2019-12-12 03:32:33

Question


I have a data frame called data, and I would like to add a new column that holds the value of another column from the previous row, provided the Id is the same.

Here is a sample:

data <- structure(list(Id = c("a", "b", "b", "b", "a", "a", "b", "b", 
"a", "a"), duration.minutes = c(NA, 139L, 535L, 150L, NA, NA, 
145L, 545L, 144L, NA), event = structure(c(1L, 4L, 3L, 4L, 2L, 
1L, 4L, 3L, 4L, 2L), .Label = c("enter", "exit", "stop", "trip"
), class = "factor")), .Names = c("Id", "duration.minutes", "event"
), class = "data.frame", row.names = 265:274)

and I would like to add a new column called "duration.minutes.past" like this:

data <- structure(list(Id = c("a", "b", "b", "b", "a", "a", "b", "b", 
"a", "a"), duration.minutes = c(NA, 139L, 535L, 150L, NA, NA, 
145L, 545L, 144L, NA), event = structure(c(1L, 4L, 3L, 4L, 2L, 
1L, 4L, 3L, 4L, 2L), .Label = c("enter", "exit", "stop", "trip"
), class = "factor"), duration.minutes.past = c(NA, NA, 139, 
NA, NA, NA, NA, 145, NA, NA)), .Names = c("Id", "duration.minutes", 
"event", "duration.minutes.past"), row.names = 265:274, class = "data.frame")

As you can see, the new column duration.minutes.past holds the duration.minutes of the previous trip for the same Id. If the Id is different, or if the event is not a stop, the value of duration.minutes.past is NA.

Help is much appreciated!


Answer 1:


We can do this with data.table. Convert the 'data.frame' to a 'data.table' with setDT(data), then, grouped by 'Id', create the lag column of 'duration.minutes' using shift. Finally, set the value to NA wherever 'event' is not equal to "stop".

library(data.table)
setDT(data)[, duration.minutes.past := shift(duration.minutes), 
             Id][event != "stop", duration.minutes.past := NA][]
data
#    Id duration.minutes event duration.minutes.past
#1:  a               NA enter                    NA
#2:  b              139  trip                    NA
#3:  b              535  stop                   139
#4:  b              150  trip                    NA
#5:  a               NA  exit                    NA
#6:  a               NA enter                    NA
#7:  b              145  trip                    NA
#8:  b              545  stop                   145
#9:  a              144  trip                    NA
#10: a               NA  exit                    NA

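Or this can be done in base R using ave. The expression NA^(event != "stop") evaluates to NA on rows where the event is not "stop" (NA^1) and to 1 on "stop" rows (NA^0 is 1 in R), so multiplying by it blanks out the lagged value everywhere except on stops: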

data$duration.minutes.past <- with(data, NA^(event != "stop") * 
      ave(duration.minutes, Id, FUN = function(x) c(NA, x[-length(x)])))
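To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. It rebuilds the question's sample data with data.frame() (row names omitted), and the intermediate names lagged and mask are introduced here only for illustration.

```r
# Rebuild the question's sample data (original row names 265:274 omitted).
data <- data.frame(
  Id = c("a", "b", "b", "b", "a", "a", "b", "b", "a", "a"),
  duration.minutes = c(NA, 139L, 535L, 150L, NA, NA, 145L, 545L, 144L, NA),
  event = factor(c("enter", "trip", "stop", "trip", "exit",
                   "enter", "trip", "stop", "trip", "exit")),
  stringsAsFactors = FALSE
)

# Step 1: lag duration.minutes within each Id.
# The first row of every group has no predecessor, so it gets NA.
lagged <- ave(data$duration.minutes, data$Id,
              FUN = function(x) c(NA, x[-length(x)]))

# Step 2: build a mask that is 1 on "stop" rows and NA elsewhere.
# The logical is coerced to 0/1; NA^0 is 1 and NA^1 is NA.
mask <- NA^(data$event != "stop")

# Step 3: multiplying by the mask keeps the lagged value only on "stop" rows.
data$duration.minutes.past <- mask * lagged
```

Note that the multiplication turns the result into a double column, even though duration.minutes is an integer.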



Answer 2:


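A possible solution using dplyr: lag duration.minutes within each Id, then replace the lagged value with NA wherever the event is not 'stop'.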

library(dplyr)

data %>% 
 group_by(Id) %>% 
 mutate(new = replace(lag(duration.minutes), event != 'stop', NA))

#Source: local data frame [10 x 4]
#Groups: Id [2]

#      Id duration.minutes  event   new
#   <chr>            <int> <fctr> <int>
#1      a               NA  enter    NA
#2      b              139   trip    NA
#3      b              535   stop   139
#4      b              150   trip    NA
#5      a               NA   exit    NA
#6      a               NA  enter    NA
#7      b              145   trip    NA
#8      b              545   stop   145
#9      a              144   trip    NA
#10     a               NA   exit    NA
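Since replace() is itself a base R function, the same idea can be sketched without dplyr. The following is only an illustration, assuming the question's sample data rebuilt with data.frame(); the helper name lag_by_one is hypothetical and stands in for dplyr's lag() with its default offset of one.

```r
# Rebuild the question's sample data (original row names 265:274 omitted).
data <- data.frame(
  Id = c("a", "b", "b", "b", "a", "a", "b", "b", "a", "a"),
  duration.minutes = c(NA, 139L, 535L, 150L, NA, NA, 145L, 545L, 144L, NA),
  event = factor(c("enter", "trip", "stop", "trip", "exit",
                   "enter", "trip", "stop", "trip", "exit")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag_by_one <- function(x) c(NA, x[-length(x)])

# Lag within each Id, then overwrite every non-stop row with NA.
lagged <- ave(data$duration.minutes, data$Id, FUN = lag_by_one)
data$new <- replace(lagged, data$event != "stop", NA)
```

Because replace() only overwrites selected elements, the new column stays an integer, matching the <int> type shown in the dplyr output.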


Source: https://stackoverflow.com/questions/43467401/add-value-from-previous-row-under-conditions
