问题
I have a df data
and I would like to add to a new column a value that exist in a previous column and row if the factor is the same.
Here is a sample:
data <- structure(list(Id = c("a", "b", "b", "b", "a", "a", "b", "b",
"a", "a"), duration.minutes = c(NA, 139L, 535L, 150L, NA, NA,
145L, 545L, 144L, NA), event = structure(c(1L, 4L, 3L, 4L, 2L,
1L, 4L, 3L, 4L, 2L), .Label = c("enter", "exit", "stop", "trip"
), class = "factor")), .Names = c("Id", "duration.minutes", "event"
), class = "data.frame", row.names = 265:274)
and I would like to add a new column called "duration.minutes.past" like this:
data <- structure(list(Id = c("a", "b", "b", "b", "a", "a", "b", "b",
"a", "a"), duration.minutes = c(NA, 139L, 535L, 150L, NA, NA,
145L, 545L, 144L, NA), event = structure(c(1L, 4L, 3L, 4L, 2L,
1L, 4L, 3L, 4L, 2L), .Label = c("enter", "exit", "stop", "trip"
), class = "factor"), duration.minutes.past = c(NA, NA, 139,
NA, NA, NA, NA, 145, NA, NA)), .Names = c("Id", "duration.minutes",
"event", "duration.minutes.past"), row.names = 265:274, class = "data.frame")
As you can see, I added in this new column duration.minutes.past
the duration.minutes
of the previous trip
for the same Id
. if the Id
is different or if is it not a stop, then the value for duration.minutes.past
is NA.
Help is much appreciated!
回答1:
We can do this with data.table
. Convert the 'data.frame' to 'data.table' (setDT(data)
), grouped by 'Id', we create the lag
column of 'duration.minutes' using shift
), then change the value to 'NA' where the 'event' is not equal to 'stop'
library(data.table)
setDT(data)[, duration.minutes.past := shift(duration.minutes),
Id][event != "stop", duration.minutes.past := NA][]
data
# Id duration.minutes event duration.minutes.past
#1: a NA enter NA
#2: b 139 trip NA
#3: b 535 stop 139
#4: b 150 trip NA
#5: a NA exit NA
#6: a NA enter NA
#7: b 145 trip NA
#8: b 545 stop 145
#9: a 144 trip NA
#10: a NA exit NA
Or this can be done with base R
using ave
data$duration.minutes.past <- with(data, NA^(event != "stop") *
ave(duration.minutes, Id, FUN = function(x) c(NA, x[-length(x)])))
回答2:
A possible solution using dplyr
,
library(dplyr)
df %>%
group_by(Id) %>%
mutate(new = replace(lag(duration.minutes), event != 'stop', NA))
#Source: local data frame [10 x 4]
#Groups: Id [2]
# Id duration.minutes event new
# <chr> <int> <fctr> <int>
#1 a NA enter NA
#2 b 139 trip NA
#3 b 535 stop 139
#4 b 150 trip NA
#5 a NA exit NA
#6 a NA enter NA
#7 b 145 trip NA
#8 b 545 stop 145
#9 a 144 trip NA
#10 a NA exit NA
来源:https://stackoverflow.com/questions/43467401/add-value-from-previous-row-under-conditions