Replace identical values in consecutive rows and stop replacing once the value changes in R

Question

I want to replace the run of identical consecutive values at the beginning of each trial with 0. Once the value changes, the replacement should stop and the remaining values should be kept. This should happen for every trial within each subject.

For example, the first subject has multiple trials (1, 2, etc.). At the beginning of each trial, there may be some consecutive rows with the same value (e.g., 1, 1, 1). I would like to replace these values with 0. However, once the value has changed from 1 to 0, I want to keep the values in the rest of the trial as they are (e.g., 0, 0, 1).

subject <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
trial <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
value <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1)
df <- data.frame(subject, trial, value)

From the original data frame, I would like to create a new variable (value_new) as shown below.

   subject trial value value_new
1        1     1     1         0
2        1     1     1         0
3        1     1     1         0
4        1     1     0         0
5        1     1     0         0
6        1     1     1         1
7        1     2     1         0
8        1     2     1         0
9        1     2     0         0
10       1     2     1         1
11       1     2     1         1
12       1     2     1         1

I was thinking of using dplyr with group_by(subject, trial) and mutate() to create a new variable with a conditional statement, but I have no idea how to do that. I guess I need rle(), but I don't know how to replace the leading consecutive values with 0, stop replacing once the value changes, and keep the rest of the values.

Any suggestions or advice would be really appreciated!

Answer 1:

You can use rleid() from data.table:

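library(data.table)
# setDT() converts df to a data.table by reference;
# := then adds new_value, computed separately for each subject/trial group
setDT(df)[, new_value := value * +(rleid(value) > 1), .(subject, trial)]
df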

#    subject trial value new_value
# 1:       1     1     1         0
# 2:       1     1     1         0
# 3:       1     1     1         0
# 4:       1     1     0         0
# 5:       1     1     0         0
# 6:       1     1     1         1
# 7:       1     2     1         0
# 8:       1     2     1         0
# 9:       1     2     0         0
#10:       1     2     1         1
#11:       1     2     1         1
#12:       1     2     1         1
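
To see why this works, it may help to unpack the expression. rleid(value) numbers the runs of identical consecutive values within each group, so rleid(value) > 1 is FALSE for the first run and TRUE afterwards, and multiplying by it zeroes only the first run. The sketch below reproduces that logic in base R. Note that run_id() and the run column are hypothetical names written for this illustration to mimic rleid() on a single vector; they are not part of data.table.

```r
# Example data from the question
df <- data.frame(
  subject = rep(1, 12),
  trial   = rep(c(1, 2), each = 6),
  value   = c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1)
)

# Hypothetical helper mimicking rleid() for one vector:
# the ID starts at 1 and increases by 1 every time the value changes.
run_id <- function(v) {
  if (length(v) == 0) return(v)  # ave() may pass empty groups
  cumsum(c(TRUE, v[-1] != v[-length(v)]))
}

# Run IDs computed separately within each subject/trial group
df$run <- ave(df$value, df$subject, df$trial, FUN = run_id)

# (run > 1) is FALSE in the first run and TRUE afterwards, so the
# multiplication zeroes the first run and leaves later values unchanged
df$new_value <- df$value * (df$run > 1)
df
```

The run column is kept only to make the intermediate step visible; it can be dropped once new_value has been checked.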

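You can also use the same expression with dplyr; rleid() still comes from data.table:

library(dplyr)

df %>%
  group_by(subject, trial) %>%
  # rleid() is a data.table function, so it is called with its namespace here
  mutate(new_value = value * +(data.table::rleid(value) > 1))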
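
Since the question mentions rle(), here is a rough base R sketch along those lines that needs no extra packages. It run-length encodes each trial, overwrites the value of the first run with 0, and decodes the result back to full length. zero_first_run() is a hypothetical helper name chosen for this example, and the code assumes value contains no NA entries (rle() treats each NA as its own run).

```r
# Example data from the question
df <- data.frame(
  subject = rep(1, 12),
  trial   = rep(c(1, 2), each = 6),
  value   = c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1)
)

# Hypothetical helper: set the first run of a vector to 0, keep the rest
zero_first_run <- function(v) {
  if (length(v) == 0) return(v)  # ave() may pass empty groups
  r <- rle(v)                    # run lengths and run values
  r$values[1] <- 0               # overwrite only the first run
  inverse.rle(r)                 # expand back to the original length
}

# Apply the helper within each subject/trial group
df$value_new <- ave(df$value, df$subject, df$trial, FUN = zero_first_run)
df
```

Unlike the multiplication trick, this variant never touches values after the first run, so it behaves the same way if value holds something other than 0 and 1.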

Source: https://stackoverflow.com/questions/62808087/replace-the-same-values-in-the-consecutive-rows-and-stop-replacing-once-the-valu

Tags

replace

conditional-statements