Question
I have a sensor that measures a variable, and when there is no connection it always returns the last value seen instead of NA. So in my vector I would like to replace these identical values with an imputed value (for example with na.approx).
set.seed(3)
vec <- round(runif(20)*10)
#### [1] 2 8 4 3 6 6 1 3 6 6 5 5 5 6 9 8 1 7 9 3
But I only want to tag sequences longer than 2 (3 or more identical numbers), because 2 identical numbers can appear naturally. (In the example above, the sequence to tag would be 5 5 5.)
I tried using diff to tag the identical points (c(0, diff(vec) == 0)), but I don't know how to handle the length == 2 condition.
EDIT: my expected output would look like this:
#### [1] 2 8 4 3 6 6 1 3 6 6 5 NA NA 6 9 8 1 7 9 3
(The second identical value in a sequence of 3 or more is very probably a wrong value too.)
Thanks
Answer 1:
You can use rle to get the indices of the positions where NA should be assigned.
# For each run longer than 2, index every element after the first one
vec[with(data = rle(vec),
         expr = unlist(sapply(which(lengths > 2), function(i)
           (sum(lengths[1:i]) - (lengths[i] - 2)):sum(lengths[1:i]))))] = NA
vec
#[1] 2 8 4 3 6 6 1 3 6 6 5 NA NA 6 9 8 1 7 9 3
As a function:
foo = function(X, length){
  replace(x = X,
          list = with(data = rle(X),
                      expr = unlist(sapply(which(lengths > length), function(i)
                        (sum(lengths[1:i]) - (lengths[i] - length)):sum(lengths[1:i])))),
          values = NA)
}
foo(X = vec, length = 2)
#[1] 2 8 4 3 6 6 1 3 6 6 5 NA NA 6 9 8 1 7 9 3
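The same rle idea can be written without the per-run loop, and the tagged values can then be imputed as the question asks. This is only a minimal sketch under stated assumptions: tag_repeats and min_run are hypothetical names that are not part of the original answer, the input vector is hard-coded from the question's output, and base R's approx is used as a stand-in for na.approx so that no extra package is needed.

```r
# Hypothetical helper (not from the original answer): set every value after
# the first one to NA in each run of at least `min_run` identical values.
tag_repeats <- function(x, min_run = 3) {
  r <- rle(x)
  run_len <- rep(r$lengths, r$lengths)  # length of the run each element belongs to
  pos     <- sequence(r$lengths)        # position of each element inside its run
  x[run_len >= min_run & pos > 1] <- NA
  x
}

# Values taken from the question
vec <- c(2, 8, 4, 3, 6, 6, 1, 3, 6, 6, 5, 5, 5, 6, 9, 8, 1, 7, 9, 3)

tagged <- tag_repeats(vec)
tagged
# [1]  2  8  4  3  6  6  1  3  6  6  5 NA NA  6  9  8  1  7  9  3

# Linear interpolation over the NA gaps with base R's approx()
idx    <- seq_along(tagged)
ok     <- !is.na(tagged)
filled <- approx(x = idx[ok], y = tagged[ok], xout = idx)$y
round(filled[11:14], 2)
# [1] 5.00 5.33 5.67 6.00
```

Because rep and sequence work on the whole vector at once, there is no cumulative sum to track by hand, and pairs of identical values are left untouched.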
Answer 2:
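You can use the lag function (presumably dplyr::lag, given the output; base R's stats::lag does not shift a plain vector). Note that this tags only the third and later values of a run, so the second 5 is kept: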
> library(dplyr)  # assumption: lag() here is dplyr::lag, which matches the output below
> set.seed(3)
> vec <- round(runif(20)*10)
> vec
[1] 2 8 4 3 6 6 1 3 6 6 5 5 5 6 9 8 1 7 9 3
> vec[vec == lag(vec) & vec == lag(vec,2)] <- NA
> vec
[1] 2 8 4 3 6 6 1 3 6 6 5 5 NA 6 9 8 1 7 9 3
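The lag comparison above leaves the second value of a run in place, while the expected output in the question replaces it too. One way to cover that case is to also look one step ahead. The following is a rough sketch in base R only, so it does not depend on any package's lag; shift is a hypothetical helper written for this example, and the input vector is hard-coded from the question.

```r
# Hypothetical helper: shift x by n positions and pad with NA.
# n > 0 looks back (lag), n < 0 looks ahead (lead).
shift <- function(x, n) {
  len <- length(x)
  if (n >= 0) c(rep(NA, n), head(x, len - n))
  else        c(tail(x, len + n), rep(NA, -n))
}

# Values taken from the question
vec <- c(2, 8, 4, 3, 6, 6, 1, 3, 6, 6, 5, 5, 5, 6, 9, 8, 1, 7, 9, 3)

same_prev  <- vec == shift(vec, 1)   # equal to the previous value
same_prev2 <- vec == shift(vec, 2)   # equal to the value two steps back
same_next  <- vec == shift(vec, -1)  # equal to the next value

# A value is a repeat inside a run of 3 or more if it equals its predecessor
# and the run extends at least one more step backwards or forwards.
tag <- same_prev & (same_prev2 | same_next)

out <- vec
out[which(tag)] <- NA                # which() drops the NA comparisons at the edges
out
# [1]  2  8  4  3  6  6  1  3  6  6  5 NA NA  6  9  8  1  7  9  3
```

The first value of each run never equals its predecessor, so it is always kept, and a plain pair such as 6 6 fails both the look-back and the look-ahead test.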
Source: https://stackoverflow.com/questions/44574305/replace-sequence-of-identical-values-of-length-2