Question
What's a clever (i.e., not a loop) way to get the length of each run of missing values in a vector? My ideal output is a vector of the same length, in which each missing value is replaced by the length of the run of missing values it belongs to, and all other values are 0.
So, for input like:
x <- c(2,6,1,2,NA,NA,NA,3,4,NA,NA)
I'd like output like:
y <- c(0,0,0,0,3,3,3,0,0,2,2)
Answer 1:
One simple option using rle:
> m <- rle(is.na(x))
> rep(ifelse(m$values, m$lengths, 0), times = m$lengths)
[1] 0 0 0 0 3 3 3 0 0 2 2
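To make the steps explicit, here is a minimal sketch that wraps the same rle() approach in a function. The name na_run_lengths is hypothetical and used only for illustration. rle() collapses is.na(x) into runs, ifelse() keeps the length of each NA run and zeroes the others, and rep() expands the result back to the original length.

```r
# Hypothetical helper name; the logic is the rle() approach shown above.
na_run_lengths <- function(x) {
  runs <- rle(is.na(x))                      # runs of TRUE (NA) and FALSE (non-NA)
  per_run <- ifelse(runs$values, runs$lengths, 0L)  # keep lengths of NA runs only
  rep(per_run, times = runs$lengths)         # expand back to length(x)
}

x <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)
na_run_lengths(x)
# [1] 0 0 0 0 3 3 3 0 0 2 2
```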
Answer 2:
I was independently working on something using rle() and either cumsum() or dplyr group_by() and n() to get group-lengths of NAs:
> x2 <- as.numeric(is.na(x))
> x2
[1] 0 0 0 0 1 1 1 0 0 1 1
> rle(x2)
Run Length Encoding
lengths: int [1:4] 4 3 2 2
values : num [1:4] 0 1 0 1
# Now we can assign group-numbers...
> cumsum(c(diff(x2)==+1,0)) * x2
[1] 0 0 0 0 1 1 1 0 0 2 2
# ...then get group-lengths from counting those...
> rle(cumsum(c(diff(x2)==+1,0)) * x2)
Run Length Encoding
lengths: int [1:4] 4 3 2 2
values : num [1:4] 0 1 0 2
We could kludge the run lengths together from these group numbers, but the result won't be as compact or elegant as @joran's solution.
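One possible way to finish the cumsum() idea, offered as a sketch rather than the answerer's own code: count how often each group number occurs, then look the counts up by group. The names grp, counts, and y are illustrative. This version pads x2 with a leading 0 before diff(), a small change from the transcript above, so that a run of NAs at the very start of the vector also gets its own group number.

```r
x  <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)
x2 <- as.numeric(is.na(x))

# Group number for each NA run (0 for non-NA positions).
# Padding with a leading 0 lets diff() see a run that starts at position 1.
grp <- cumsum(diff(c(0, x2)) == 1) * x2

# tabulate() counts occurrences of 1, 2, ... and ignores the zeros.
counts <- tabulate(grp)

# Prepend 0 so that group 0 (non-NA) maps to a length of 0.
y <- c(0, counts)[grp + 1]
y
# [1] 0 0 0 0 3 3 3 0 0 2 2
```

As the answer itself notes, this is longer than the rle() one-liner. Its main appeal is that grp is also available if the run identifiers are needed for something else.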
Answer 3:
Here is another option, using rleid (from data.table) and ave:
library(data.table)
ave(x, rleid(is.na(x)), FUN = length)*is.na(x)
#[1] 0 0 0 0 3 3 3 0 0 2 2
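If loading data.table only for rleid is undesirable, the run identifiers can be built in base R. The following is a sketch under that assumption, with run_id and y3 as illustrative names. It gives every run (NA or not) a consecutive id, lets ave() replace each element with the size of its run, and then multiplies by is.na(x) to zero out the non-NA runs.

```r
x <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)

# Base R stand-in for rleid(is.na(x)): one consecutive id per run.
runs   <- rle(is.na(x))
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)
run_id
# [1] 1 1 1 1 2 2 2 3 3 4 4

# ave() fills each position with its run's size; is.na(x) masks non-NA runs.
y3 <- ave(x, run_id, FUN = length) * is.na(x)
y3
# [1] 0 0 0 0 3 3 3 0 0 2 2
```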
Source: https://stackoverflow.com/questions/42936958/get-length-of-runs-of-missing-values-in-vector