Counting consecutive values in rows in R

Submitted by 别来无恙 on 2021-01-29 08:56:37

Question


I have a panel (time series) data frame with a specific ID in the first column and a weekly employment status: unemployed (1) or employed (0).

I have 261 variables (the weeks of each year) and 1,000,000 observations.

I would like to count the maximum number of times '1' occurs consecutively for every row in R.

I have looked a bit at rowSums and rle(), but as far as I know I am not interested in the sum of the row, because it is very important that the values are consecutive.

You can see an example of the structure of my data set here - just imagine more rows and columns


Answer 1:


We can write a small helper function that returns the maximum number of times a given value is consecutively repeated in a vector, with a default value of 1 for this use case:

most_consecutive_val = function(x, val = 1) {
  with(rle(x), max(lengths[values == val]))
}
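To see why this works, it can help to look at what rle() returns for a single row. The snippet below is a minimal sketch on a made-up vector (the name x and its values are purely illustrative, not taken from the asker's data): rle() splits the vector into runs, and the helper keeps only the run lengths whose value matches val.

```r
# Hypothetical weekly statuses for one person (illustrative values only)
x <- c(1, 1, 0, 1, 1, 1)

# rle() encodes the vector as runs: a length and a value for each run
runs <- rle(x)
runs$lengths                          # 2 1 3
runs$values                           # 1 0 1

# Keep only the runs of 1s, then take the longest one
ones <- runs$lengths[runs$values == 1]
ones                                  # 2 3
longest <- max(ones)
longest                               # 3
```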

Then we can apply this function to the rows of your data frame, dropping the first column (and any other columns that shouldn't be included):

apply(your_data_frame[-1], MARGIN = 1, most_consecutive_val)
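As a quick check of the row-wise call, here is a small sketch on a toy data frame. The data frame toy, its column names, and its values are assumptions made up for illustration; the real data would have 261 weekly columns.

```r
# Helper from the answer: longest run of `val` in a vector
most_consecutive_val <- function(x, val = 1) {
  with(rle(x), max(lengths[values == val]))
}

# Hypothetical toy data: an ID column plus six weekly status columns
toy <- data.frame(
  id = c("a", "b", "c"),
  w1 = c(1, 0, 1),
  w2 = c(1, 1, 0),
  w3 = c(0, 1, 1),
  w4 = c(1, 1, 0),
  w5 = c(1, 1, 1),
  w6 = c(1, 0, 0)
)

# Drop the ID column, then apply the helper to each row (MARGIN = 1)
longest_spell <- apply(toy[-1], MARGIN = 1, most_consecutive_val)
longest_spell  # longest runs of 1 per row: 3, 4, 1
```

Dropping the ID column first matters: apply() converts the data frame to a matrix, and a character ID column would turn every value into a string.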

If you share some easily imported sample data, I'll be happy to help debug in case there are issues. dput is an easy way to share a copy/pasteable subset of data. For example, dput(your_data[1:5, 1:10]) would be a great way to share the first 5 rows and 10 columns of your data.


If you want to avoid warnings and -Inf results in the case where there are no 1s, use Ryan's suggestion from the comments:

most_consecutive_val = function(x, val = 1) {
  with(rle(x), if(all(values != val)) 0 else max(lengths[values == val]))
}
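The difference shows up on a row that never contains the value. A brief sketch, using made-up vectors (never_unemployed and mixed are illustrative names, not part of the original data):

```r
# Guarded helper: returns 0 when `val` never occurs in x
most_consecutive_val <- function(x, val = 1) {
  with(rle(x), if (all(values != val)) 0 else max(lengths[values == val]))
}

# Hypothetical rows (illustrative values only)
never_unemployed <- c(0, 0, 0, 0)
mixed <- c(0, 1, 1, 0, 1)

most_consecutive_val(never_unemployed)  # 0, with no warning
most_consecutive_val(mixed)             # 2
most_consecutive_val(mixed, val = 0)    # 1
```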


Source: https://stackoverflow.com/questions/50748483/counting-consecutive-values-in-rows-in-r
