counting islands in R csv

[亡魂溺海] submitted on 2019-12-24 05:04:48

Question


I would like to count islands along the rows of a .csv file. By "islands" I mean runs of consecutive non-blank entries within a row of the .csv. A run of three consecutive non-blank entries should count as 1 island; any run shorter than three counts as 1 "non-island". I would then like to write the output to a data frame. Example input:

Name,,,,,,,,,,,,,
Michael,,,1,1,1,,,,,,,,
Peter,,,,1,1,,,,,,,,,
John,,,,,1,,,,,,,,,

Desired dataframe output:

Name,island,nonisland,
Michael,1,0,
Peter,0,1,
John,0,1,

Answer 1:


You could use rle like this:

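# Assumes the names are the row names of df, e.g. after read.csv(..., row.names = 1)
# Run rle over each row and count the runs of length 3 or more
output <- stack(sapply(apply(df, 1, rle), function(x) sum(x$lengths >= 3)))
names(output) <- c("island", "name")

# Flag a row as a non-island when it contains no island
output$nonisland <- 0
output$nonisland[output$island == 0] <- 1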
#  island    name nonisland
#1      1 Michael         0
#2      0   Peter         1
#3      0    John         1

Here rle is run across each row of the data frame, and the runs with a length of 3 or more are counted per row. A row with no such run is then flagged as a non-island.
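
To see what rle returns for a single row, here is a minimal sketch. It assumes blank cells were read in as NA, and the vector `row` is a hypothetical stand-in for the "Michael" row rather than the real data:

```r
# Hypothetical row: two blanks, three 1's, then two blanks
row <- c(NA, NA, 1, 1, 1, NA, NA)

r <- rle(row)
r$lengths             # 1 1 3 1 1: rle treats each NA as its own run of length 1
r$values              # NA NA 1 NA NA

# Number of islands in this row: runs of length 3 or more
sum(r$lengths >= 3)   # 1
```

Because consecutive NA values are never merged into one run, a stretch of blank cells cannot be mistaken for an island under this assumption. If blanks were read as empty strings instead, they would form runs of their own and would need to be converted to NA first.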

Note that this solution assumes every island is made up of the same value (i.e. all 1's, as in your example). If that is not the case, first convert all the non-empty entries to a single value, for example with df[!is.na(df)] <- 1, and only then apply rle.
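
The conversion can be combined with a per-row count of both kinds of run. The following is only a sketch under stated assumptions: the data frame `df` is made up for illustration (names as row names, blanks as NA, mixed non-blank values), and `count_runs` is a hypothetical helper, not part of the answer above. Unlike the 0/1 flag used earlier, it counts every short run as a separate non-island:

```r
# Hypothetical data: names as row names, blanks as NA, mixed non-blank values
m <- rbind(
  Michael = c(NA, 1, 2, 3, NA),
  Peter   = c(NA, NA, 1, 5, NA),
  John    = c(NA, NA, NA, 1, NA)
)
df <- as.data.frame(m)

# Make every non-blank entry the same value so rle can merge neighbours
df[!is.na(df)] <- 1

# Hypothetical helper: count long and short runs of non-blank cells in one row
count_runs <- function(x) {
  r <- rle(x)
  len <- r$lengths[!is.na(r$values)]   # drop the runs made of blank (NA) cells
  c(island = sum(len >= 3), nonisland = sum(len < 3))
}

# apply() returns one column per row of df, so transpose the result
counts <- t(apply(df, 1, count_runs))
output <- data.frame(name = rownames(counts), counts, row.names = NULL)
output
```

Without the conversion step, the values 1, 2, 3 in the first row would be treated as three separate runs of length 1, and no island would be found.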



Source: https://stackoverflow.com/questions/30654489/counting-islands-in-r-csv
