na.locf using group_by from dplyr

Submitted by 半世苍凉 on 2019-12-10 14:24:43

Question


I'm trying to use na.locf from the zoo package on grouped data with dplyr. I'm using the first solution from this question: "Using dplyr window-functions to make trailing values (fill in NA values)"

library(dplyr)
library(zoo)
df1 <- data.frame(
  id      = rep(c("A", "B"), each = 3),
  problem = c(1, NA, 2, NA, NA, NA),
  ok      = c(NA, 3, 4, 5, 6, NA)
)
df1
  id problem ok
1  A       1 NA
2  A      NA  3
3  A       2  4
4  B      NA  5
5  B      NA  6
6  B      NA NA

The problem occurs when every value within a group is NA. As the problem column shows, the values filled in for id=B come from another group: the last value of id=A.

df1 %>% group_by(id) %>% na.locf()

Source: local data frame [6 x 3]
Groups: id [2]

     id problem    ok
  <chr>   <chr> <chr>
1     A       1  <NA>
2     A       1     3
3     A       2     4
4     B       2     5 #problem col is wrong
5     B       2     6 #problem col is wrong
6     B       2     6 #problem col is wrong

This is my expected result. The data for id=B should be independent of what is in id=A.

     id problem    ok
  <chr>   <chr> <chr>
1     A       1  <NA>
2     A       1     3
3     A       2     4
4     B      NA     5
5     B      NA     6
6     B      NA     6
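
The same carry-over can be reproduced on the bare problem column, which may help show where the unwanted values come from. The sketch below is only an illustration: it uses base R's ave() to apply the fill per id, which is not part of the original attempt, and the names whole and by_group are introduced here for the example.

```r
library(zoo)

id      <- rep(c("A", "B"), each = 3)
problem <- c(1, NA, 2, NA, NA, NA)

# Filling the column as one vector: the last value of A (2) runs on into B
whole <- na.locf(problem, na.rm = FALSE)

# Filling each id separately with base R's ave() (an assumption of this sketch):
# B has no value to carry forward, so it stays NA
by_group <- ave(problem, id, FUN = function(x) na.locf(x, na.rm = FALSE))

whole     # 1 1 2 2 2 2
by_group  # 1 1 2 NA NA NA
```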

Answer 1:


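na.locf can be applied directly to a whole data frame, but when it is, it does not follow the grouping: even though the data is grouped by 'id', the fill runs over the full dataset. To fill within each group, we need to call na.locf on each column inside mutate_all. Passing na.rm = FALSE keeps leading NAs, so each column keeps its original length: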

df1 %>%
     group_by(id) %>%
     mutate_all(funs(na.locf(., na.rm = FALSE)))
#    id problem    ok
#  <fctr>   <dbl> <dbl>
#1      A       1    NA
#2      A       1     3
#3      A       2     4
#4      B      NA     5
#5      B      NA     6
#6      B      NA     6
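
In current dplyr, funs() is deprecated (since 0.8.0) and mutate_all() is superseded by across() (since 1.0.0), so the same per-group fill might be written as below. This is a sketch assuming dplyr 1.0.0 or later; the result name filled is introduced here for illustration and is not part of the original answer.

```r
library(dplyr)
library(zoo)

df1 <- data.frame(
  id      = rep(c("A", "B"), each = 3),
  problem = c(1, NA, 2, NA, NA, NA),
  ok      = c(NA, 3, 4, 5, 6, NA)
)

# Inside a grouped mutate(), across(everything(), ...) skips the grouping
# column, so na.locf() runs on each remaining column one group at a time
filled <- df1 %>%
  group_by(id) %>%
  mutate(across(everything(), ~ na.locf(.x, na.rm = FALSE))) %>%
  ungroup()

filled$problem  # 1 1 2 NA NA NA
filled$ok       # NA 3 4 5 6 6
```

If tidyr is available, tidyr::fill(problem, ok) applied to the grouped data frame is another possible route, since it also fills downward within each group by default.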


Source: https://stackoverflow.com/questions/43212308/na-locf-using-group-by-from-dplyr
