Using tidyr::complete with group_by

Posted by 限于喜欢 on 2019-11-28 08:44:54

Question


Does anyone know if tidyr::complete() supports grouping via group_by()?

To be precise: I have a data frame that looks like this:

library(dplyr)  # provides %>% and group_by()
library(tidyr)  # provides complete()

df <- data.frame(
  "ID"   = rep(1:2, each = 2),
  "Col1" = c("A", NA, "AA", NA),
  "Col2" = c("B", "C", "BB", "CC"))

Now I'd like to use complete() together with group_by() to compute all possible combinations of Col1 and Col2 within each group:

df %>% 
 group_by(ID) %>% 
 complete(Col1, Col2)

  Error in .Call("dplyr_left_join_impl", PACKAGE = "dplyr", x, y, by_x,  : 
  negative length vectors are not allowed

This causes an error. Using complete() without grouping works, but that's not what I want:

df %>% 
 complete(Col1, Col2)

Questions:

  1. Have I done anything wrong, or does complete() simply not work with group_by?
  2. If so, how could I do this instead (preferably without using a loop)?

Answer 1:


You can do it with complete() and group_by(), but you have to wrap the call in do():

df %>% 
 group_by(ID) %>% 
 do(complete(., Col1, Col2, fill = list(ID = .$ID)))



Answer 2:


We could do this using data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), and Cross Join (CJ) the unique elements of 'Col1' and 'Col2', grouped by 'ID'.

library(data.table)  # v1.9.6+
setDT(df)[, CJ(Col1, Col2, unique = TRUE), by = ID]
#   ID V1 V2
#1:  1 NA  B
#2:  1 NA  C
#3:  1  A  B
#4:  1  A  C
#5:  2 NA BB
#6:  2 NA CC
#7:  2 AA BB
#8:  2 AA CC



Answer 3:


Just wanted to let everyone know that with the development version of tidyr (version 0.3.1.9000 as of 13.01.2016), all tidyr verbs now respect grouping, so a workaround using dplyr::do is no longer necessary. I will edit my answer once that version is available on CRAN.
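
For reference, here is a minimal sketch of what the grouped call from the question might look like once tidyr respects grouping. It assumes a recent CRAN release of both dplyr and tidyr (the answer above only mentions the development version, so the exact release that first shipped this is not stated here). stringsAsFactors = FALSE is an addition not in the original data: it keeps the columns as character, since factor columns would be expanded over all of their levels rather than only the values seen in each group.

```r
library(dplyr)
library(tidyr)

# Same data as in the question; stringsAsFactors = FALSE is an assumption
# added here so Col1 and Col2 stay character vectors on any R version.
df <- data.frame(
  ID   = rep(1:2, each = 2),
  Col1 = c("A", NA, "AA", NA),
  Col2 = c("B", "C", "BB", "CC"),
  stringsAsFactors = FALSE
)

# complete() is applied within each ID group, so only the Col1/Col2
# values observed in that group are combined.
out <- df %>%
  group_by(ID) %>%
  complete(Col1, Col2) %>%
  ungroup()

print(out)
```

Note that NA in Col1 is treated as a value in its own right, which matches the data.table result shown in Answer 2.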



Source: https://stackoverflow.com/questions/32973050/using-tidyrcomplete-with-group-by
