R dplyr. Filter a dataframe that contains a column of numeric vectors

Posted by 丶灬走出姿态 on 2021-02-19 04:22:07

Question


I have a dataframe in which one column contains numeric vectors. I want to filter rows based on a condition involving that column. This is a simplified example.

df <- data.frame(id = LETTERS[1:3], name = c("Alice", "Bob", "Carol"))
mylist <- list(c(1, 2, 3), c(4, 5), c(1, 3, 4))
df$numvecs <- mylist
df
#   id  name   numvecs
# 1  A  Alice  1, 2, 3
# 2  B  Bob    4, 5
# 3  C  Carol  1, 3, 4

I can use something like mapply, e.g.:

mapply(function(x,y) x=="B" & 4 %in% y, df$id, df$numvecs)

which correctly returns TRUE for the second row, and FALSE for rows 1 and 3.

However, I would prefer to use dplyr's filter rather than mapply, but I can't get filter to operate correctly on the numvecs column. Instead of returning two rows, the following returns no rows.

filter(df, 4 %in% numvecs)
# [1] id      name    numvecs
# <0 rows> (or 0-length row.names)

What am I missing here? How can I filter on a conditional expression involving the numvecs column?

Ideally I'd also like to use the standard-evaluation variant filter_, so that I can pass the filter condition as an argument. Any help appreciated. Thanks.


Answer 1:


We can still use mapply inside filter:

filter(df, mapply(function(x,y) x == "B" & 4 %in% y, id, numvecs))
#  id name numvecs
#1  B  Bob    4, 5

Or use map from purrr:

library(purrr)
filter(df, unlist(map(numvecs, ~4 %in% .x)))
#  id  name numvecs
#1  B   Bob    4, 5
#2  C Carol 1, 3, 4
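
As a possible variant of the purrr approach, map_lgl returns a logical vector directly, so the unlist step is not needed. The sketch below rebuilds the example data so that it runs on its own, and it assumes dplyr and purrr are installed; the name res is only for illustration.

```r
library(dplyr)
library(purrr)

# Rebuild the example data frame with a list column of numeric vectors
df <- data.frame(id = LETTERS[1:3], name = c("Alice", "Bob", "Carol"))
df$numvecs <- list(c(1, 2, 3), c(4, 5), c(1, 3, 4))

# map_lgl applies the test to each element and returns one TRUE/FALSE per row
res <- filter(df, map_lgl(numvecs, ~ 4 %in% .x))
res
```

map_lgl also raises an error if any element does not produce a single logical value, which can surface mistakes earlier than unlist would.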

Or we can do the same thing in a pipe chain:

df %>%
    .$numvecs %>%
    map(~ 4 %in% .x) %>%
    unlist %>%
    df[., ]
#  id  name numvecs
#2  B   Bob    4, 5
#3  C Carol 1, 3, 4
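
On the "what am I missing" part of the question: as far as I understand it, numvecs is a list column, so 4 %in% numvecs compares 4 against the list as a whole rather than against each vector inside it. The result is a single FALSE, which filter then applies to every row. A minimal base R sketch, rebuilding the example data; whole_column and per_row are names chosen here for illustration.

```r
# Rebuild the example data frame with a list column of numeric vectors
df <- data.frame(id = LETTERS[1:3], name = c("Alice", "Bob", "Carol"))
df$numvecs <- list(c(1, 2, 3), c(4, 5), c(1, 3, 4))

# %in% treats the list column as one table of three elements,
# none of which is the single value 4
whole_column <- 4 %in% df$numvecs
length(whole_column)  # 1, not one value per row
whole_column          # FALSE

# Testing each element separately gives one logical value per row
per_row <- vapply(df$numvecs, function(v) 4 %in% v, logical(1))
per_row               # FALSE TRUE TRUE
```

This is why each working approach wraps the test in mapply, map or sapply: the condition has to be evaluated once per list element.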



Answer 2:


You can use sapply on the numvecs column to create a logical vector for subsetting:

library(dplyr)
filter(df, sapply(numvecs, function(vec) 4 %in% vec), id == "B")
#   id name numvecs
# 1  B  Bob    4, 5

filter(df, sapply(numvecs, function(vec) 4 %in% vec))
#   id  name numvecs
# 1  B   Bob    4, 5
# 2  C Carol 1, 3, 4
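
On passing the filter condition as an argument: filter_ was deprecated in dplyr 0.7, and the usual replacement is tidy evaluation. The following is only a sketch, assuming a recent dplyr (with rlang 0.4 or later for the {{ }} operator); filter_rows is a hypothetical helper name and not part of dplyr.

```r
library(dplyr)

# Rebuild the example data frame with a list column of numeric vectors
df <- data.frame(id = LETTERS[1:3], name = c("Alice", "Bob", "Carol"))
df$numvecs <- list(c(1, 2, 3), c(4, 5), c(1, 3, 4))

# Hypothetical wrapper: forwards the caller's unevaluated condition to filter()
filter_rows <- function(data, condition) {
  filter(data, {{ condition }})
}
res_fn <- filter_rows(df, sapply(numvecs, function(vec) 4 %in% vec))

# Alternatively, store the condition as a quoted expression and unquote it with !!
cond <- quote(sapply(numvecs, function(vec) 4 %in% vec) & id == "B")
res_expr <- filter(df, !!cond)
```

Here res_fn should hold rows B and C, while res_expr should hold only row B, matching the two sapply examples above.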


Source: https://stackoverflow.com/questions/38677497/r-dplyr-filter-a-dataframe-that-contains-a-column-of-numeric-vectors
