Using a loop (or vectorisation) to subset a list by multiple elements in a vector

Submitted by 风流意气都作罢 on 2019-12-23 18:28:46

Question


I have a list of 3 data.frames:

my_list <- list(
  a = data.frame(value = 1:5, class = c(letters[1:3], "a", "b")),
  b = data.frame(value = 6:1, class = c(letters[1:4], "a", "b")),
  c = data.frame(value = 1:7, class = c(letters[5:1], "a", "b"))
)

my_list

$a
  value class
1     1     a
2     2     b
3     3     c
4     4     a
5     5     b

$b
  value class
1     6     a
2     5     b
3     4     c
4     3     d
5     2     a
6     1     b

$c
  value class
1     1     e
2     2     d
3     3     c
4     4     b
5     5     a
6     6     a
7     7     b 

I want to go into each data.frame in the list and subset it by the letters a and b from the class column:

wanted_sub_class <- c("a", "b")

and then put the results in a list, with one element per class for each data.frame in my_list.

Edit - Expected output:

$a class a
    value class
       1     a
       4     a

$a class b 
    value class
       2     b
       5     b

$b class a
    value class
      6     a
      2     a

$b class b
   value class
      5     b
      1     b

$c class a
  value class
    5     a
    6     a

$c class b
  value class
     4     b
     7     b

I've tried to do it with a double loop:

result <- list()

for (i in 1:length(my_list)) {
  for (j in wanted_sub_class) {
    result[[i]] <- subset(my_list[[i]], my_list[[i]]$class == j)
  }
}

This should give me 6 list elements (as per the expected output), but it gives only 3, each containing only the class b rows.

Ideally, if it's possible, I want to keep the structure of the 3 data.frames in the list and have, within each of them, a list holding the data for class a and for class b. Otherwise, a flat list of six elements will work.

I understand loops aren't ideal, but I can't really get my head around vectorisation (e.g. using lapply). I would appreciate an answer for both a loop (if possible) and vectorisation.


Answer 1:


If we are using purrr from the tidyverse (formerly "Hadleyverse") family of packages:

library(purrr)
my_list %>% 
      map(~ .[.$class %in% wanted_sub_class,])
#$a
#   value class
#1     1     a
#2     2     b

#$b
#  value class
#1     4     a
#2     3     b

#$c
#  value class
#4     4     b
#5     5     a

Or if the output needs to have only 'a' and 'b' list elements

library(dplyr)
my_list %>%
       bind_rows %>%
       filter(class %in% wanted_sub_class) %>% 
       split(., .$class)
#$a
#  value class
#1     1     a
#3     4     a
#6     5     a

#$b
#  value class
#2     2     b
#4     3     b
#5     4     b

Update

Based on the OP's update

my_list %>%
       map(~ .[.$class %in% wanted_sub_class,]) %>%
       map(~split(.x, seq_len(nrow(.x)))) %>%
       do.call("c", .)
#$a.1
#  value class
#1     1     a

#$a.2
#  value class
#2     2     b

#$b.1
#  value class
#1     4     a

#$b.2
#  value class
#2     3     b

#$c.1
#  value class
#4     4     b

#$c.2
#  value class
#5     5     a

Or using the bind_rows approach

my_list %>%
    bind_rows %>%
    filter(class %in% wanted_sub_class) %>% 
    split(., seq_len(nrow(.)))

Update2

If we need a for loop

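result <- setNames(vector('list', length(my_list)), names(my_list))
for (i in seq_along(my_list)) {
  # keep only the wanted classes
  result[[i]] <- subset(my_list[[i]], class %in% wanted_sub_class)
  # one element per remaining row; seq_len() is safe when no rows match
  result[[i]] <- split(result[[i]], seq_len(nrow(result[[i]])))
}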

Update3

For the new output format

 my_list %>% 
     bind_rows(.id = "id")  %>%
     filter(class %in% wanted_sub_class) %>% 
     split(., list(.$id, .$class))

Or using the for loop

result <- setNames(vector('list', length(my_list)), names(my_list))
for(i in seq_along(my_list)){
  result[[i]] <- subset(my_list[[i]], class %in% wanted_sub_class)
  result[[i]] <- split(result[[i]], result[[i]]$class, drop = TRUE)
}
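
For reference, the double loop in the question returns only three elements because result[[i]] is overwritten on every pass of the inner loop, so only the last class ("b") survives. A minimal sketch of a fix, assuming a flat list of six is acceptable, is to give each (data.frame, class) pair its own key. The "a class b" naming scheme and the variable name key are illustrative choices, not something the question prescribes:

```r
# Sample data as defined in the question
my_list <- list(
  a = data.frame(value = 1:5, class = c(letters[1:3], "a", "b")),
  b = data.frame(value = 6:1, class = c(letters[1:4], "a", "b")),
  c = data.frame(value = 1:7, class = c(letters[5:1], "a", "b"))
)
wanted_sub_class <- c("a", "b")

result <- list()
for (i in seq_along(my_list)) {
  for (j in wanted_sub_class) {
    # Hypothetical key such as "a class b": one slot per (data.frame, class) pair,
    # so the inner loop no longer overwrites the previous class
    key <- paste(names(my_list)[i], "class", j)
    df <- my_list[[i]]
    result[[key]] <- df[df$class == j, ]
  }
}

length(result)  # 6
names(result)
```

Indexing by a name rather than by i is what keeps both classes; the row selection itself is the same as in the original attempt.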



Answer 2:


"I want to go in to each list and subset them by letters a and b from the class column"

Should you want to subset your list of data.frames by class you could simply do:

lapply(my_list, function(x) { subset(x, class %in% c("a", "b")) }) 

Which gives:

#$a
#  value class
#1     1     a
#2     2     b
#
#$b
#  value class
#1     4     a
#2     3     b
# 
#$c
#  value class
#4     4     b
#5     5     a

Update: After re-reading your question, from what I understand, you would prefer to reshape your actual list by class:

"Ideally, however, I want to put the results in a list of my_list per class but I don't know how to do this in a loop."

You could try:

library(dplyr)

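# reshape2::melt() has a method for plain lists; data.table::melt() only
# forwarded non-data.table input to it
reshape2::melt(my_list) %>%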
  filter(class %in% c("a", "b")) %>%
  select(class, value) %>%
  split(as.character(.$class))

Which gives:

#$a
#  class value
#1     a     1
#3     a     4
#6     a     5
#
#$b
#  class value
#2     b     2
#4     b     3
#5     b     4

As mentioned by @Sumedeh (in a now deleted comment), you could also use purrr:

library(purrr)
my_list %>% 
  map_df(function(x) x[x$class %in% c("a", "b"), ]) %>% 
  split(.$class)

Which gives:

#$a
#  value class
#1     1     a
#3     4     a
#6     5     a

#$b
#  value class
#2     2     b
#4     3     b
#5     4     b
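
If the goal is to keep the three data.frames as the outer structure, with one element per class inside each, a base R sketch without any packages could look like the following. The names nested and flat are illustrative, and the sample data is the one defined in the question:

```r
# Sample data as defined in the question
my_list <- list(
  a = data.frame(value = 1:5, class = c(letters[1:3], "a", "b")),
  b = data.frame(value = 6:1, class = c(letters[1:4], "a", "b")),
  c = data.frame(value = 1:7, class = c(letters[5:1], "a", "b"))
)
wanted_sub_class <- c("a", "b")

# Nested result: one element per data.frame, each holding one data.frame per class
nested <- lapply(my_list, function(df) {
  keep <- df[df$class %in% wanted_sub_class, ]
  # Restricting the factor levels fixes the order and drops unwanted classes
  split(keep, factor(keep$class, levels = wanted_sub_class))
})

# Flat alternative: six data.frames named "a.a", "a.b", "b.a", ...
flat <- unlist(nested, recursive = FALSE)

nested$b$a
names(flat)
```

Here nested$b$a holds the class a rows of data.frame b, and recursive = FALSE stops unlist from also flattening the data.frames themselves.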


Source: https://stackoverflow.com/questions/38954154/using-a-loop-or-vectorisation-to-subset-a-list-by-multiple-elements-in-a-vecto
