Extracting a dataframe from a list over many objects

旧城冷巷雨未停 submitted on 2021-02-19 05:59:29

Question


I have over 1,000 objects (z1 through z1000) in R, each containing three dataframes (df1, df2, df3) with different structures.

z1$df1 ... z1000$df1

z1$df2 ... z1000$df2

z1$df3 ... z1000$df3

I created a list of these objects (list1, which contains z1 through z1000) and tried to use lapply to extract one type of dataframe (df2) from every object, and then merge the results into a single dataframe.

Extraction:

For a single object it would look like this:

df15 <- z15$df2 # I transferred the index of z to the extracted df

I tried some code with lapply, ignoring the transfer of the index (I can create another list for that). However I don’t know what function I should use.

list2 <- lapply(list1, function(x))  # incomplete: the function body is the part I'm missing

I'm trying to avoid a loop because there are so many objects and vectorization is much quicker. I suspect I'm looking at this from the wrong angle.

Subsequent merging can be done as follows:

merged <- do.call(rbind, list2)

Thanks for any suggestions.


Answer 1:


One option is to use lapply to extract the data frames and then combine them with bind_rows from dplyr.

library(dplyr)  # provides bind_rows

## The data
df1 <- data.frame(id = c(1:10), name = c(LETTERS[1:10]), stringsAsFactors = FALSE)
df2 <- data.frame(id = 11:20, name = LETTERS[11:20], stringsAsFactors = FALSE)
df3 <- data.frame(id = 21:30, name = LETTERS[15:24], stringsAsFactors = FALSE)
df4 <- data.frame(id = 121:130, name = LETTERS[15:24], stringsAsFactors = FALSE)

z1 <- list(df1 = df1, df2 = df2, df3 = df3)
z2 <- list(df1 = df1, df2 = df2, df3 = df3)
z3 <- list(df1 = df1, df2 = df2, df3 = df3)
z4 <- list(df1 = df1, df2 = df2, df3 = df4) #DFs can contain different data

# z <- list(z1, z2, z3, z4)
# Dynamically populate list z by looking up the objects z1..z4 by name
# (mget already returns a named list, so as.list is not needed)
z <- mget(paste0("z", 1:4))


df1_all <- bind_rows(lapply(z, function(x) x$df1))
df2_all <- bind_rows(lapply(z, function(x) x$df2))
df3_all <- bind_rows(lapply(z, function(x) x$df3))


## Result for df3_all
> tail(df3_all)
##    id name
## 35 125    S
## 36 126    T
## 37 127    U
## 38 128    V
## 39 129    W
## 40 130    X
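
The question also mentions carrying the index of z over to the extracted data frame. One way to keep that information is the .id argument of bind_rows, which stores the list names in an extra column. The following is a minimal sketch, not part of the original answer; the toy list zs and the column name source are made up for illustration, and it assumes the list is named (as a list built with mget would be).

```r
library(dplyr)

# Hypothetical stand-ins for the real objects z1, z2, ...
zs <- list(
  z1 = list(df2 = data.frame(id = 1:2, name = c("A", "B"))),
  z2 = list(df2 = data.frame(id = 3:4, name = c("C", "D")))
)

# .id adds a column holding the list names, so every row keeps its origin
df2_all <- bind_rows(lapply(zs, function(x) x$df2), .id = "source")
```

Here df2_all$source holds "z1" or "z2" for each row, which can stand in for the df15-style naming used in the question.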



Answer 2:


It sounds like you want to pull out all the df1s and rbind them together, then do the same for the other dataframes. You can use purrr::map_dfr to extract a named element (here, a dataframe) from each element of the list and row-bind the results together.

library('tidyverse')

dummy_df <- list(
  df1 = iris,
  df2 = cars,
  df3 = CO2)

list1 <- list(
  z1 = dummy_df,
  z2 = dummy_df,
  z3 = dummy_df)

df1 <- map_dfr(list1, 'df1')
df2 <- map_dfr(list1, 'df2')
df3 <- map_dfr(list1, 'df3')
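
As a side note that is not from the original answer: in purrr 1.0.0 and later, map_dfr is marked as superseded in favour of map() followed by list_rbind(). A rough equivalent under that assumption (purrr >= 1.0.0 installed) might look like the sketch below; the column name source is an arbitrary choice for illustration.

```r
library(purrr)

# Same kind of dummy data as above, built from built-in datasets
dummy <- list(df1 = iris, df2 = cars, df3 = CO2)
list1 <- list(z1 = dummy, z2 = dummy, z3 = dummy)

# map() extracts the element named "df2" from each object;
# list_rbind() stacks the results and records the list names in `source`
df2_all <- list_rbind(map(list1, "df2"), names_to = "source")
```

The names_to argument is optional; leaving it out gives the same rows without the origin column.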

If you wanted to do it in base R, you can use lapply.

df1 <- lapply(list1, function(x) x$df1)
df1_merged <- do.call(rbind, df1)



Answer 3:


Try this:

lapply(list1, "[[", "df2")

or if you want to rbind them together:

do.call("rbind", lapply(list1, "[[", "df2"))

The row names in the resulting data frame will identify the origin of each row.

No packages are used.

Note

We can use this input to test the code above. BOD is a built-in data frame:

z <- list(df1 = BOD, df2 = BOD, df3 = BOD)
list1 <- list(z1 = z, z2 = z)
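
If the origin is more convenient as a regular column than as row names, one possible base R approach is to repeat each list name once per row of its data frame. This is only a sketch built on the test input above; the helper variable dfs and the column name source are illustrative and not part of the original answer.

```r
z <- list(df1 = BOD, df2 = BOD, df3 = BOD)
list1 <- list(z1 = z, z2 = z)

# Extract df2 from every object, keeping the list names
dfs <- lapply(list1, "[[", "df2")
merged <- do.call("rbind", dfs)

# Repeat each object name once per row of its data frame
merged$source <- rep(names(dfs), times = vapply(dfs, nrow, integer(1)))
rownames(merged) <- NULL  # drop the "z1.1"-style row names
```

This avoids parsing the generated row names and still uses no packages.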



Answer 4:


There's also data.table::rbindlist, which is likely faster than do.call(rbind, lapply(...)) or dplyr::bind_rows.

library(data.table)
rbindlist(lapply(list1, "[[", "df2"))
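
rbindlist can also record which object each row came from through its idcol argument. A small sketch, assuming a named list and reusing the BOD-based test input from Answer 3; the column name source is an arbitrary choice for illustration.

```r
library(data.table)

z <- list(df1 = BOD, df2 = BOD, df3 = BOD)
list1 <- list(z1 = z, z2 = z)

# idcol adds a column filled with the names of the list elements
df2_all <- rbindlist(lapply(list1, "[[", "df2"), idcol = "source")
```

The result is a data.table; wrap it in as.data.frame() if a plain data frame is needed downstream.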


Source: https://stackoverflow.com/questions/48238039/extracting-a-dataframe-from-a-list-over-many-objects
