Iterate through data tables

Posted by 匆匆过客 on 2020-01-07 03:45:09

Question


I have 3 tables as

tbl.1 <- data.table("A" = runif(5), "B" = runif(5))
tbl.2 <- data.table("A" = runif(5), "B" = runif(5))
tbl.3 <- data.table("A" = runif(5), "B" = runif(5))

I would like to iterate through the tables with a loop such as

for (i in 1:3) {
  # Open tbl.i
  # Do something
}

How can this be done? I can put the tables in a list and iterate through the list, which works OK. However, I am trying to keep the tables as unique objects for various reasons. Thanks.


Answer 1:


If you don't want to keep the data.tables in a list, you can refer to them by name in their environment. In this example that is the global environment. If your data.tables are created inside some other package or environment, you would need to change the envir argument accordingly.

library(data.table)
tbl.1 <- data.table("A" = runif(5), "B" = runif(5))
tbl.2 <- data.table("A" = runif(5), "B" = runif(5))
tbl.3 <- data.table("A" = runif(5), "B" = runif(5))
for (i in paste0("tbl.",1:3)) {
    # Open tbl.i: get
    # Do something: str
    str(get(i, envir = .GlobalEnv))
}
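As a minimal sketch of the "change the environment" case, assume the tables live in some environment other than the global one. The names tbl_env and row_counts are hypothetical and only serve the illustration; the same get() call works once envir points at the right place.

```r
library(data.table)

# Hypothetical environment standing in for wherever the tables are created
tbl_env <- new.env()
for (i in 1:3) {
  assign(paste0("tbl.", i),
         data.table(A = runif(5), B = runif(5)),
         envir = tbl_env)
}

# Look each table up by name in that environment and record its row count
row_counts <- integer(0)
for (nm in paste0("tbl.", 1:3)) {
  tbl <- get(nm, envir = tbl_env)
  row_counts[nm] <- nrow(tbl)
}
row_counts
```

The tables stay separate objects inside tbl_env; only the lookup goes through their names.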



Answer 2:


As others have already indicated, this doesn't seem to be the "data.table" way of doing things, and since you have not been very clear about what you are doing when you say "do something", it's hard to make a good recommendation.

That said, a for loop could be fine if your "do something" is all about assignment by reference (for instance, using set or :=).

That could be done with a simple:

tbl.1 <- data.table("A" = runif(5), "B" = runif(5))
tbl.2 <- data.table("A" = runif(5), "B" = runif(5))
tbl.3 <- data.table("A" = runif(5), "B" = runif(5))

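# Anchor the pattern so only tbl.1, tbl.2, ... are matched
x <- ls(pattern = "^tbl\\.")

for (i in seq_along(x)) {
  # := adds column C by reference, so no reassignment is needed
  get(x[i])[, C := A + B]
}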

tbl.2

If you're not dealing with something that would be solved with assignment by reference, for instance you are subsetting or summarizing your data and want to replace the original data.table, then you'll need to use get and assign. (Ugh.)

tbl.1 <- data.table("A" = runif(5), "B" = runif(5))
tbl.2 <- data.table("A" = runif(5), "B" = runif(5))
tbl.3 <- data.table("A" = runif(5), "B" = runif(5))

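# Anchor the pattern so only tbl.1, tbl.2, ... are matched
x <- ls(pattern = "^tbl\\.")

for (i in seq_along(x)) {
  # Subsetting returns a new object, so write it back under the same name
  assign(x[i], get(x[i])[1, ])
}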



Answer 3:


LDBerriz,

I believe it is possible to do what you are trying to do by looping through variable names and getting them from .GlobalEnv, which represents the workspace.

However, I suggest, as several other commenters have, that it's far easier to store your tables in a list and loop over the list than it is to loop over variables in .GlobalEnv:

tbl.1 <- data.table("A" = runif(5), "B" = runif(5))
tbl.2 <- data.table("A" = runif(5), "B" = runif(5))
tbl.3 <- data.table("A" = runif(5), "B" = runif(5))

tblList <- list(tbl.1, tbl.2, tbl.3)

for (i in seq_along(tblList)) {
  tbl <- tblList[[i]]
  # Do something with tbl.
}
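If the tables should stay as separate objects but a list is still convenient for the loop, one possible middle ground is to collect them by name with mget(). This is only a sketch: tbl_names is a hypothetical helper variable, and the loop below only reads from the tables.

```r
library(data.table)

tbl.1 <- data.table(A = runif(5), B = runif(5))
tbl.2 <- data.table(A = runif(5), B = runif(5))
tbl.3 <- data.table(A = runif(5), B = runif(5))

# mget() looks up several objects by name and returns them as a named list
tbl_names <- paste0("tbl.", 1:3)
tblList <- mget(tbl_names)

for (nm in names(tblList)) {
  # The list names tell you which original table you are looking at
  cat(nm, ":", nrow(tblList[[nm]]), "rows\n")
}
```

The original tbl.1, tbl.2, and tbl.3 remain in the workspace under their own names.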

For the sake of this answer, I assume that the tables are actually different, or that you have some other reason why they need to be separate tables. Of course, if the columns of the tables all hold the same sort of data/variables, as tbl.1, tbl.2, and tbl.3 in your example do, then you could just combine them into one table and work on that one table:

masterTbl <- rbind(tbl.1,tbl.2,tbl.3)

You could even add a column to them so you can identify which table they originally came from, should you need to:

tbl.1$from <- 1
tbl.2$from <- 2
tbl.3$from <- 3

masterTbl <- rbind(tbl.1,tbl.2,tbl.3)
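As an alternative sketch of the same idea, data.table's rbindlist() accepts an idcol argument that adds the identifying column while binding, so the individual tables do not have to be modified first. For an unnamed list the ids are the positions 1, 2, 3.

```r
library(data.table)

tbl.1 <- data.table(A = runif(5), B = runif(5))
tbl.2 <- data.table(A = runif(5), B = runif(5))
tbl.3 <- data.table(A = runif(5), B = runif(5))

# idcol = "from" records which list element each row came from
masterTbl <- rbindlist(list(tbl.1, tbl.2, tbl.3), idcol = "from")

# Count the rows contributed by each source table
masterTbl[, .N, by = from]
```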

Best, Ben.




Answer 4:


Alternatively, you can use ls() with a pattern to select the desired tables directly. I found that a little easier and more versatile. I also had the issue that the combined data.table would be too large, so I had to split it up and access the pieces separately.

 for (tbl in ls(pattern = glob2rx("tbl.*"))) {
    str(get(tbl))
 }


Source: https://stackoverflow.com/questions/34796713/iterate-through-data-tables
