Chaining dataframes in a list

Submitted by 与世无争的帅哥 on 2019-12-12 03:39:26

Question


I have a list of data.frames; an example can be found in example.data below.

## Use `=` inside list() so the elements are named; `<-` would instead
## create stage1, stage2 and stage3 as separate global variables.
example.data <- list(
  stage1 = data.frame(stuff=c("Apples","Oranges","Bananas"),
                      Prop1=c(1,2,3),
                      Prop2=c(3,2,1),
                      Wt=c(1,2,3)),
  stage2 = data.frame(stuff=c("Bananas","Mango","Cherry","Quince","Gooseberry"),
                      Prop1=c(8,9,10,1,2),
                      Prop2=c(23,32,55,5,4),
                      Wt=c(45,23,56,99,2)),
  stage3 = data.frame(stuff=c("Gooseberry","Bread","Grapes","Butter"),
                      Prop1=c(9,8,9,10),
                      Prop2=c(34,45,67,88),
                      Wt=c(24,56,31,84))
)

The data.frames will always have the same number of columns, but the number of rows will vary, as will the number of data.frames in the list. Notice the chain through the list: Apples lead to Bananas, Bananas lead to Gooseberry, and Gooseberry leads to Butter. That is, each consecutive pair of data.frames has a common element.
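As a minimal sketch of that chain, the item shared by each consecutive pair can be found with intersect(). The `items` list here is a hypothetical stand-in holding only the stuff columns copied from example.data, and `links` is an illustrative name; the sketch assumes each pair shares exactly one item.

```r
# Item names only, copied from example.data; the weights are not needed here.
items <- list(
  stage1 = c("Apples", "Oranges", "Bananas"),
  stage2 = c("Bananas", "Mango", "Cherry", "Quince", "Gooseberry"),
  stage3 = c("Gooseberry", "Bread", "Grapes", "Butter")
)

# Pair each stage with the one after it and keep the item they share.
links <- mapply(intersect, items[-length(items)], items[-1])
links
# stage1 -> "Bananas", stage2 -> "Gooseberry"
```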

I want to scale up the weights throughout the whole list as follows. First, I need to input my final weight, say 20e3. Second, I need a scale factor for the last data.frame, based on its last row and last column (the Wt of Butter): in this particular case it is 20e3/84. I want to use this scale factor later to create new columns in the last data.frame.

Next, I want to scale between the last data.frame and the previous one. Using the scale factor just calculated, the scale factor for stage2 is (24*20e3/84) / 2: the stage3 Gooseberry weight multiplied by the stage3 scale factor, then divided by the stage2 Gooseberry weight. This process is repeated (via Bananas) to give the stage1 scale factor.

In this particular example the scale factors should be approximately 42858.0, 2857.2 and 238.1 for stage1, stage2 and stage3 respectively.
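The arithmetic can be checked by hand with the weights from example.data. This is only a worked sketch; the names `final_weight`, `sf1`, `sf2` and `sf3` are illustrative. At full precision the first two factors come out as 42857.1 and 2857.1; the figures 42858.0 and 2857.2 arise if the stage3 factor is rounded to 238.1 before carrying on.

```r
final_weight <- 20e3

sf3 <- final_weight / 84   # stage3: the last row (Butter) weighs 84
sf2 <- (24 * sf3) / 2      # Gooseberry weighs 24 in stage3 and 2 in stage2
sf1 <- (45 * sf2) / 3      # Bananas weigh 45 in stage2 and 3 in stage1

round(c(stage1 = sf1, stage2 = sf2, stage3 = sf3), 1)
#  stage1  stage2  stage3
# 42857.1  2857.1   238.1
```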

I tried a for loop running backwards over the list, with appropriate subsetting after extracting the coordinates of the last element of each data.frame. This failed because the for loop was out of sync. I'm loath to post what I've tried in case I lead anyone astray.

Not getting many responses, so here's what I've done so far:

last.element <- function(a.list) {

  ## Returns the coordinates of the last element (last row, last column)
  ## of the last data.frame in a list of data.frames.

  a <- length(a.list)     ## index of the last data.frame in the list
  x <- nrow(a.list[[a]])  ## its last row
  y <- ncol(a.list[[a]])  ## its last column

  c(a, x, y)
}

details  <- as.data.frame(matrix(,nrow=length(example.data),ncol=3))

for (i in length(example.data):1) {
  details[i,1:3]  <- last.element(example.data[1:i])
}

The function gives the coordinates of the last element of each data.frame down the list. I've stored these in a data.frame (details), which I also want to populate with the scale factors. Next,

details[,4] <- 1

for (i in length(example.data):1) {

  details[i,4]  <- 20e3 / as.numeric(example.data[[i]][as.matrix(details[i,2:3])])

}

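I added an extra column to the details data.frame to hold the scale-up factors. But the for loop only gives me the correct scale factor for the last data.frame; the other rows are just 20e3 divided by each data.frame's own last weight: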

> details
  V1 V2 V3         V4
1  1  3  4  6666.6667
2  2  5  4 10000.0000
3  3  4  4   238.0952

If I multiply 238.0952 by 84 it will give me 20000.

But the scale factor for the second data.frame should be (24 * 238.0952) / 2. That is, all the weights in the third data.frame are multiplied by its scale factor, and a new scale factor is derived by dividing the scaled-up Gooseberry value in the third data.frame by the Gooseberry value in the second data.frame. The scale factor for the first data.frame is found in a similar manner.
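One way the backward chaining described here might be sketched is shown below. It is an illustration under stated assumptions, not a definitive solution: `chain_scale_factors` is a hypothetical helper, `frames` is a trimmed copy of example.data (stuff and Wt columns only), and each consecutive pair of data.frames is assumed to share exactly one item.

```r
# Hypothetical helper: walk the list from the last data.frame back to the first.
chain_scale_factors <- function(frames, final_weight) {
  n <- length(frames)
  factors <- numeric(n)

  # Last stage: final weight divided by the weight in its last row.
  last <- frames[[n]]
  factors[n] <- final_weight / last$Wt[nrow(last)]

  for (i in rev(seq_len(n - 1))) {
    prev <- frames[[i]]
    nxt  <- frames[[i + 1]]

    # Assumption: exactly one item is shared by each consecutive pair.
    link <- intersect(as.character(prev$stuff), as.character(nxt$stuff))
    stopifnot(length(link) == 1)

    # Scale the shared item's weight in the later stage, then divide by
    # its weight in the earlier stage to get the earlier stage's factor.
    scaled <- nxt$Wt[nxt$stuff == link] * factors[i + 1]
    factors[i] <- scaled / prev$Wt[prev$stuff == link]
  }
  factors
}

# Trimmed copy of example.data: only the columns the calculation uses.
frames <- list(
  stage1 = data.frame(stuff = c("Apples", "Oranges", "Bananas"),
                      Wt = c(1, 2, 3)),
  stage2 = data.frame(stuff = c("Bananas", "Mango", "Cherry", "Quince", "Gooseberry"),
                      Wt = c(45, 23, 56, 99, 2)),
  stage3 = data.frame(stuff = c("Gooseberry", "Bread", "Grapes", "Butter"),
                      Wt = c(24, 56, 31, 84))
)

scale.factors <- chain_scale_factors(frames, final_weight = 20e3)
round(scale.factors, 1)
# 42857.1  2857.1   238.1
```

Matching on the item name rather than on row position means the shared item does not have to sit in the last row of one data.frame and the first row of the next, although it happens to in this example.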

Source: https://stackoverflow.com/questions/44034738/chaining-dataframes-in-a-list
