How to do a complex edit of columns of all data frames in a list?

南笙酒味 posted on 2019-12-24 08:52:10

Question


I have a list of 185 data frames called WaFramesNumeric. Each data frame has several hundred columns and thousands of rows. I want to edit every data frame so that it keeps all of its numeric columns plus any non-numeric columns that I specify.

Using:

for(i in seq_along(WaFramesNumeric)) {
    WaFramesNumeric[[i]] <- WaFramesNumeric[[i]][,sapply(WaFramesNumeric[[i]],is.numeric)] 
}

successfully makes each dataframe contain only its numeric columns.

I've tried to amend this with lines to add specific columns. I have tried:

for (i in seq_along(WaFramesNumeric)) {
    a <- WaFramesNumeric[[i]]$Device_Name
    WaFramesNumeric[[i]] <- WaFramesNumeric[[i]][,sapply(WaFramesNumeric[[i]],is.numeric)] 
    cbind(WaFramesNumeric[[i]],a)
}

and in an attempt to call the column numbers of all integer columns as well as the specific ones and then combine based on that:

for (i in seq_along(WaFramesNumeric)) {
    f <- which(sapply(WaFramesNumeric[[i]],is.numeric))
    m <- match("Cost_Center",colnames(WaFramesNumeric[[i]]))
    n <- match("Device_Name",colnames(WaFramesNumeric[[i]]))
    combine <- c(f,m,n)
    WaFramesNumeric[[i]][,i,combine]
}

These all return errors and I am stumped as to how I could do this. WaFramesNumeric is a copy of another list of dataframes (WaFramesNumeric <- WaFramesAll) and so I also tried adding the specific columns from the WaFramesAll but this was not successful.

I appreciate any advice you can give and I apologize if any of this is unclear.


Answer 1:


You are mistakenly assuming that the value of the last command in a for loop is meaningful. It is not. Since you never assigned the result anywhere (the cbind and the indexing of WaFramesNumeric...), it is silently discarded.
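To make the point about discarded results concrete, here is a minimal sketch on a made-up two-frame list. The names `frames`, `unassigned`, `assigned` and `numeric_part` are illustrative only and do not come from the question.

```r
# Toy list of two small data frames (hypothetical data, for illustration only)
frames <- list(
  data.frame(x = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE),
  data.frame(x = 4:6, name = c("d", "e", "f"), stringsAsFactors = FALSE)
)

# Without assignment: the cbind() result is computed and then thrown away
unassigned <- frames
for (i in seq_along(unassigned)) {
  a <- unassigned[[i]]$name
  unassigned[[i]] <- unassigned[[i]][, sapply(unassigned[[i]], is.numeric), drop = FALSE]
  cbind(unassigned[[i]], a)
}
names(unassigned[[1]])  # "x"

# With assignment: the combined frame is stored back into the list
assigned <- frames
for (i in seq_along(assigned)) {
  a <- assigned[[i]]$name
  numeric_part <- assigned[[i]][, sapply(assigned[[i]], is.numeric), drop = FALSE]
  assigned[[i]] <- cbind(numeric_part, name = a)
}
names(assigned[[1]])  # "x" "name"
```

The `drop = FALSE` argument is an addition here: it keeps the numeric subset a data frame even when a frame has only one numeric column.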

Additionally, you are over-indexing your data.frame in the third code block. First, it's using i within the data.frame, even though i is an index within the list of data.frames, not the frame itself. Second (perhaps caused by this), you are trying to index three dimensions of a 2D frame. Just change the last indexing from [,i,combine] to either [,combine] or [combine].

The third problem (though perhaps not seen yet) is that match returns NA when nothing is found. Indexing a frame with an NA raises an error (try mtcars[,NA] to see). I suggest replacing match with grep: it returns integer(0) when nothing is found, which is what you want in this case.

for (i in seq_along(WaFramesNumeric)) {
  f <- which(sapply(WaFramesNumeric[[i]], is.numeric))
  m <- grep("Cost_Center", colnames(WaFramesNumeric[[i]]))
  n <- grep("Device_Name", colnames(WaFramesNumeric[[i]]))
  combine <- c(f,m,n)
  WaFramesNumeric[[i]] <- WaFramesNumeric[[i]][combine]
}
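The difference between match and grep can be checked on a small vector of column names. The sketch below uses hypothetical names (`cols`, `df`, `wanted`, `keep`). It also shows two caveats that are worth keeping in mind: grep matches a pattern, so it can pick up longer names that contain the one you want, and a requested column that is itself numeric would appear twice in the combined index unless duplicates are removed.

```r
cols <- c("Device_Name", "Usage", "Device_Name_Old")  # hypothetical column names

match("Cost_Center", cols)    # NA: no such column
grep("Cost_Center", cols)     # integer(0): safe to concatenate into an index

grep("Device_Name", cols)     # 1 3: the pattern also hits "Device_Name_Old"
grep("^Device_Name$", cols)   # 1: anchored pattern matches the exact name only

# Exact-name matching without regular expressions, with duplicates removed
df <- data.frame(Usage = c(1.5, 2.5), Cost_Center = c(10L, 20L),
                 Device_Name = c("p1", "p2"), Notes = c("x", "y"),
                 stringsAsFactors = FALSE)
wanted <- c("Cost_Center", "Device_Name", "Not_Present")
keep <- unique(c(which(sapply(df, is.numeric)), which(names(df) %in% wanted)))
names(df[keep])  # "Usage" "Cost_Center" "Device_Name"
```

Whether the anchored pattern or the `%in%` form is preferable depends on whether you ever want partial matches; for fixed column names the exact form is probably the safer default.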



Answer 2:


I'm not sure what you mean by "an attempt to call the column numbers of all integer columns...", but if you want to go through a list of data frames and keep the columns selected by some function, plus any columns given by name, you can do it like this:

df <- data.frame(a=rnorm(20), b=rnorm(20), c=letters[1:20], d=letters[1:20], stringsAsFactors = FALSE)
WaFramesNumeric <- rep(list(df), 2)

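Selector <- function(data, select_func, select_names) {
  select_func <- match.fun(select_func)
  # Positions of the requested names; names missing from this frame give NA and are dropped
  idx_names <- match(select_names, colnames(data))
  idx_names <- idx_names[!is.na(idx_names)]
  # Positions of the columns for which select_func returns TRUE
  idx_func <- which(sapply(data, select_func))
  idx <- unique(c(idx_func, idx_names))
  # drop = FALSE keeps a data frame even when only one column is selected
  return(data[, idx, drop = FALSE])
}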

res <- lapply(X = WaFramesNumeric, FUN = Selector, select_names=c("c"), select_func = is.numeric)
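For comparison, the same selection can be written more compactly with a logical mask. This is an alternative sketch, not the approach used in the answer above; `frames` and `keep_names` are hypothetical stand-ins for the real list and the columns to keep.

```r
# Hypothetical stand-in for the real list of data frames
df <- data.frame(a = c(0.1, 0.2), b = c(1.5, 2.5),
                 c = c("x", "y"), d = c("u", "v"),
                 stringsAsFactors = FALSE)
frames <- rep(list(df), 2)

keep_names <- c("c", "Device_Name")  # names absent from a frame are simply ignored

res <- lapply(frames, function(d) {
  # TRUE for every column that is numeric or explicitly requested by name
  keep <- sapply(d, is.numeric) | names(d) %in% keep_names
  d[keep]  # list-style indexing always returns a data frame
})

names(res[[1]])  # "a" "b" "c"
```

Because the mask is logical, a column that is both numeric and named in `keep_names` is selected only once, and missing names never produce NA indices.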


Source: https://stackoverflow.com/questions/48971076/how-to-do-a-complex-edit-of-columns-of-all-data-frames-in-a-list
