Converting columns of a list of data frames to factors

Submitted by 巧了我就是萌 on 2020-12-06 18:40:53

Question


Hi, I am labeling the columns of my data frame manually, as shown below. I have 800 columns to label. After that I create subsets of the data frame (there are many subsets) and pass each one to a function for calculation.

The labels can differ between chunks, and creating labels one by one for every chunk is very time-consuming.

data<-data.frame( col1=c(1,1,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,1,1,1,NA,1,1,NA,NA,NA,NA,1,NA,NA,NA,NA,1,NA,1),
                  col2=c(1,1,1,1,1,NA,NA,NA,NA,1,1,1,1,1,NA,NA,NA,1,1,1,NA,1,1,1,1,1,NA,NA,NA,1,1,1,1,1,1,1,NA,NA,NA),
                  col3=c(1,1,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,1,1,1,NA,NA,NA,1,NA,NA,1,1,1,1,1,NA,NA,1),
                  col4=c(1,NA,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
                  col5=c(1,2,1,1,1,2,1,2,2,1,2,NA,1,1,2,2,2,1,1,1,2,NA,2,1,1,1,2,2,2,NA,1,2,2,1,1,1,2,2,2)
)  

data$col5<-factor(data$col5, levels=c(1,2), labels=c("Local","Overseas"))

df<- data
df$cc1<-1
df2<- subset(df, col5 == 'Local')
df$cc2<-ifelse(df$col5 == 'Local',1,NA)
lst<-list(df$cc1, df$cc2)
ldat<-list("ALL" = df, "Local" =df2)

col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")

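library(dplyr)  # mutate()
library(rlang)  # parse_exprs()

# 'faclist' is a named list: names are column names, values are the labels.
# Build the text of one call, e.g. "factor(col1,labels=c('Sales'))".
make_mutator <- function(x) {
  paste0(
    "factor(", names(faclist)[[x]],
    ",labels=c('",
    paste0(faclist[[x]], collapse = "','"),
    "'))"
  )
}

list_of_fac <- purrr::map_chr(seq_along(faclist), make_mutator)
names(list_of_fac) <- names(faclist)

# Parse the strings into expressions and splice them into mutate()
# for every data frame in the list.
ldat <- purrr::map(ldat, ~ mutate(.x, !!!parse_exprs(list_of_fac)))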

This works fine for me, but I would like a solution where I supply the columns and the labels separately, as two vectors:

col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")

How can I change my function to handle this?


Answer 1:


Instead of parsing, an easier option is to loop over the list with map and use map2 inside it. With map2, we pass the columns of interest together with the labels to apply, both taken from the named list 'faclist'.

library(dplyr)
library(purrr)
# For each data frame, pair every column named in 'faclist' with its labels.
ldat1 <- map(ldat, ~ {
  .x[names(faclist)] <- map2(
    .x %>% dplyr::select(all_of(names(faclist))),
    faclist,
    ~ factor(.x, labels = .y)
  )
  .x
})

Output:

str(ldat1[[1]])
#'data.frame':  39 obs. of  7 variables:
# $ col1: Factor w/ 1 level "Sales": 1 1 NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 1 NA NA NA NA 1 ...
# $ col3: Factor w/ 1 level "Management": 1 1 NA NA NA NA NA 1 NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA NA ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 2 1 1 1 2 1 2 2 1 ...
# $ cc1 : num  1 1 1 1 1 1 1 1 1 1 ...
# $ cc2 : num  1 NA 1 1 1 NA 1 NA NA 1 ...
str(ldat1[[2]])
#'data.frame':  18 obs. of  6 variables:
# $ col1: Factor w/ 1 level "Sales": 1 NA NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 NA 1 1 1 1 1 ...
# $ col3: Factor w/ 1 level "Management": 1 NA NA NA NA NA NA NA NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA 1 ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 1 1 1 1 1 1 1 1 1 ...
# $ cc1 : num  1 1 1 1 1 1 1 1 1 1 ...
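
The "Factor w/ 1 level" lines above follow from how base R's factor() treats NA: it is not counted as a level, so a column holding only 1 and NA gets a single level, which takes the supplied label. A minimal sketch of that behavior, using made-up values (the vector 'x' and the label "Sales" here are illustrative assumptions, not the asker's data):

```r
# Illustrative values only; 'x' and the label "Sales" are assumptions.
x <- c(1, 1, NA, NA)
f <- factor(x, labels = "Sales")
levels(f)      # the single level is relabeled
is.na(f)       # NA entries remain NA

# A single label applied to a column with two distinct values is
# expanded by factor() into numbered labels.
g <- factor(c(1, 2, 1), labels = "Sales")
levels(g)
```

If a column can hold more than one distinct value, it is probably safer to pass one label per level (and an explicit levels argument) than to rely on the numbered expansion.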

If the inputs are two vectors rather than a named list, replace names(faclist) with the 'col_names' vector and the list 'faclist' with the 'labels' vector:

ldat1 <- map(ldat, ~ {
  .x[col_names] <- map2(
    .x %>% dplyr::select(all_of(col_names)),
    labels,
    ~ factor(.x, labels = .y)
  )
  .x
})
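
For readers who prefer to avoid the purrr dependency, the same two-vector pattern can be sketched in base R, with lapply() standing in for map and Map() for map2. This is an equivalent written for illustration, not the answerer's code; the toy data frame, 'toy_list', and the two labels are assumptions chosen to keep the example small.

```r
# Toy data; the column values and labels here are illustrative assumptions.
toy <- data.frame(col1 = c(1, NA, 1), col2 = c(NA, 1, 1), col5 = c(1, 2, 1))
toy_list <- list(ALL = toy, Local = toy[toy$col5 == 1, ])

col_names <- c("col1", "col2")
labels <- c("Sales", "Ops")

# lapply() walks the list of data frames;
# Map() pairs each selected column with its label.
toy_list1 <- lapply(toy_list, function(d) {
  d[col_names] <- Map(function(col, lab) factor(col, labels = lab),
                      d[col_names], labels)
  d
})

str(toy_list1$ALL)
```

Columns not listed in 'col_names' (here col5) are left untouched, and the order of 'labels' has to match the order of 'col_names', since the two vectors are paired by position.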


Source: https://stackoverflow.com/questions/64274389/converting-columns-of-list-of-data-frame-to-factor
