Question
Hi, I am currently assigning labels to the columns of my data frame manually, as shown below. I have 800 columns to label. After labeling, I create subsets of the data frame (there are many subsets) and then pass each data frame to a function for calculation.
The labels can differ between chunks, and creating them one by one for every chunk is very time-consuming.
data<-data.frame( col1=c(1,1,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,1,1,1,NA,1,1,NA,NA,NA,NA,1,NA,NA,NA,NA,1,NA,1),
col2=c(1,1,1,1,1,NA,NA,NA,NA,1,1,1,1,1,NA,NA,NA,1,1,1,NA,1,1,1,1,1,NA,NA,NA,1,1,1,1,1,1,1,NA,NA,NA),
col3=c(1,1,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,1,1,1,NA,NA,NA,1,NA,NA,1,1,1,1,1,NA,NA,1),
col4=c(1,NA,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
col5=c(1,2,1,1,1,2,1,2,2,1,2,NA,1,1,2,2,2,1,1,1,2,NA,2,1,1,1,2,2,2,NA,1,2,2,1,1,1,2,2,2)
)
data$col5<-factor(data$col5, levels=c(1,2), labels=c("Local","Overseas"))
df<- data
df$cc1<-1
df2<- subset(df, col5 == 'Local')
df$cc2<-ifelse(df$col5 == 'Local',1,NA)
lst<-list(df$cc1, df$cc2)
ldat<-list("ALL" = df, "Local" =df2)
col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")
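library(dplyr)
library(purrr)
library(rlang)
# 'faclist' is a named list: names are column names, values are the labels
# build the text of one factor() call per entry of faclist
make_mutator <- function(x) {
  paste0(
    "factor(", names(faclist)[[x]],
    ",labels=c('",
    paste0(faclist[[x]], collapse = "','"),
    "'))"
  )
}
list_of_fac <- purrr::map_chr(seq_len(length(faclist)), make_mutator)
names(list_of_fac) <- names(faclist)
# parse the strings and splice them into mutate() for every data frame
ldat <- purrr::map(ldat, ~ mutate(., !!!parse_exprs(list_of_fac)))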
This works fine for me, but I would like a solution where I supply the column names and the labels separately, like this:
col_names <- c("col1","col2"...."col4")
labels <- c("Sales","Ops"...."HR")
How can I change my function to do this?
Answer 1:
Instead of parsing expressions, an easier option is to loop over the list with map and call map2 inside it. With map2, we pass the columns of interest together with the labels to apply, both taken from the named list 'faclist'.
library(dplyr)
library(purrr)
ldat1 <- map(ldat, ~ {
  # all_of() makes the selection by an external character vector explicit
  .x[names(faclist)] <- map2(
    .x %>% dplyr::select(all_of(names(faclist))),
    faclist,
    ~ factor(.x, labels = .y)
  )
  .x
})
Output:
str(ldat1[[1]])
#'data.frame': 39 obs. of 7 variables:
# $ col1: Factor w/ 1 level "Sales": 1 1 NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 1 NA NA NA NA 1 ...
# $ col3: Factor w/ 1 level "Management": 1 1 NA NA NA NA NA 1 NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA NA ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 2 1 1 1 2 1 2 2 1 ...
# $ cc1 : num 1 1 1 1 1 1 1 1 1 1 ...
# $ cc2 : num 1 NA 1 1 1 NA 1 NA NA 1 ...
str(ldat1[[2]])
#'data.frame': 18 obs. of 6 variables:
# $ col1: Factor w/ 1 level "Sales": 1 NA NA NA NA NA NA NA 1 NA ...
# $ col2: Factor w/ 1 level "OPS": 1 1 1 1 NA 1 1 1 1 1 ...
# $ col3: Factor w/ 1 level "Management": 1 NA NA NA NA NA NA NA NA NA ...
# $ col4: Factor w/ 1 level "HR": 1 NA NA NA NA NA NA NA NA 1 ...
# $ col5: Factor w/ 2 levels "Local","Overseas": 1 1 1 1 1 1 1 1 1 1 ...
# $ cc1 : num 1 1 1 1 1 1 1 1 1 1 ...
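Since 'faclist' itself is not shown in this thread, here is a minimal self-contained sketch of the same map/map2 pattern on toy data. The objects toy, ldat_toy and faclist_toy are hypothetical stand-ins, not the original data, and the sketch uses plain `[` indexing instead of dplyr::select to keep the dependencies down to purrr.

```r
library(purrr)

# Hypothetical stand-ins for the question's objects (not the original data)
toy <- data.frame(col1 = c(1, NA, 1), col2 = c(NA, 1, 1))
ldat_toy <- list(ALL = toy, Sub = toy[1:2, ])
faclist_toy <- list(col1 = "Sales", col2 = "Ops")

ldat_out <- map(ldat_toy, function(d) {
  # pair each selected column with its label and convert it to a factor
  d[names(faclist_toy)] <- map2(
    d[names(faclist_toy)], faclist_toy,
    function(col, lab) factor(col, labels = lab)
  )
  d
})

str(ldat_out$ALL)
```

The outer map visits each data frame in the list, and the inner map2 walks the columns and the labels in parallel, so the label at position i is applied to column i.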
If the labels are given as two vectors rather than a named list, replace names(faclist) with the 'col_names' vector and the list 'faclist' with the 'labels' vector:
ldat1 <- map(ldat, ~ {
  # all_of() makes the selection by an external character vector explicit
  .x[col_names] <- map2(
    .x %>% dplyr::select(all_of(col_names)),
    labels,
    ~ factor(.x, labels = .y)
  )
  .x
})
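The two forms are interchangeable, which may help when some code already expects a named list. The following is a small sketch, assuming one label per column; col_names_toy and labels_toy are hypothetical example vectors, since the question's own vectors are elided.

```r
# Hypothetical example vectors; the question's own vectors are elided
col_names_toy <- c("col1", "col2")
labels_toy <- c("Sales", "Ops")

# map2 pairs elements by position, so both vectors must have the same length
stopifnot(length(col_names_toy) == length(labels_toy))

# Named list equivalent to the 'faclist' form: name = column, value = label
faclist_toy <- setNames(as.list(labels_toy), col_names_toy)
str(faclist_toy)
```

A plain character vector can only carry one label per column. If a column needs several labels (as col5 does with "Local" and "Overseas"), the list form is the one that can hold a vector of labels per column.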
Source: https://stackoverflow.com/questions/64274389/converting-columns-of-list-of-data-frame-to-factor