List all factor levels of a data.frame

Submitted by ぃ、小莉子 on 2019-12-03 11:06:44

Question


With str(data) I get the head of the levels (1-2 values):

fac1: Factor w/ 2  levels ... :
fac2: Factor w/ 5  levels ... :
fac3: Factor w/ 20 levels ... :
val: num ...

With dplyr::glimpse(data) I get more values, but no information about the number or values of the factor levels. Is there an automatic way to get all level information for all factor variables in a data.frame? A short form with more info for

levels(data$fac1)
levels(data$fac2)
levels(data$fac3)

or, more precisely, an elegant version of something like

for (n in names(data))
  if (is.factor(data[[n]])) {
    print(n)
    print(levels(data[[n]]))
  }

thx Christof


Answer 1:


Here are some options. We loop through the columns of 'data' with sapply and get the levels of each one. This assumes that all columns are factors; levels() returns NULL for a column that is not a factor.

sapply(data, levels)
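
When the data.frame mixes factor and non-factor columns, as in the question, one possible refinement is to select the factor columns first. The following is a minimal base R sketch; the example data and the name factor_levels are made up for illustration and are not part of the original answer.

```r
# Hypothetical example data; column names mirror the str() output in the question.
data <- data.frame(
  fac1 = factor(c("a", "b", "a")),
  fac2 = factor(c("x", "y", "z")),
  val  = c(1.5, 2.5, 3.5)
)

# Keep only the factor columns, then collect the levels of each one.
# lapply always returns a named list, even when the level counts differ.
factor_levels <- lapply(Filter(is.factor, data), levels)
str(factor_levels)
```

Using lapply rather than sapply avoids the automatic simplification to a matrix that sapply performs when every factor happens to have the same number of levels.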

Or if we need to pipe (%>%) it, this can be done as

library(dplyr)
data %>% 
     sapply(levels)

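Another option is summarise_each from dplyr, with levels wrapped in list() inside funs(). Both summarise_each and funs are deprecated in recent dplyr releases in favour of summarise with across.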

 data %>%
      summarise_each(funs(list(levels(.))))
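
For current dplyr versions, the same idea can be sketched with summarise and across. This assumes dplyr 1.0.0 or later and uses the built-in warpbreaks dataset as stand-in data; the name level_summary is illustrative only.

```r
library(dplyr)

# warpbreaks ships with R: one numeric column (breaks) and two factors (wool, tension).
level_summary <- warpbreaks %>%
  summarise(across(where(is.factor), ~ list(levels(.x))))

# Each factor column becomes a one-row list column holding its levels.
level_summary$wool[[1]]
level_summary$tension[[1]]
```

Selecting with where(is.factor) also drops the non-factor columns, so the result only contains the factor variables.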



Answer 2:


Another approach is to use the sqldf package with a SELECT DISTINCT statement. This returns the distinct values of a column as a data frame, which can then be reused as levels for other columns or variables. Note that it lists the values that occur in the data, so unused factor levels are not reported.

Generic code snippet is:

library(sqldf)
array_name <- sqldf("select DISTINCT *colname1* as '*column_title*' from *table_name*")

Sample code using iris dataset:

df1 = iris
factor1 <- sqldf("select distinct Species as 'flower_type' from df1")
factor1    ## to print the names of factors

Output:

  flower_type
1      setosa
2  versicolor
3   virginica



Answer 3:


If your problem is specifically to output the levels of a single factor, a simple solution is unique(), which returns the distinct values that occur in the vector:

unique(df$x)

For instance, for the infamous iris dataset:

unique(iris$Species)
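
Note that unique() and levels() can disagree when a factor carries unused levels, for example after subsetting. A small sketch of the difference, using iris (the variable name subset_species is illustrative):

```r
# Drop all "setosa" rows; the level itself stays attached to the factor.
subset_species <- iris$Species[iris$Species != "setosa"]

# Values that actually occur in the subset.
as.character(unique(subset_species))

# All declared levels, including the now unused "setosa".
levels(subset_species)

# Levels after explicitly dropping unused ones.
levels(droplevels(subset_species))
```

Which of the two is wanted depends on whether the declared levels or the observed values matter for the task at hand.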




Answer 4:


Or using purrr:

data %>% purrr::map(levels)

Or to first factorize everything:

data %>% dplyr::mutate_all(as.factor) %>% purrr::map(levels)

And to answer the part of the question about the number of levels in each column:

data %>% purrr::map(levels) %>% purrr::map(length)
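
A more compact variant, sketched here under the assumption that purrr is installed, keeps only the factor columns and counts their levels directly with base R's nlevels(). The warpbreaks dataset and the name level_counts are used purely for illustration.

```r
# Keep the factor columns of warpbreaks, then count the levels of each one.
# map_int returns a named integer vector instead of a list.
level_counts <- purrr::map_int(purrr::keep(warpbreaks, is.factor), nlevels)
level_counts
```

Filtering with keep(is.factor) first avoids reporting a count of 0 for non-factor columns, where levels() would return NULL.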


Source: https://stackoverflow.com/questions/27676404/list-all-factor-levels-of-a-data-frame
