Question
I need to convert many columns that are numeric to factor type. An example table:
df <- data.frame(A=1:10, B=2:11, C=3:12)
I tried with apply:
cols<-c('A', 'B')
df[,cols]<-apply(df[,cols], 2, function(x){ as.factor(x)});
But the resulting columns are of class character, not factor.
> class(df$A)
[1] "character"
How can I do this without doing as.factor for each column?
Answer 1:
Try
df[,cols] <- lapply(df[,cols],as.factor)
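The problem is that apply() tries to bind the results into a matrix, and a matrix can hold only one atomic type, so the factor columns are coerced to character:
class(apply(df[,cols], 2, as.factor)) ## "matrix" "array" (just "matrix" before R 4.0.0)
class(as.factor(df[,1])) ## factor
In contrast, lapply() operates on the elements of a list (a data frame is a list of columns) and returns a list, so each column keeps its factor class.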
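The difference can be checked directly. The following is a minimal sketch using the question's example data; the variable name m is only illustrative.

```r
df <- data.frame(A = 1:10, B = 2:11, C = 3:12)
cols <- c("A", "B")

# apply() converts the data frame to a matrix and returns a matrix,
# so the factor class of each result is dropped
m <- apply(df[, cols], 2, as.factor)
is.matrix(m)
typeof(m)

# lapply() returns a list with one factor per column, which can be
# assigned straight back into the data frame
df[cols] <- lapply(df[cols], as.factor)
sapply(df, class)
```

Indexing with df[cols] instead of df[, cols] treats the data frame as a list of columns, which also avoids the result collapsing to a vector when cols holds a single name.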
Answer 2:
Updated Nov 9, 2017
purrr / purrrlyr are still in development
Similar to Ben's, but using purrrlyr::dmap_at:
library(purrrlyr)
df <- data.frame(A=1:10, B=2:11, C=3:12)
# selected cols to factor
cols <- c('A', 'B')
(dmap_at(df, factor, .at = cols))
A B C
<fctr> <fctr> <int>
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 10
9 10 11
10 11 12
Answer 3:
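You can place your results back into a data frame, which will convert the character columns to factors. Since R 4.0.0 the default is stringsAsFactors = FALSE, so it has to be requested explicitly:
df[,cols] <- data.frame(apply(df[,cols], 2, as.factor), stringsAsFactors = TRUE)
Note that the factors are then built from character strings, so the levels are sorted as text ("1", "10", "2", ...) rather than numerically.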
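One side effect of going through apply() is worth checking: the values pass through a character matrix before they become factors, which changes the level order. This is a small sketch comparing the two routes; the names via_apply and via_lapply are only illustrative.

```r
df <- data.frame(A = 1:10, B = 2:11, C = 3:12)
cols <- c("A", "B")

# Route 1: apply() yields a character matrix, which data.frame() then
# turns into factors (stringsAsFactors must be TRUE on R >= 4.0.0)
via_apply <- data.frame(apply(df[, cols], 2, as.factor),
                        stringsAsFactors = TRUE)

# Route 2: column-wise conversion of the original integer columns
via_lapply <- lapply(df[cols], as.factor)

levels(via_apply$A)   # levels sorted as text, so "10" comes before "2"
levels(via_lapply$A)  # levels sorted numerically, "1" through "10"
```

If the level order matters later on (for plotting or modelling, say), the column-wise route is the safer choice.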
Answer 4:
Another option uses purrr and dplyr. It is perhaps a little more readable than the base solutions, and it keeps the data in a data frame.
Here's the data:
df <- data.frame(A=1:10, B=2:11, C=3:12)
str(df)
'data.frame': 10 obs. of 3 variables:
$ A: int 1 2 3 4 5 6 7 8 9 10
$ B: int 2 3 4 5 6 7 8 9 10 11
$ C: int 3 4 5 6 7 8 9 10 11 12
We can easily operate on all columns with dmap:
library(purrr) # dmap() has since moved from purrr to purrrlyr; on current versions load purrrlyr instead
library(dplyr)
# all cols to factor
dmap(df, as.factor)
Source: local data frame [10 x 3]
A B C
(fctr) (fctr) (fctr)
1 1 2 3
2 2 3 4
3 3 4 5
4 4 5 6
5 5 6 7
6 6 7 8
7 7 8 9
8 8 9 10
9 9 10 11
10 10 11 12
And similarly use dmap on a subset of columns using select from dplyr:
# selected cols to factor
cols <- c('A', 'B')
df[,cols] <-
df %>%
select(one_of(cols)) %>%
dmap(as.factor)
To get the desired result:
str(df)
'data.frame': 10 obs. of 3 variables:
$ A: Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10
$ B: Factor w/ 10 levels "2","3","4","5",..: 1 2 3 4 5 6 7 8 9 10
$ C: int 3 4 5 6 7 8 9 10 11 12
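On a current dplyr (1.0.0 or later), the same subset conversion can be sketched without dmap by using mutate() with across(). This is not part of the original answer, only an alternative under the assumption that a recent dplyr is installed.

```r
library(dplyr)

df <- data.frame(A = 1:10, B = 2:11, C = 3:12)
cols <- c("A", "B")

# across() applies as.factor to the columns named in cols; all_of() makes
# the selection strict, so a name missing from df raises an error
df <- df %>% mutate(across(all_of(cols), as.factor))

str(df)
```

Here the whole data frame is returned, so there is no need to assign back into df[,cols].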
Answer 5:
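A simple but effective option would be mapply:
df <- data.frame(A=1:10, B=2:11, C=3:12)
cols <- c('A', 'B')
# SIMPLIFY = FALSE keeps a list of factors; the default simplifies to a character matrix
df[,cols] <- as.data.frame(mapply(as.factor, df[,cols], SIMPLIFY = FALSE))
You can also use a for loop to achieve the same result: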
for(col in cols){
df[,col] <- as.factor(df[,col])
}
Source: https://stackoverflow.com/questions/34124444/r-apply-convert-many-columns-from-numeric-to-factor