I've been having this strange problem with apply lately. Consider the following example:
set.seed(42)
df <- data.frame(cars, foo = sample(LE
apply works on a matrix, and every element of a matrix must have the same type. So df is converted to a matrix first, and since it contains a character column, all the columns become character.
> apply(df, 2, class)
speed dist foo
"character" "character" "character"
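To see the coercion directly, here is a minimal sketch. The data frame in the question is truncated, so the foo column below is a hypothetical stand-in (a constant character column), not the original data.

```r
# Hypothetical stand-in for the question's data: `cars` plus one character column.
df <- data.frame(cars, foo = rep("a", nrow(cars)), stringsAsFactors = FALSE)

# This is the conversion apply() performs before calling the function.
m <- as.matrix(df)
typeof(m)                 # the whole matrix has a single storage type

# Every column of the matrix is now character, including the numeric ones.
col_classes <- apply(df, 2, class)
col_classes
```

Calling as.matrix yourself before apply is a quick way to check what the function will actually receive.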
To get what you want, check out the colwise and numcolwise functions in the plyr package.
> numcolwise(mean)(df)
speed dist
1 15.4 42.98
You are applying a function over the columns of a data.frame. Since a data.frame is a list, you can use lapply or sapply instead of apply:
sapply(df, mean)
speed dist foo
15.40 42.98 NA
Warning message:
In mean.default(X[[3L]], ...) :
argument is not numeric or logical: returning NA
And you can remove the warning message by using an anonymous function that tests for class numeric before calculating the mean:
sapply(df, function(x) if (is.numeric(x)) mean(x) else NA)
speed dist foo
15.40 42.98 NA
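If you would rather drop the non-numeric columns than get NA for them, one option is to select the numeric columns first. This is a sketch in base R only; the foo column is again a hypothetical stand-in for the truncated one in the question, and num_cols and col_means are names chosen for illustration.

```r
# Hypothetical stand-in for the question's data: `cars` plus one character column.
df <- data.frame(cars, foo = rep("a", nrow(cars)), stringsAsFactors = FALSE)

# Logical vector marking which columns are numeric.
num_cols <- vapply(df, is.numeric, logical(1))

# Take the mean of the numeric columns only; vapply guarantees a numeric vector.
col_means <- vapply(df[num_cols], mean, numeric(1))
col_means
```

vapply is used here instead of sapply because it fails loudly if a result is not a single number, which makes the return type predictable.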
The first sentence of the Details section of ?apply says:
If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.
Matrices can only be of a single type in R. When the data frame is coerced to a matrix, everything ends up as a character if there is even a single character column.
I guess I owe you a description of an alternative, so here you go. Data frames are really just lists, so if you want to apply a function to each column, use lapply or sapply instead.
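A short sketch of the difference between the two, assuming a stand-in data frame (the foo column is hypothetical, since the question's definition is truncated):

```r
# Hypothetical stand-in for the question's data: `cars` plus one character column.
df <- data.frame(cars, foo = rep("a", nrow(cars)), stringsAsFactors = FALSE)

is.list(df)                          # a data frame is a list of columns

# lapply always returns a list, one element per column.
classes_list <- lapply(df, class)

# sapply simplifies the result to a named vector when it can.
classes_vec <- sapply(df, class)
classes_vec
```

Because neither function converts the data frame to a matrix, each column keeps its own type when the function is called on it.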