Should I use a data.frame or a matrix?

我在风中等你 2020-11-28 00:59

When should one use a data.frame, and when is it better to use a matrix?

Both keep data in a rectangular format, so sometimes it's unclear which one to use.

6 Answers

刺人心 (original poster)

2020-11-28 01:27
Something not mentioned by @Michal is that a matrix is not only smaller than the equivalent data frame; code that operates on matrices is often considerably more efficient than code that operates on data frames. That is one reason why many R functions internally coerce data frames to matrices.
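To get a feel for the size and speed difference, here is a rough sketch. The dimensions (1000 x 100) and the row-access loop are assumptions picked only for illustration, and the timings will vary by machine and R version, so treat this as a way to measure your own case rather than as a benchmark result.

```r
# Hypothetical sizes, chosen only for illustration
set.seed(1)
m  <- matrix(rnorm(1000 * 100), nrow = 1000, ncol = 100)
df <- as.data.frame(m)

# Memory footprint: the matrix is one vector plus a dim attribute;
# the data frame is a list of 100 column vectors plus names and row names
object.size(m)
object.size(df)

# Row-wise access is one place where the difference tends to show up
time_matrix <- system.time(for (i in seq_len(nrow(m)))  m[i, ])
time_df     <- system.time(for (i in seq_len(nrow(df))) df[i, ])
time_matrix["elapsed"]
time_df["elapsed"]
```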

Data frames are often far more convenient; one doesn't always have solely atomic chunks of data lying around.

Note that you can have a character matrix; you don't just have to have numeric data to build a matrix in R.
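A small sketch of that point, using made-up values: a matrix holds exactly one atomic type, so it can be all character, and mixing numbers with strings coerces everything to character.

```r
# A character matrix is perfectly valid
cm <- matrix(c("a", "b", "c", "d"), nrow = 2)
is.matrix(cm)      # TRUE
typeof(cm)         # "character"

# Mixing types forces every element to a common type
mixed <- cbind(id = 1:2, label = c("x", "y"))
typeof(mixed)      # "character"
```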

When converting a data frame to a matrix, note that there is a data.matrix() function, which handles factors by converting them to their internal integer codes. Coercing via as.matrix() instead yields a character matrix whenever the data frame contains a factor or any other non-numeric column. Compare:
```
> head(as.matrix(data.frame(a = factor(letters), B = factor(LETTERS))))
     a   B  
[1,] "a" "A"
[2,] "b" "B"
[3,] "c" "C"
[4,] "d" "D"
[5,] "e" "E"
[6,] "f" "F"
> head(data.matrix(data.frame(a = factor(letters), B = factor(LETTERS))))
     a B
[1,] 1 1
[2,] 2 2
[3,] 3 3
[4,] 4 4
[5,] 5 5
[6,] 6 6
```
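One caveat is worth checking on your own data: data.matrix() returns the integer codes of a factor, not its labels, so a factor whose labels look like numbers will not round-trip to those numbers. The sketch below uses a made-up column named `size` to show this, plus the usual as.numeric(as.character()) idiom for recovering the labels as numbers.

```r
df <- data.frame(size = factor(c("10", "20", "10")))

as.matrix(df)      # character matrix, even though the labels look numeric
data.matrix(df)    # integer codes 1, 2, 1, not the values 10, 20, 10

# Recover the labels as numbers before building the matrix
df$size <- as.numeric(as.character(df$size))
as.matrix(df)      # numeric matrix holding 10, 20, 10
```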
I nearly always use a data frame for my data analysis tasks as I often have more than just numeric variables. When I code functions for packages, I almost always coerce to matrix and then format the results back out as a data frame. This is because data frames are convenient.
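The coerce, compute, convert back pattern might look something like the sketch below. The function name `center_columns` and the example data are hypothetical, made up here to illustrate the shape of the approach; they are not taken from any package.

```r
# Hypothetical helper: centre each column, computing on a matrix internally
center_columns <- function(df) {
  m <- data.matrix(df)                  # coerce once, up front
  centered <- sweep(m, 2, colMeans(m))  # subtract each column's mean
  as.data.frame(centered)               # hand a data frame back to the caller
}

result <- center_columns(data.frame(x = c(1, 2, 3), y = c(10, 20, 60)))
result
```

The caller passes in and gets back a data frame, while the arithmetic in the middle runs on a plain numeric matrix.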