Why is the built-in lm function so slow in R?


栀梦 2020-12-01 16:38

I always thought that the lm function was extremely fast in R, but as this example would suggest, the closed solution computed using the solve func

2 answers

栀梦 (original poster)

2020-12-01 17:27
You are overlooking that:
- solve() only returns your parameters
- lm() returns a (very rich) object with many components for subsequent analysis, inference, plots, ...
- the main cost of your lm() call is not the projection but the resolution of the formula y ~ ., from which the model matrix needs to be built.
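The three points above can be checked with base R alone. The following is a minimal sketch on simulated data; the names n, p, X, y and dat are made up for illustration and do not come from the question.

```r
set.seed(42)
n <- 1000; p <- 5
X <- cbind(1, matrix(rnorm(n * p), n, p))   # design matrix with an intercept column
y <- drop(X %*% seq_len(p + 1) + rnorm(n))  # response from known coefficients plus noise
dat <- data.frame(y = y, X[, -1])           # same data as a data frame (columns X1..X5)

# Closed-form normal equations: returns only the coefficient vector
b_solve <- drop(solve(crossprod(X), crossprod(X, y)))

# lm(): resolves the formula, builds the model frame and model matrix, then fits
fit <- lm(y ~ ., data = dat)

# lm.fit(): takes a ready-made matrix and skips the formula machinery
fit_raw <- lm.fit(X, y)

names(fit)                              # many components beyond the coefficients
max(abs(b_solve - unname(coef(fit))))   # the estimates themselves agree closely
```

The coefficient estimates are essentially the same in all three cases; what differs is how much work is done around them.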
To illustrate Rcpp, we wrote a few variants of a function fastLm() that do more of what lm() does (i.e. a bit more than lm.fit() from base R) and measured them. See e.g. this benchmark script, which clearly shows that the dominant cost for smaller data sets is in parsing the formula and building the model matrix.
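The benchmark script itself is not reproduced here. As a rough stand-in, a base-R timing loop along the following lines can be used to compare the three approaches on a small data set. It uses system.time() to avoid extra packages; the helper time_it, the repetition count and the data sizes are assumptions chosen for illustration, and the timings will vary by machine.

```r
set.seed(1)
n <- 100; p <- 3
X <- cbind(1, matrix(rnorm(n * p), n, p))   # small design matrix with intercept
y <- drop(X %*% rep(1, p + 1) + rnorm(n))
dat <- data.frame(y = y, X[, -1])

reps <- 2000

# Hypothetical helper: elapsed seconds for `reps` calls of a zero-argument function
time_it <- function(f) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

timings <- c(
  solve  = time_it(function() solve(crossprod(X), crossprod(X, y))),
  lm.fit = time_it(function() lm.fit(X, y)),
  lm     = time_it(function() lm(y ~ ., data = dat))
)
timings
```

Comparing lm() against lm.fit() isolates the cost of the formula and model-matrix step, since lm() calls lm.fit() internally once the matrix is built.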

In short, you are doing the Right Thing by benchmarking, but you are not doing it quite correctly: you are comparing what is mostly incomparable, a small subset of the work (solving for the coefficients) against a much larger task (everything lm() does).