I have 10 linear models where I only need some information, namely: r-squared, p-value, coefficients of slope and intercept. I managed to extract these values (via ridiculously
Consider building a matrix of lm results. First, define a function that fits the model to a generic data frame and extracts the statistics you need. Then call by(), which subsets your data frame by a factor column and passes each subset into that function. Finally, rbind all the per-group results together into a single matrix.
lm_results <- function(df) {
  model <- lm(Qend ~ Rainfall, data = df)
  res <- summary(model)
  fstat <- res$fstatistic

  c(gradient  = res$coefficients[2, 1],   # slope estimate for Rainfall
    intercept = res$coefficients[1, 1],   # intercept estimate
    r_sq      = res$r.squared,
    adj_r_sq  = res$adj.r.squared,
    f_stat    = fstat[["value"]],
    # overall model p-value from the F statistic and its degrees of freedom
    p_value   = unname(pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE))
  )
}
matrix_list <- by(d, d$group, lm_results)
final_matrix <- do.call(rbind, matrix_list)
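An equivalent base R route, if you prefer split() over by() (a minimal sketch assuming the same d and lm_results as above):

alt_list <- lapply(split(d, d$group), lm_results)   # named list of per-group result vectors
alt_matrix <- do.call(rbind, alt_list)              # same layout as final_matrix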
To demonstrate on random, seeded data:
set.seed(12262018)

data_tools <- c("sas", "stata", "spss", "python", "r", "julia")

d <- data.frame(
  group    = sample(data_tools, 500, replace = TRUE),  # grouping factor
  int      = sample(1:15, 500, replace = TRUE),
  Qend     = rnorm(500) / 100,
  Rainfall = rnorm(500) * 10
)
Results
matrix_list <- by(d, d$group, lm_results)
final_matrix <- do.call(rbind, matrix_list)
final_matrix
# gradient intercept r_sq adj_r_sq f_stat p_value
# julia -1.407313e-04 1.203832e-04 0.017219149 0.004619395 1.3666258 0.24595273
# python -1.438116e-04 1.125170e-04 0.018641512 0.007230367 1.6336233 0.20464162
# r 2.031717e-04 1.168037e-04 0.041432175 0.027738349 3.0256098 0.08635510
# sas -1.549510e-04 9.067337e-05 0.032476668 0.021355710 2.9203121 0.09103619
# spss 9.326656e-05 1.068516e-04 0.008583473 -0.002682623 0.7618853 0.38511469
# stata -7.079514e-05 1.024010e-04 0.006013841 -0.006568262 0.4779679 0.49137093
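If a data frame is more convenient downstream, the group labels sitting in the row names can be promoted to a regular column; a minimal sketch in base R, assuming the final_matrix from above:

final_df <- data.frame(group = rownames(final_matrix), final_matrix, row.names = NULL)
final_df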