Question
I'm a SAS programmer trying to learn R. In SAS, I would do this to save the results of descriptive statistics into a dataset:
proc means data=abc;
var var1 var2 var3;
ods output summary=result1;
run;
I think in R, it would be this: summary(abc)->result1
Someone told me to do this instead: as.data.frame(unclass(summary(new_scales)))->new_table
But the result in this table is not very usable.
Is there a way to get a better-structured result, like the one I would get from SAS PROC MEANS? I would like the columns to be: variable name, mean, SD, min, max, etc., filled in with the results for each variable.
Answer 1:
Consider sapply (a hidden loop that returns an object of the same length as its input) to create a vector or matrix of aggregation results:
# SINGLE AGGREGATE: named vector with one mean per variable
stats_vector <- sapply(abc[c("var1", "var2", "var3")], function(x) mean(x, na.rm=TRUE))
# MULTIPLE AGGREGATES: matrix with one row per statistic and one column per variable
stats_matrix <- sapply(abc[c("var1", "var2", "var3")],
                       function(x) c(count=length(x), sum=sum(x), mean=mean(x), min=min(x),
                                     q1=quantile(x, 0.25, names=FALSE), median=median(x),
                                     q3=quantile(x, 0.75, names=FALSE),
                                     max=max(x), sd=sd(x)))
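To get the layout asked for in the question (one row per variable, one column per statistic), the matrix from sapply can be transposed and converted to a data frame. The following is a minimal sketch, not a definitive implementation: it assumes the built-in mtcars data as a stand-in for abc, with mpg, hp, and wt standing in for var1, var2, and var3, and the name result1 is borrowed from the question.

```r
# Assumption: mtcars stands in for the asker's dataset `abc`,
# and mpg/hp/wt stand in for var1/var2/var3.
vars <- c("mpg", "hp", "wt")

# One column per variable, one row per statistic
stats_matrix <- sapply(mtcars[vars], function(x) c(
  n      = sum(!is.na(x)),
  mean   = mean(x, na.rm = TRUE),
  sd     = sd(x, na.rm = TRUE),
  min    = min(x, na.rm = TRUE),
  median = median(x, na.rm = TRUE),
  max    = max(x, na.rm = TRUE)
))

# Transpose so each variable becomes a row, then keep the
# variable names in an ordinary column instead of row names
result1 <- data.frame(variable = colnames(stats_matrix),
                      t(stats_matrix),
                      row.names = NULL)
print(result1)
```

The result has the columns variable, n, mean, sd, min, median, and max, which is close to the shape of a PROC MEANS output dataset and can be filtered or merged like any other data frame.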
If your proc means uses class for grouping, then use aggregate, which returns a data frame:
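# SINGLE AGGREGATE: one row per group, one mean per variable
mean_df <- aggregate(cbind(var1, var2, var3) ~ group, abc, function(x) mean(x, na.rm=TRUE))
# MULTIPLE AGGREGATES: each variable comes back as a matrix column
agg_raw <- aggregate(cbind(var1, var2, var3) ~ group, abc,
                     function(x) c(count=length(x), sum=sum(x), mean=mean(x), min=min(x),
                                   q1=quantile(x, 0.25, names=FALSE), median=median(x),
                                   q3=quantile(x, 0.75, names=FALSE),
                                   max=max(x), sd=sd(x)))
# Flatten the matrix columns into ordinary columns (var1.count, var1.sum, ...)
agg_df <- do.call(data.frame, agg_raw)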
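The do.call step matters because aggregate stores a multi-value result as a matrix inside a single column. Here is a small sketch of the difference, assuming mtcars as a stand-in for abc and cyl as a hypothetical grouping column:

```r
# Assumption: mtcars stands in for `abc`, and cyl plays the role of `group`
agg_raw <- aggregate(cbind(mpg, hp) ~ cyl, mtcars,
                     function(x) c(mean = mean(x), sd = sd(x)))

# agg_raw has only three columns: cyl, plus mpg and hp,
# each of which is a two-column matrix (mean, sd)
ncol(agg_raw)
is.matrix(agg_raw$mpg)

# Flatten the matrix columns into ordinary data frame columns
agg_df <- do.call(data.frame, agg_raw)
names(agg_df)
```

After flattening, the columns are named cyl, mpg.mean, mpg.sd, hp.mean, and hp.sd, so each statistic can be referenced directly.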
Answer 2:
Consider the tidyverse approach. The idea is to fit a model, such as a linear regression, to each group of the data, map each fitted model to its summary values, and finally store the summary in a data frame.
library(tidyverse)
library(broom)
summary_result <- mtcars %>%
  nest(data = -carb) %>%                      # one row per carb value; its rows are nested in `data`
  mutate(model = purrr::map(data, function(x) {
           lm(gear ~ mpg + cyl, data = x)}),  # fit one regression per group
         values = purrr::map(model, glance),  # one-row model summary from broom
         r.squared = purrr::map_dbl(values, "r.squared"),
         pvalue = purrr::map_dbl(values, "p.value")) %>%
  select(-data, -model, -values)              # keep only the summary columns
summary_result
  carb r.squared   pvalue
1    4    0.4352 0.135445
2    1    0.7011 0.089325
3    2    0.8060 0.003218
4    3    0.5017 0.498921
5    6    0.0000       NA
6    8    0.0000       NA
Source: https://stackoverflow.com/questions/57024367/r-question-how-to-save-summary-results-into-a-dataset