R question: how to save summary results into a dataset

牧云@^-^@ submitted on 2019-12-14 04:13:10

Question


I'm a SAS programmer trying to learn R. In SAS, I would do this to save the results of descriptive statistics into a dataset:

proc means data=abc;
var var1 var2 var3;
ods output summary=result1;
run;

I think in R, it would be this: summary(abc)->result1

Someone told me to do this instead: as.data.frame(unclass(summary(new_scales)))->new_table

But the resulting table is not very usable.

Is there a way to get a better-structured result, like the one I would get from SAS PROC MEANS? I would like the columns to be: variable name, mean, SD, min, max, etc., with each row carrying the results for one variable.


Answer 1:


Consider sapply (a hidden loop that returns an object of the same length as its input) to build a vector or matrix of aggregation results:

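# SINGLE AGGREGATE (returns a named numeric vector, one element per variable)
stats_vector <- sapply(abc[c("var1", "var2", "var3")], function(x) mean(x, na.rm=TRUE))

# MULTIPLE AGGREGATES (returns a matrix: one row per statistic, one column per variable)
# names=FALSE keeps the labels as plain q1/q3 instead of q1.25%/q3.75%
stats_matrix <- sapply(abc[c("var1", "var2", "var3")], 
    function(x) c(count=length(x), sum=sum(x), mean=mean(x), min=min(x), 
                  q1=quantile(x, 0.25, names=FALSE), median=median(x), 
                  q3=quantile(x, 0.75, names=FALSE), 
                  max=max(x), sd=sd(x))
)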
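The matrix returned by sapply has one column per variable and one row per statistic, which is the transpose of the layout the question asks for. One possible way to get one row per variable is to transpose the matrix and wrap it in a data frame. The sketch below is only an illustration: the toy abc data is made up, and the reduced set of statistics and the name result1 are chosen for the example.

```r
# Toy stand-in for the asker's data (values are made up for illustration)
abc <- data.frame(var1 = c(1, 2, 3, 4),
                  var2 = c(10, 20, 30, 40),
                  var3 = c(5, 5, 5, 9))

# One column per variable, one row per statistic
stats_matrix <- sapply(abc[c("var1", "var2", "var3")],
    function(x) c(n = length(x), mean = mean(x), sd = sd(x),
                  min = min(x), max = max(x)))

# Transpose so each variable becomes a row, then store as a data frame.
# row.names = NULL replaces the variable names in the row labels with 1, 2, 3,
# since they are already kept in the "variable" column.
result1 <- data.frame(variable = colnames(stats_matrix),
                      t(stats_matrix),
                      row.names = NULL)

print(result1)
```

The result has the columns variable, n, mean, sd, min and max, which is roughly the shape of a PROC MEANS output dataset.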

If your proc means uses a class statement for grouping, then use aggregate, which returns a data frame:

# SINGLE AGGREGATE
mean_df <- aggregate(cbind(var1, var2, var3) ~ group, abc, function(x) mean(x, na.rm=TRUE))

# MULTIPLE AGGREGATES
# names=FALSE keeps the labels as plain q1/q3 instead of q1.25%/q3.75%
agg_raw <- aggregate(cbind(var1, var2, var3) ~ group, abc, 
    function(x) c(count=length(x), sum=sum(x), mean=mean(x), min=min(x), 
                  q1=quantile(x, 0.25, names=FALSE), median=median(x), 
                  q3=quantile(x, 0.75, names=FALSE), 
                  max=max(x), sd=sd(x))
)

# Each variable in agg_raw is a matrix column; flatten into ordinary columns
agg_df <- do.call(data.frame, agg_raw)
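The do.call step is needed because, when the function passed to aggregate returns several values, each variable is stored as a single matrix column in the result. A small sketch of the difference, using the built-in mtcars data with cyl standing in for the class variable (both choices are assumptions made for this example):

```r
# mtcars ships with R; cyl plays the role of the class/grouping variable
agg_raw <- aggregate(cbind(mpg, hp) ~ cyl, mtcars,
    function(x) c(mean = mean(x), sd = sd(x)))

# Before flattening: only 3 columns (cyl, mpg, hp); mpg and hp are matrices
print(dim(agg_raw))
print(is.matrix(agg_raw$mpg))

# After flattening: every statistic becomes its own ordinary column
agg_df <- do.call(data.frame, agg_raw)
print(names(agg_df))
```

After flattening, the column names combine the variable and the statistic (for example mpg.mean and hp.sd), so the data frame can be filtered, merged, or exported like any other.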





Answer 2:


Consider a tidyverse approach. The idea is to fit a model such as a linear regression to each group, map each fitted model to its summary values with broom's glance, and finally store those values in a data frame.

library(tidyverse)
library(broom)
summary_result<-mtcars %>%
  nest(data = -carb) %>%
  mutate(model = purrr::map(data, function(x) {
    lm(gear ~ mpg+cyl, data = x)}),
    values = purrr::map(model, glance),
    r.squared = purrr::map_dbl(values, "r.squared"),
    pvalue = purrr::map_dbl(values, "p.value")) %>%
  select(-data, -model, -values)

summary_result

  carb r.squared   pvalue
1    4    0.4352 0.135445
2    1    0.7011 0.089325
3    2    0.8060 0.003218
4    3    0.5017 0.498921
5    6    0.0000       NA
6    8    0.0000       NA


Source: https://stackoverflow.com/questions/57024367/r-question-how-to-save-summary-results-into-a-dataset
