Obtaining absolute deviation from mean for two sets of scores

∥☆過路亽.° submitted on 2020-07-03 05:09:04

Problem


To obtain the absolute deviations from the mean for two groups of scores, I usually have to write long code in R, such as the code shown below.

Question

I was wondering whether it is possible in base R to somehow Vectorize the mad() function, so that the absolute deviations from the mean for each group of scores in the example below could be obtained with that vectorized version of mad(). Any other workable ideas would be highly appreciated.

set.seed(0)
y = as.vector(unlist(mapply(FUN = rnorm, n = c(10, 10))))  # Produces two sets of scores
groups = factor(rep(1:2, times = c(10, 10)))                # Grouping ID variable

G1 = y[groups == 1]              # subset y scores for group 1
G2 = y[groups == 2]              # subset y scores for group 2
G1.abs.dev = abs(G1 - mean(G1))  # absolute deviation from mean scores for group 1
G2.abs.dev = abs(G2 - mean(G2))  # absolute deviation from mean scores for group 2

Answer 1:


How about

score <- lapply(split(y, groups), FUN = function (u) abs(u - mean(u)))

or

score <- ave(y, groups, FUN = function (u) abs(u - mean(u)))

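The two results are organized differently: lapply with split returns a list with one vector per group, while ave returns a single vector of the same length as y, in the original order. Choose whichever suits you better.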
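
To make the difference concrete, here is a minimal sketch that reuses y and groups from the question. The names abs_dev, by_group and in_place are mine, chosen only for illustration.

```r
set.seed(0)
y <- as.vector(unlist(mapply(FUN = rnorm, n = c(10, 10))))
groups <- factor(rep(1:2, times = c(10, 10)))

# Deviation of each score from the mean of the vector it is given
abs_dev <- function(u) abs(u - mean(u))

by_group <- lapply(split(y, groups), FUN = abs_dev)  # list, one vector per group
in_place <- ave(y, groups, FUN = abs_dev)            # one vector, same order as y

str(by_group)      # a list with elements named "1" and "2", each of length 10
length(in_place)   # same length as y

# unsplit() puts the list back into the original order, so both forms agree
all.equal(unsplit(by_group, groups), in_place)
```

The list form is handy when each group is processed separately afterwards; the ave form is handy when the deviations should sit next to the original scores.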


Your wording is slightly off, though. mad returns a single statistic for a data set (by default, the scaled median absolute deviation from the median), not one value per score. For example,

sapply(split(y, groups), mad)

You are not vectorizing mad, but simply computing the deviation for each datum as your example code shows.
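
For context, here is a short sketch of what mad() computes with its default arguments, again reusing y and groups from the question. It also shows one way to get a single mean-absolute-deviation value per group, in case that is what was meant. Both manual_mad and mean_abs_dev are hypothetical helper names of mine, not base R functions.

```r
set.seed(0)
y <- as.vector(unlist(mapply(FUN = rnorm, n = c(10, 10))))
groups <- factor(rep(1:2, times = c(10, 10)))

# Default mad(): median of absolute deviations from the median, times 1.4826
manual_mad <- function(u) 1.4826 * median(abs(u - median(u)))
all.equal(sapply(split(y, groups), mad),
          sapply(split(y, groups), manual_mad))

# One summary value per group: mean of the absolute deviations from the mean
mean_abs_dev <- function(u) mean(abs(u - mean(u)))  # hypothetical helper
tapply(y, groups, mean_abs_dev)
```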




Answer 2:


If you stick everything in a data.frame, it's much cleaner. In base R,

set.seed(0)

df <- data.frame(y = rnorm(20),
                 group = rep(1:2, each = 10))

df$abs_dev <- with(df, ave(y, group, FUN = function(x){abs(mean(x) - x)}))

df
#>               y  group    abs_dev
#> 1   1.262954285      1 0.90403032
#> 2  -0.326233361      1 0.68515732
#> 3   1.329799263      1 0.97087530
#> 4   1.272429321      1 0.91350536
#> 5   0.414641434      1 0.05571747
#> 6  -1.539950042      1 1.89887401
#> 7  -0.928567035      1 1.28749100
#> 8  -0.294720447      1 0.65364441
#> 9  -0.005767173      1 0.36469114
#> 10  2.404653389      1 2.04572943
#> 11  0.763593461      2 1.12607477
#> 12 -0.799009249      2 0.43652794
#> 13 -1.147657009      2 0.78517570
#> 14 -0.289461574      2 0.07301974
#> 15 -0.299215118      2 0.06326619
#> 16 -0.411510833      2 0.04902952
#> 17  0.252223448      2 0.61470476
#> 18 -0.891921127      2 0.52943981
#> 19  0.435683299      2 0.79816461
#> 20 -1.237538422      2 0.87505711

or dplyr,

library(dplyr)
set.seed(0)

# data_frame() is deprecated; tibble() is its replacement
df <- tibble(y = rnorm(20),
             group = rep(1:2, each = 10))

df <- df %>% group_by(group) %>% mutate(abs_dev = abs(mean(y) - y))

df
#> # A tibble: 20 x 3
#> # Groups:   group [2]
#>               y  group    abs_dev
#>           <dbl>  <int>      <dbl>
#>  1  1.262954285      1 0.90403032
#>  2 -0.326233361      1 0.68515732
#>  3  1.329799263      1 0.97087530
#>  4  1.272429321      1 0.91350536
#>  5  0.414641434      1 0.05571747
#>  6 -1.539950042      1 1.89887401
#>  7 -0.928567035      1 1.28749100
#>  8 -0.294720447      1 0.65364441
#>  9 -0.005767173      1 0.36469114
#> 10  2.404653389      1 2.04572943
#> 11  0.763593461      2 1.12607477
#> 12 -0.799009249      2 0.43652794
#> 13 -1.147657009      2 0.78517570
#> 14 -0.289461574      2 0.07301974
#> 15 -0.299215118      2 0.06326619
#> 16 -0.411510833      2 0.04902952
#> 17  0.252223448      2 0.61470476
#> 18 -0.891921127      2 0.52943981
#> 19  0.435683299      2 0.79816461
#> 20 -1.237538422      2 0.87505711

or data.table:

library(data.table)
set.seed(0)

dt <- data.table(y = rnorm(20),
                 group = rep(1:2, each = 10))

dt[, abs_dev := abs(mean(y) - y), by = group][]
#>                y group    abs_dev
#>  1:  1.262954285     1 0.90403032
#>  2: -0.326233361     1 0.68515732
#>  3:  1.329799263     1 0.97087530
#>  4:  1.272429321     1 0.91350536
#>  5:  0.414641434     1 0.05571747
#>  6: -1.539950042     1 1.89887401
#>  7: -0.928567035     1 1.28749100
#>  8: -0.294720447     1 0.65364441
#>  9: -0.005767173     1 0.36469114
#> 10:  2.404653389     1 2.04572943
#> 11:  0.763593461     2 1.12607477
#> 12: -0.799009249     2 0.43652794
#> 13: -1.147657009     2 0.78517570
#> 14: -0.289461574     2 0.07301974
#> 15: -0.299215118     2 0.06326619
#> 16: -0.411510833     2 0.04902952
#> 17:  0.252223448     2 0.61470476
#> 18: -0.891921127     2 0.52943981
#> 19:  0.435683299     2 0.79816461
#> 20: -1.237538422     2 0.87505711
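
In the examples above the rows happen to be sorted by group. For the base R version, a small sketch with made-up, interleaved data suggests that ave does not depend on that ordering: each deviation is written back to the position of its own score. The vectors scores and group are hypothetical toy data, not taken from the question.

```r
# Hypothetical toy data with interleaved group labels
scores <- c(1, 10, 3, 30, 5, 50)
group  <- c("a", "b", "a", "b", "a", "b")

# Group means are a = 3 and b = 30
abs_dev <- ave(scores, group, FUN = function(x) abs(x - mean(x)))
abs_dev
#> [1]  2 20  0  0  2 20
```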


Source: https://stackoverflow.com/questions/44738753/obtaining-absolute-deviation-from-mean-for-two-sets-of-scores
