R: calculating margins or row & col sums for a data frame

狂风中的少年 submitted on 2019-12-22 17:01:26

Question


I have a data frame that looks like this:

         Flag1             Flag2    Type1 Type2  Type3
1        A                 FIRST      2    0       0
2        A                SECOND      1    9       0
3        A                 THIRD      3    7       0
4        A                FOURTH      9   18       0
5        A                 FIFTH      1   22       0
6        A                 SIXTH      1   13       0
7        B                 FIRST      0    0       0
8        B                SECOND      3    9       0
9        B                 THIRD      5   85       0
10       B                FOURTH      4   96       0
11       B                 FIFTH      3   40       0
12       B                 SIXTH      0   17       0

I need to sum in such a way that my data frame finally looks like this

         Flag1             Flag2    Type1 Type2  Type3   Sum
1        A                 FIRST      2    0       0      2
2        A                SECOND      1    9       0     10 
3        A                 THIRD      3    7       0     10
4        A                FOURTH      9   18       0     27
5        A                 FIFTH      1   22       0     23
6        A                 SIXTH      1   13       0     14
7        B                 FIRST      0    0       0      0
8        B                SECOND      3    9       0     12
9        B                 THIRD      5   85       0     90
10       B                FOURTH      4   96       0    100
11       B                 FIFTH      3   40       0     43
12       B                 SIXTH      0   17       0     17 
13      (all)              FIRST      2    0       0      2
14      (all)             SECOND      4   18       0     22
15      (all)              THIRD      8   92       0    100
16      (all)             FOURTH     13  114       0    127
17      (all)              FIFTH      4   62       0     66
18      (all)              SIXTH      1   30       0     31
19       A                 (all)     17   69       0     86
20       B                 (all)     15  247       0    262
21      (all)              (all)     32  316       0    348

I tried the add_margins function from the reshape2 package, but it doesn't calculate the sums the way I want. I also tried aggregate, rowSums and colSums, without success.

Any help here would be great.

Thanks

The Sum column also needs to be cumulative: within each Flag1 group, each row's Sum should include the sums of the preceding Flag2 rows. Like this:

         Flag1             Flag2    Type1 Type2  Type3   Sum
1        A                 FIRST      2    0       0      2
2        A                SECOND      1    9       0     12 
3        A                 THIRD      3    7       0     22
4        A                FOURTH      9   18       0     49
5        A                 FIFTH      1   22       0     72
6        A                 SIXTH      1   13       0     86
7        B                 FIRST      0    0       0      0
8        B                SECOND      3    9       0     12
9        B                 THIRD      5   85       0    102
10       B                FOURTH      4   96       0    202
11       B                 FIFTH      3   40       0    245
12       B                 SIXTH      0   17       0    262 
13      (all)              FIRST      2    0       0      2
14      (all)             SECOND      4   18       0     24
15      (all)              THIRD      8   92       0    124
16      (all)             FOURTH     13  114       0    251
17      (all)              FIFTH      4   62       0    317
18      (all)              SIXTH      1   30       0    348
19       A                 (all)     17   69       0     86
20       B                 (all)     15  247       0    262
21      (all)              (all)     32  316       0    348

Answer 1:


Assume you have such a data.frame and that its name is dtable:

dt1 <- as.data.frame(addmargins(xtabs(Type1~Flag1+Flag2, data=dtable)))
dt2 <- as.data.frame(addmargins(xtabs(Type2~Flag1+Flag2, data=dtable)))
dt3 <- as.data.frame(addmargins(xtabs(Type3~Flag1+Flag2, data=dtable)))
names(dt1)[3] <- "Type1"
names(dt2)[3] <- "Type2"
names(dt3)[3] <- "Type3"

dt.all <- merge(merge(dt1,dt2), dt3)
dt.all$Sum <- with(dt.all, Type1+Type2+Type3)
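
To see what each addmargins(xtabs(...)) call contributes, here is a minimal sketch for Type1 alone. It rebuilds the question's data with data.frame() (the values are copied from the question's table; the names tab1 and long1 are only illustrative) and prints the intermediate table before it is converted to long format.

```r
# Rebuild the question's data (values copied from the table in the question).
dtable <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH"), times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0
)

# xtabs() sums Type1 within each Flag1 x Flag2 cell; addmargins() appends
# a "Sum" row and a "Sum" column holding the marginal totals.
tab1 <- addmargins(xtabs(Type1 ~ Flag1 + Flag2, data = dtable))
print(tab1)

# as.data.frame() converts the table to long format: one row per
# (Flag1, Flag2) pair, with the cell value in a column named Freq.
long1 <- as.data.frame(tab1)
print(head(long1))
```

The Freq column is why the answer renames the third column of each result before merging: otherwise all three data frames would share a Freq column and merge() would try to match on it.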

I wasn't able to get the exact sort order that you wanted but this is close:

# Reorder the levels with factor(); assigning to levels() would rename them instead.
dt.all$Flag2 <- factor(dt.all$Flag2, levels = c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH", "Sum"))
dt.all[order(dt.all$Flag1, dt.all$Flag2), ]

   Flag1  Flag2 Type1 Type2 Type3 Sum
2      A  FIRST     2     0     0   2
4      A SECOND     1     9     0  10
7      A  THIRD     3     7     0  10
3      A FOURTH     9    18     0  27
1      A  FIFTH     1    22     0  23
5      A  SIXTH     1    13     0  14
6      A    Sum    17    69     0  86
9      B  FIRST     0     0     0   0
11     B SECOND     3     9     0  12
14     B  THIRD     5    85     0  90
10     B FOURTH     4    96     0 100
8      B  FIFTH     3    40     0  43
12     B  SIXTH     0    17     0  17
13     B    Sum    15   247     0 262
16   Sum  FIRST     2     0     0   2
18   Sum SECOND     4    18     0  22
21   Sum  THIRD     8    92     0 100
17   Sum FOURTH    13   114     0 127
15   Sum  FIFTH     4    62     0  66
19   Sum  SIXTH     1    30     0  31
20   Sum    Sum    32   316     0 348
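
A note on ordering factor levels, since it is easy to get wrong when sorting by Flag2: assigning to levels() renames the existing levels position by position, whereas factor(x, levels = ...) keeps every value and only changes the level order. A minimal sketch with a hypothetical four-value factor x (not taken from the thread):

```r
# A factor whose levels are in alphabetical order, which is what
# factor() and xtabs() typically produce for character data.
vals <- c("FIRST", "SECOND", "THIRD", "FOURTH")
x <- factor(vals, levels = c("FIRST", "FOURTH", "SECOND", "THIRD"))

wanted <- c("FIRST", "SECOND", "THIRD", "FOURTH")

# levels<- renames by position: "FOURTH" becomes "SECOND", and so on,
# so the values no longer match the original data.
relabelled <- x
levels(relabelled) <- wanted
print(as.character(relabelled))

# factor(x, levels = ...) leaves the values alone and only reorders the levels,
# so order() now sorts in the intended sequence.
reordered <- factor(x, levels = wanted)
print(as.character(reordered))
print(order(reordered))
```

If the labels in a sorted result do not match the numbers next to them, a levels() assignment like the first one is a likely cause.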



Answer 2:


rowSums works for me (or am I missing something?).

> my.df <- read.table(textConnection("         Flag1             Flag2    Type1 Type2  Type3
+ 1        A                 FIRST      2    0       0
+ 2        A                SECOND      1    9       0
+ 3        A                 THIRD      3    7       0
+ 4        A                FOURTH      9   18       0
+ 5        A                 FIFTH      1   22       0
+ 6        A                 SIXTH      1   13       0
+ 7        B                 FIRST      0    0       0
+ 8        B                SECOND      3    9       0
+ 9        B                 THIRD      5   85       0
+ 10       B                FOURTH      4   96       0
+ 11       B                 FIFTH      3   40       0
+ 12       B                 SIXTH      0   17       0
+ "))
> my.df
   Flag1  Flag2 Type1 Type2 Type3
1      A  FIRST     2     0     0
2      A SECOND     1     9     0
3      A  THIRD     3     7     0
4      A FOURTH     9    18     0
5      A  FIFTH     1    22     0
6      A  SIXTH     1    13     0
7      B  FIRST     0     0     0
8      B SECOND     3     9     0
9      B  THIRD     5    85     0
10     B FOURTH     4    96     0
11     B  FIFTH     3    40     0
12     B  SIXTH     0    17     0
> rowSums(my.df[3:5])
  1   2   3   4   5   6   7   8   9  10  11  12 
  2  10  10  27  23  14   0  12  90 100  43  17 
> my.df$Sum <- rowSums(my.df[3:5])
> my.df
   Flag1  Flag2 Type1 Type2 Type3 Sum
1      A  FIRST     2     0     0   2
2      A SECOND     1     9     0  10
3      A  THIRD     3     7     0  10
4      A FOURTH     9    18     0  27
5      A  FIFTH     1    22     0  23
6      A  SIXTH     1    13     0  14
7      B  FIRST     0     0     0   0
8      B SECOND     3     9     0  12
9      B  THIRD     5    85     0  90
10     B FOURTH     4    96     0 100
11     B  FIFTH     3    40     0  43
12     B  SIXTH     0    17     0  17
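
The update to the question also asks for a running total within each Flag1 group. One possible sketch, assuming the rows are already in the desired Flag2 order, combines rowSums with ave() and cumsum. The column name CumSum is only illustrative, and the data frame is rebuilt here so the snippet runs on its own:

```r
# Rebuild the question's data (values copied from the table in the question).
my.df <- data.frame(
  Flag1 = rep(c("A", "B"), each = 6),
  Flag2 = rep(c("FIRST", "SECOND", "THIRD", "FOURTH", "FIFTH", "SIXTH"), times = 2),
  Type1 = c(2, 1, 3, 9, 1, 1, 0, 3, 5, 4, 3, 0),
  Type2 = c(0, 9, 7, 18, 22, 13, 0, 9, 85, 96, 40, 17),
  Type3 = 0
)

# Per-row total across the three Type columns.
my.df$Sum <- rowSums(my.df[3:5])

# Running total of Sum within each Flag1 group, following the current row order.
# CumSum is a hypothetical column name chosen for this sketch.
my.df$CumSum <- ave(my.df$Sum, my.df$Flag1, FUN = cumsum)

print(my.df)
```

This covers only the twelve original rows; the (all) margin rows would still have to be built separately, for example with addmargins as in Answer 1.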


Source: https://stackoverflow.com/questions/5863456/r-calculating-margins-or-row-col-sums-for-a-data-frame
