Adding a row for the ratio of two variables

Submitted by 一世执手 on 2019-12-23 02:52:28

Question


For each DVID and FORM, I want to add a row containing the Fed/fasted ratio (as a percentage) to my data frame.

dfin <- 
DVID   FORM   FED    median   gmean    CV
 1      A     fast    15       20      10
 1      A     Fed     30       40      15 
 1      B     fast    40       60      20
 1      B     Fed     50       100     25

mydfout <- 
DVID   FORM   FED          median   gmean     CV
 1      A     fast           15       20      10
 1      A     Fed            30       40      15
 1      A     Fed/Fasted(%)  200      200     NA
 1      B     fast           40       60      20
 1      B     Fed            50       100     25
 1      B     Fed/Fasted(%)  125      166.6   NA

How can I do this in R?


Answer 1:


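We can use base R functions to do this:

# dfin is the question's data frame; x[2]/x[1] assumes each group is ordered fast, then Fed
A <- aggregate(cbind(median, gmean) ~ DVID + FORM, dfin, function(x) x[2] / x[1] * 100)
# add the label and an NA CV so the ratio rows have the same columns as dfin
B <- transform(A, FED = "Fed/Fasted%", CV = NA)
# interleave: bind each group's original rows with its ratio row, then stack the groups
do.call(rbind, Map(rbind, split(dfin, dfin[1:2]), split(B, B[1:2])))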
      DVID FORM         FED median    gmean CV
1.A.1    1    A        fast     15  20.0000 10
1.A.2    1    A         Fed     30  40.0000 15
1.A.3    1    A Fed/Fasted%    200 200.0000 NA
1.B.3    1    B        fast     40  60.0000 20
1.B.4    1    B         Fed     50 100.0000 25
1.B.2    1    B Fed/Fasted%    125 166.6667 NA
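
The `x[2]/x[1]` step above depends on the fasted row coming before the fed row in every group. A minimal sketch of a variant that pairs the rows by their FED label instead is shown here; the data frame is reconstructed from the question, and `fasted`, `fed`, `pairs`, `ratio_rows`, and `dfout` are illustrative names, not part of the original answer.

```r
# Example data, reconstructed from the question
dfin <- data.frame(
  DVID   = c(1L, 1L, 1L, 1L),
  FORM   = c("A", "A", "B", "B"),
  FED    = c("fast", "Fed", "fast", "Fed"),
  median = c(15, 30, 40, 50),
  gmean  = c(20, 40, 60, 100),
  CV     = c(10, 15, 20, 25),
  stringsAsFactors = FALSE
)

# Pair the fasted and fed rows of each DVID/FORM group by label, not by position
fasted <- dfin[dfin$FED == "fast", ]
fed    <- dfin[dfin$FED == "Fed", ]
pairs  <- merge(fasted, fed, by = c("DVID", "FORM"),
                suffixes = c(".fast", ".fed"))

# Build the ratio rows with the same columns as dfin
ratio_rows <- data.frame(
  DVID   = pairs$DVID,
  FORM   = pairs$FORM,
  FED    = "Fed/Fasted(%)",
  median = pairs$median.fed / pairs$median.fast * 100,
  gmean  = pairs$gmean.fed / pairs$gmean.fast * 100,
  CV     = NA_real_,
  stringsAsFactors = FALSE
)

# Append, then sort by group; order() keeps ties in their original order,
# so each ratio row lands after the fast and Fed rows of its group
dfout <- rbind(dfin, ratio_rows)
dfout <- dfout[order(dfout$DVID, dfout$FORM), ]
rownames(dfout) <- NULL
dfout
```

Because the pairing goes through `merge()`, a group that lacks either a fasted or a fed row simply produces no ratio row rather than a wrong one.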



Answer 2:


A simple approach is to calculate all the aggregations, then row-bind them back to the original data frame. In dplyr,

library(dplyr)

df_in <- data.frame(DVID = c(1L, 1L, 1L, 1L), 
                 FORM = c("A", "A", "B", "B"), 
                 FED = c("fast", "Fed", "fast", "Fed"), 
                 median = c(15L, 30L, 40L, 50L), 
                 gmean = c(20L, 40L, 60L, 100L), 
                 CV = c(10L, 15L, 20L, 25L),
                 stringsAsFactors = FALSE)

df_out <- df_in %>% 
    group_by(DVID, FORM) %>% 
    # note: funs() has been deprecated since dplyr 0.8.0; newer dplyr versions offer across()
    summarise_at(vars(median, gmean), 
                 funs(.[FED == 'Fed'] / .[FED == 'fast'] * 100)) %>% 
    mutate(FED = 'Fed/Fasted(%)', 
           CV = NA) %>% 
    bind_rows(df_in) %>% 
    select(1:2, 5, 3:4, 6) %>% arrange(DVID, FORM, FED) %>% ungroup()    # make it pretty

df_out
#> # A tibble: 6 x 6
#>    DVID FORM  FED           median gmean    CV
#>   <int> <chr> <chr>          <dbl> <dbl> <int>
#> 1     1 A     fast            15.0  20.0    10
#> 2     1 A     Fed             30.0  40.0    15
#> 3     1 A     Fed/Fasted(%)  200   200      NA
#> 4     1 B     fast            40.0  60.0    20
#> 5     1 B     Fed             50.0 100      25
#> 6     1 B     Fed/Fasted(%)  125   167      NA



Answer 3:


One way in base R is to split the data frame by DVID and FORM and, for each group, compute the Fed/fasted ratio (in percent) of median and gmean. The new row takes DVID and FORM from the first row of the group and gets NA for CV.

do.call(rbind, 
    lapply(split(dfin, list(dfin$DVID, dfin$FORM)), function(x) 
    rbind(x, data.frame(DVID = x[[1]][1], FORM = x[[2]][1], FED = "Fed/Fasted(%)",
    median = (x[["median"]][x[["FED"]] == "Fed"]/x[["median"]][x[["FED"]] == "fast"]) * 100, 
    gmean = (x[["gmean"]][x[["FED"]] == "Fed"]/x[["gmean"]][x[["FED"]] == "fast"]) * 100, 
    CV = NA))))



#DVID FORM           FED median    gmean CV
#   1    A          fast     15  20.0000 10
#   1    A           Fed     30  40.0000 15
#   1    A Fed/Fasted(%)    200 200.0000 NA
#   1    B          fast     40  60.0000 20
#   1    B           Fed     50 100.0000 25
#   1    B Fed/Fasted(%)    125 166.6667 NA
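
The same split-and-bind idea can be sketched with a small helper, so that the ratio expression is not repeated for each column. This is only an illustration under stated assumptions: `add_ratio_row` and its `cols` argument are hypothetical names, the data frame is reconstructed from the question, and each group is assumed to hold exactly one "fast" and one "Fed" row.

```r
# Example data, reconstructed from the question
dfin <- data.frame(
  DVID   = c(1L, 1L, 1L, 1L),
  FORM   = c("A", "A", "B", "B"),
  FED    = c("fast", "Fed", "fast", "Fed"),
  median = c(15, 30, 40, 50),
  gmean  = c(20, 40, 60, 100),
  CV     = c(10, 15, 20, 25),
  stringsAsFactors = FALSE
)

# Hypothetical helper: append one ratio row to a single DVID/FORM group
add_ratio_row <- function(x, cols = c("median", "gmean")) {
  fed    <- x[x$FED == "Fed", , drop = FALSE]
  fasted <- x[x$FED == "fast", , drop = FALSE]
  # Stop early if the group does not hold exactly one row of each kind
  stopifnot(nrow(fed) == 1, nrow(fasted) == 1)
  new_row <- fed                      # copies DVID and FORM from the group
  new_row$FED <- "Fed/Fasted(%)"
  for (col in cols) {
    new_row[[col]] <- fed[[col]] / fasted[[col]] * 100
  }
  new_row$CV <- NA                    # a ratio row has no CV
  rbind(x, new_row)
}

# Split by group, add the ratio row to each piece, and stack the pieces again
groups <- split(dfin, list(dfin$DVID, dfin$FORM), drop = TRUE)
dfout  <- do.call(rbind, lapply(groups, add_ratio_row))
rownames(dfout) <- NULL
dfout
```

Passing a different `cols` vector would extend the ratio to other numeric columns, and the `stopifnot()` guard makes an incomplete group fail loudly instead of silently producing an empty or recycled result.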


Source: https://stackoverflow.com/questions/48555851/adding-a-row-for-the-ratio-of-two-variables
