Question
For each DVID and FORM, I want to add a row holding the Fed/fasted ratio to my data frame.
dfin <-
DVID FORM FED  median gmean CV
   1    A fast     15    20 10
   1    A Fed      30    40 15
   1    B fast     40    60 20
   1    B Fed      50   100 25
mydfout <-
DVID FORM FED           median gmean CV
   1    A fast              15    20 10
   1    A Fed               30    40 15
   1    A Fed/Fasted(%)    200   200 NA
   1    B fast              40    60 20
   1    B Fed               50   100 25
   1    B Fed/Fasted(%)    125 166.6 NA
How can I do this in R?
Answer 1:
We can use base R functions to do this:
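# dfin is the question's data frame; each group's rows must be ordered fast, then Fed
A <- aggregate(cbind(median, gmean) ~ DVID + FORM, dfin, function(x) x[2] / x[1] * 100)
# Label the ratio rows and set CV to NA
B <- transform(A, FED = "Fed/Fasted%", CV = NA)
# Append each group's ratio row to that group's original rows, then stack the groups
do.call(rbind, Map(rbind, split(dfin, dfin[1:2]), split(B, B[1:2])))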
      DVID FORM         FED median    gmean CV
1.A.1    1    A        fast     15  20.0000 10
1.A.2    1    A         Fed     30  40.0000 15
1.A.3    1    A Fed/Fasted%    200 200.0000 NA
1.B.3    1    B        fast     40  60.0000 20
1.B.4    1    B         Fed     50 100.0000 25
1.B.2    1    B Fed/Fasted%    125 166.6667 NA
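The aggregate() call above divides the second row of each group by the first, so it relies on the fast row coming before the Fed row in every DVID/FORM group. Below is a minimal sketch of an order-independent variant that pairs the rows with merge() instead. It assumes exactly one fast and one Fed row per group. The names keys, vals, fast, fed, both, ratio and out are illustrative only, and the Fed row of group A is deliberately placed first to show that position no longer matters.

```r
# The question's data, with Fed listed before fast in group A on purpose
dfin <- data.frame(
  DVID   = c(1, 1, 1, 1),
  FORM   = c("A", "A", "B", "B"),
  FED    = c("Fed", "fast", "fast", "Fed"),
  median = c(30, 15, 40, 50),
  gmean  = c(40, 20, 60, 100),
  CV     = c(15, 10, 20, 25),
  stringsAsFactors = FALSE
)

# Pair each group's fast and Fed rows by key rather than by position
keys <- c("DVID", "FORM")
vals <- c("median", "gmean")
fast <- dfin[dfin$FED == "fast", c(keys, vals)]
fed  <- dfin[dfin$FED == "Fed",  c(keys, vals)]
both <- merge(fed, fast, by = keys, suffixes = c(".fed", ".fast"))

# One ratio row per group: Fed / fast * 100, with CV left as NA
ratio <- data.frame(
  both[keys],
  FED    = "Fed/Fasted(%)",
  median = both$median.fed / both$median.fast * 100,
  gmean  = both$gmean.fed  / both$gmean.fast  * 100,
  CV     = NA,
  stringsAsFactors = FALSE
)

# Append, then sort so each ratio row follows its group's fast and Fed rows
out <- rbind(dfin, ratio)
out <- out[order(out$DVID, out$FORM,
                 match(out$FED, c("fast", "Fed", "Fed/Fasted(%)"))), ]
rownames(out) <- NULL
out
```

Sorting on match() against an explicit vector of labels keeps the row order independent of the locale's collation rules.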
Answer 2:
A simple approach is to calculate all the aggregations, then row-bind them back to the original data frame. In dplyr,
library(dplyr)

df_in <- data.frame(DVID = c(1L, 1L, 1L, 1L),
                    FORM = c("A", "A", "B", "B"),
                    FED = c("fast", "Fed", "fast", "Fed"),
                    median = c(15L, 30L, 40L, 50L),
                    gmean = c(20L, 40L, 60L, 100L),
                    CV = c(10L, 15L, 20L, 25L),
                    stringsAsFactors = FALSE)

df_out <- df_in %>%
  group_by(DVID, FORM) %>%
  # funs() is deprecated since dplyr 0.8.0; across() (dplyr >= 1.0.0) replaces it
  summarise(across(c(median, gmean),
                   ~ .x[FED == "Fed"] / .x[FED == "fast"] * 100),
            .groups = "drop") %>%
  mutate(FED = "Fed/Fasted(%)",
         CV = NA) %>%
  bind_rows(df_in) %>%
  select(DVID, FORM, FED, median, gmean, CV) %>%   # restore the original column order
  # order FED explicitly: sorting the strings themselves depends on the locale
  arrange(DVID, FORM, match(FED, c("fast", "Fed", "Fed/Fasted(%)")))

df_out
#> # A tibble: 6 x 6
#>    DVID FORM  FED           median gmean    CV
#>   <int> <chr> <chr>          <dbl> <dbl> <int>
#> 1     1 A     fast            15.0  20.0    10
#> 2     1 A     Fed             30.0  40.0    15
#> 3     1 A     Fed/Fasted(%)  200   200      NA
#> 4     1 B     fast            40.0  60.0    20
#> 5     1 B     Fed             50.0 100      25
#> 6     1 B     Fed/Fasted(%)  125   167      NA
Answer 3:
One way using base R is to split the data frame by DVID and FORM and, for each group, calculate the Fed/fast ratio of median and gmean as a percentage. Take the first value in the group for DVID and FORM, and assign NA to CV.
do.call(rbind,
  lapply(split(dfin, list(dfin$DVID, dfin$FORM)), function(x)
    rbind(x, data.frame(DVID = x[[1]][1], FORM = x[[2]][1], FED = "Fed/Fasted(%)",
      median = (x[["median"]][x[["FED"]] == "Fed"] / x[["median"]][x[["FED"]] == "fast"]) * 100,
      gmean = (x[["gmean"]][x[["FED"]] == "Fed"] / x[["gmean"]][x[["FED"]] == "fast"]) * 100,
      CV = NA))))
# DVID FORM FED           median    gmean CV
#    1    A fast              15  20.0000 10
#    1    A Fed               30  40.0000 15
#    1    A Fed/Fasted(%)    200 200.0000 NA
#    1    B fast              40  60.0000 20
#    1    B Fed               50 100.0000 25
#    1    B Fed/Fasted(%)    125 166.6667 NA
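The split-and-rbind code above expects every DVID/FORM group to contain both a fast and a Fed row. If one is missing, the ratio has length zero and the data.frame() call would fail. A minimal sketch of a guarded version follows, assuming the same column names as the question. The helper name add_ratio_row is hypothetical, and the extra group C (a lone fast row with made-up values) is invented purely to exercise the guard.

```r
# Hypothetical helper: returns the group's rows plus one ratio row, or the
# group unchanged when it does not hold exactly one fast and one Fed row
add_ratio_row <- function(x) {
  fed  <- x[x$FED == "Fed", ]
  fast <- x[x$FED == "fast", ]
  if (nrow(fed) != 1 || nrow(fast) != 1) return(x)
  rbind(x, data.frame(DVID = x$DVID[1], FORM = x$FORM[1],
                      FED = "Fed/Fasted(%)",
                      median = fed$median / fast$median * 100,
                      gmean  = fed$gmean  / fast$gmean  * 100,
                      CV = NA, stringsAsFactors = FALSE))
}

# The question's data plus an invented group C that has no Fed row
dfin <- data.frame(
  DVID   = c(1, 1, 1, 1, 1),
  FORM   = c("A", "A", "B", "B", "C"),
  FED    = c("fast", "Fed", "fast", "Fed", "fast"),
  median = c(15, 30, 40, 50, 70),
  gmean  = c(20, 40, 60, 100, 80),
  CV     = c(10, 15, 20, 25, 30),
  stringsAsFactors = FALSE
)

# Split by group, add ratio rows where possible, and stack the pieces
groups <- split(dfin, list(dfin$DVID, dfin$FORM), drop = TRUE)
out <- do.call(rbind, lapply(groups, add_ratio_row))
rownames(out) <- NULL
out
```

Returning the group unchanged is one possible design choice. Emitting a ratio row filled with NA, or stopping with an error, may suit other data better.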
Source: https://stackoverflow.com/questions/48555851/adding-a-row-for-the-ratio-of-two-variables