Question
df
var1 var2
1 a 1
2 b 2
3 a 3
4 c 6
5 d 88
6 b 0
df2 <- data.frame(var1=c("k","b","a","k","k","b"),var2=c(14,78,5,6,88,0))
list <- list(df,df2)
for(i in list){
if(any(i[ ,1] == i[ ,1})){
cumsum(.)
}
}
I have a list of data.frames and want to iterate over them. Whenever the same letter appears more than once in the first column, the corresponding values in the second column should be summed, and that summed row should replace the duplicates in my data.frame.
I completely messed up the if statement. Can somebody help me, please?
EDIT: the result should look like
df
var1 var2
1 a 4
2 b 2
3 c 6
4 d 88
and for df2
var1 var2
1 k 108
2 b 78
3 a 5
In my real problem, the list consists of 10 data.frames, not just two.
Answer 1:
In base R:
sapply(split(df$var2,df$var1), sum)
a b c d
4 2 6 88
Or, to do it on each element of a list of data frames:
lapply(list, function(x) sapply(split(x$var2,x$var1), sum))
[[1]]
a b c d
4 2 6 88
[[2]]
a b k
5 78 108
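The call above returns named vectors sorted by letter, whereas the question asks for data frames in order of first appearance. A minimal base R sketch of that variant follows; the names `dfs` and `sum_by_first_col` are made up for illustration, and the columns are assumed to be called `var1` and `var2` as in the question.

```r
# Example data from the question
df  <- data.frame(var1 = c("a", "b", "a", "c", "d", "b"),
                  var2 = c(1, 2, 3, 6, 88, 0))
df2 <- data.frame(var1 = c("k", "b", "a", "k", "k", "b"),
                  var2 = c(14, 78, 5, 6, 88, 0))
dfs <- list(df, df2)  # hypothetical name, avoids masking base::list

# Hypothetical helper: sum var2 per var1, keeping first-appearance order
sum_by_first_col <- function(x) {
  keys <- as.character(x$var1)
  groups <- factor(keys, levels = unique(keys))  # levels in order of appearance
  sums <- tapply(x$var2, groups, sum)            # one sum per level
  data.frame(var1 = names(sums), var2 = as.vector(sums))
}

result <- lapply(dfs, sum_by_first_col)
result[[2]]  # rows: k 108, b 78, a 5
```

Because `lapply` works on a list of any length, the same call should cover the 10 data frames mentioned in the question.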
Answer 2:
Something like this?
library(dplyr)  # provides %>% and mutate()
df <- tibble::tribble(
~var1, ~var2,
"a", 1,
"b", 2,
"a", 3,
"c", 6,
"d", 88,
"b", 0)
df2 <- data.frame(var1=c("k","b","a","k","k","b"),var2=c(14,78,5,6,88,0))
df3 <- cbind(df,df2)
colnames(df3) <- c("df1var1","df1var2","df2var1","df2var2")
df3 %>% mutate(sum = ifelse(df1var1 == df2var1, df1var2 + df2var2, NA))
df1var1 df1var2 df2var1 df2var2 sum
1 a 1 k 14 NA
2 b 2 b 78 80
3 a 3 a 5 8
4 c 6 k 6 NA
5 d 88 k 88 NA
6 b 0 b 0 0
Answer 3:
It was a bit difficult to understand, but now that you have shown the expected result, I think this is what you are looking for: group the data frame by var1 and then summarise.
library(tidyverse)
df2 <- data.frame(var1=c("k","b","a","k","k","b"),var2=c(14,78,5,6,88,0))
df <- tibble::tribble(
~var1, ~var2,
"a", 1,
"b", 2,
"a", 3,
"c", 6,
"d", 88,
"b", 0
)
df %>%
group_by(var1) %>%
summarise(sum = sum(var2))
#> # A tibble: 4 x 2
#> var1 sum
#> <chr> <dbl>
#> 1 a 4
#> 2 b 2
#> 3 c 6
#> 4 d 88
df2 %>%
group_by(var1) %>%
summarise(sum = sum(var2))
#> # A tibble: 3 x 2
#> var1 sum
#> <chr> <dbl>
#> 1 a 5
#> 2 b 78
#> 3 k 108
Created on 2020-06-10 by the reprex package (v0.3.0)
And in base R you could do:
aggregate(df$var2, by=list(df$var1), FUN=sum)[2]
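Note that the trailing `[2]` keeps only the column of sums and drops the letters. If both columns are wanted, one option is the formula interface of `aggregate`, applied to every data frame with `lapply`. This is only a sketch: `dfs` and `sum_groups` are illustrative names, and the output is sorted by `var1` rather than kept in order of first appearance.

```r
# Example data from the question
df  <- data.frame(var1 = c("a", "b", "a", "c", "d", "b"),
                  var2 = c(1, 2, 3, 6, 88, 0))
df2 <- data.frame(var1 = c("k", "b", "a", "k", "k", "b"),
                  var2 = c(14, 78, 5, 6, 88, 0))
dfs <- list(df, df2)  # hypothetical name for the list of data frames

# Hypothetical helper: sum var2 within each var1 group, keeping both columns
sum_groups <- function(x) aggregate(var2 ~ var1, data = x, FUN = sum)

sums <- lapply(dfs, sum_groups)
sums[[1]]  # rows: a 4, b 2, c 6, d 88
sums[[2]]  # rows: a 5, b 78, k 108
```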
It took me a while to understand that you want to do this starting from a list of data frames. In that case you could define your function the tidyverse way and apply it with purrr::map:
dflist <- list(df, df2)
df_sum <- function(df){
df %>%
as.data.frame() %>%
group_by(var1) %>%
summarise(sum = sum(var2))
}
purrr::map(dflist, df_sum)
Source: https://stackoverflow.com/questions/62305258/sum-of-rows-when-condition-is-met-data-frame-in-r