summarization

Return most frequent string value for each group [duplicate]

旧巷老猫 submitted on 2019-11-30 01:49:39
This question already has answers here: How to select the row with the maximum value in each group (10 answers); How to select the rows with maximum values in each group with dplyr? [duplicate] (6 answers)

    a <- c(rep(1:2,3))
    b <- c("A","A","B","B","B","B")
    df <- data.frame(a,b)

    > str(b)
    chr [1:6] "A" "A" "B" "B" "B" "B"

      a b
    1 1 A
    2 2 A
    3 1 B
    4 2 B
    5 1 B
    6 2 B

I want to group by variable a and return the most frequent value of b. My desired result would look like

      a b
    1 1 B
    2 2 B

In dplyr it would be something like

    df %>% group_by(a) %>% summarize (b = most.frequent(b))

I mentioned dplyr only to
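One way to get the result described above, as a minimal base R sketch. The helper name `most_frequent` is hypothetical (it stands in for the `most.frequent` placeholder in the question), and ties are resolved by taking whichever value `table()` lists first, which may not be what every use case wants.

```r
# Data from the question.
a <- c(rep(1:2, 3))
b <- c("A", "A", "B", "B", "B", "B")
df <- data.frame(a, b)

# Hypothetical helper: the most common value of a vector, as a string.
# Ties go to whichever value table() lists first (sorted order).
most_frequent <- function(v) names(which.max(table(v)))

# Apply the helper to b within each level of a.
result <- aggregate(b ~ a, data = df, FUN = most_frequent)
print(result)
```

The same helper should also work inside a dplyr pipeline, for example `df %>% group_by(a) %>% summarize(b = most_frequent(b))`, assuming dplyr is loaded.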

Summarise over all columns

大憨熊 submitted on 2019-11-29 06:24:18
I have data of the following format:

    gen = function () sample.int(10, replace = TRUE)
    x = data.frame(A = gen(), C = gen(), G = gen(), T = gen())

I would now like to attach, to each row, the total sum of all the elements in the row (my actual function is more complex but sum illustrates the problem). Without dplyr, I’d write

    cbind(x, Sum = apply(x, 1, sum))

Resulting in:

      A C  G T Sum
    1 3 1  6 9  19
    2 3 4  3 3  13
    3 3 1 10 5  19
    4 7 2  1 6  16
    …

But it seems surprisingly hard to do this with dplyr. I’ve tried

    x %>% rowwise() %>% mutate(Sum = sum(A : T))

But the result is not the sum of the columns of
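A likely reason the attempt above misbehaves: inside `rowwise()`, `A : T` is evaluated as the integer sequence from that row's `A` value to its `T` value, so `sum()` adds up that sequence rather than the four columns. The sketch below assumes dplyr 1.0.0 or later, where `c_across()` is available for selecting a range of columns within a row; the `set.seed()` call is only there to make the example reproducible and is not part of the question.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed for reproducibility only
gen <- function() sample.int(10, replace = TRUE)
x <- data.frame(A = gen(), C = gen(), G = gen(), T = gen())

# c_across(A:T) collects the values of columns A through T for the current row,
# so sum() sees the row's elements instead of the sequence A:T.
by_row <- x %>%
  rowwise() %>%
  mutate(Sum = sum(c_across(A:T))) %>%
  ungroup()

print(by_row)
```

Because `sum` here only stands in for a more complex function, the row-wise form is shown; for a plain sum, `rowSums(x)` is the simpler and faster choice.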

How to count occurrences combinations in data.table in R

痞子三分冷 submitted on 2019-11-29 02:36:21
I have two data.tables. I would like to count the number of rows matching a combination of a table in another table. I have checked the data.table documentation but I have not found my answer. I am using data.table 1.9.2.

    DT1 <- data.table(a=c(3,2), b=c(8,3))
    DT2 <- data.table(w=c(3,3,3,2,3), x=c(8,8,8,3,7), z=c(2,6,7,2,2))
    DT1
    #    a b
    # 1: 3 8
    # 2: 2 3
    DT2
    #    w x z
    # 1: 3 8 2
    # 2: 3 8 6
    # 3: 3 8 7
    # 4: 2 3 2
    # 5: 3 7 2

Now I would like to count the number of (3, 8) pairs and (2, 3) pairs in DT2.

    setkey(DT2, w, x)
    nrow(DT2[J(3, 8), nomatch=0])
    # [1] 3 ## OK !
    nrow(DT2[J(2, 3), nomatch=0])
    # [1] 1
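The per-pair lookups above can be folded into a single join that counts matches for every row of DT1 at once. This is a sketch under the assumption of a newer data.table than the 1.9.2 mentioned in the question: `by = .EACHI` arrived in 1.9.4 and the `on=` argument in 1.9.6. The variable name `counts` is illustrative.

```r
library(data.table)

DT1 <- data.table(a = c(3, 2), b = c(8, 3))
DT2 <- data.table(w = c(3, 3, 3, 2, 3), x = c(8, 8, 8, 3, 7), z = c(2, 6, 7, 2, 2))

# Join DT2 to each row of DT1 (w matched to a, x matched to b) and
# count the matching DT2 rows separately for each DT1 row.
counts <- DT2[DT1, .N, by = .EACHI, on = c(w = "a", x = "b")]
print(counts)
```

Using `on=` avoids having to call `setkey()` on DT2 first, so the original row order of DT2 is left untouched.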

Elasticsearch: Possible to process aggregation results?

泪湿孤枕 submitted on 2019-11-28 08:50:29
Question: I calculate the duration of my service processes using the sum aggregation. Each step of the executed process is saved in Elasticsearch under a calling ID. This is what I monitor:

    Duration of Request-Processing for ID #123 (calling service #1)
    Duration of Server-Response for ID #123 (calling service #1)
    **Complete Duration for ID #123**
    Duration of Request-Processing for ID #124 (calling service #1)
    Duration of Server-Response for ID #124 (calling service #1)
    **Complete duration for ID

How to count occurrences combinations in data.table in R

主宰稳场 submitted on 2019-11-27 16:55:29
Question: I have two data.tables. I would like to count the number of rows matching a combination of a table in another table. I have checked the data.table documentation but I have not found my answer. I am using data.table 1.9.2.

    DT1 <- data.table(a=c(3,2), b=c(8,3))
    DT2 <- data.table(w=c(3,3,3,2,3), x=c(8,8,8,3,7), z=c(2,6,7,2,2))
    DT1
    #    a b
    # 1: 3 8
    # 2: 2 3
    DT2
    #    w x z
    # 1: 3 8 2
    # 2: 3 8 6
    # 3: 3 8 7
    # 4: 2 3 2
    # 5: 3 7 2

Now I would like to count the number of (3, 8) pairs and (2, 3) pairs in DT2