group-summaries

Get group mean with multiple grouping variables and excluding own group value

不问归期 submitted on 2020-03-23 02:05:28
Question: I'm looking for a faster way to calculate a group mean with multiple grouping variables while excluding the own group's value. A thought experiment would be finding the average value (e.g. price) for a county from the counties in the same state in the same year, excluding the county's own value. Here's a toy data set. df <- data_frame( state = rep(c("AL", "CA"), each = 6), county = rep(letters[1:6], each = 2), year = rep(c(2011:2012), 6), value = sample.int(100, 12) ) df %>% group_by(state, county, year) %
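
The excerpt above is cut off before any answer, so the following is only a minimal sketch of one common approach, not the accepted solution: compute the sum and count per state and year, then subtract each row's own value. It uses base R's `ave()` and assumes one row per county per year, as in the toy data; the column name `mean_others` and the fixed seed are illustrative assumptions.

```r
# Toy data modelled on the question (data.frame instead of data_frame,
# fixed seed so the example is reproducible).
set.seed(1)
df <- data.frame(
  state  = rep(c("AL", "CA"), each = 6),
  county = rep(letters[1:6], each = 2),
  year   = rep(2011:2012, 6),
  value  = sample.int(100, 12)
)

# Sum and row count over all counties in the same state and year.
grp_sum <- ave(df$value, df$state, df$year, FUN = sum)
grp_n   <- ave(df$value, df$state, df$year, FUN = length)

# Leave-one-out mean: remove the row's own value from the group total.
# Assumes one row per county per (state, year); a group with a single
# row would yield NaN here.
df$mean_others <- (grp_sum - df$value) / (grp_n - 1)

print(df)
```

With dplyr, the same arithmetic can be written inside `mutate()` after `group_by(state, year)`, using `sum(value)` and `n()`.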

Summarize data at different aggregate levels - R and tidyverse

烈酒焚心 submitted on 2020-01-24 03:21:06
Question: I'm creating a bunch of basic status reports, and one of the things I'm finding tedious is adding a total row to all my tables. I'm currently using the Tidyverse approach, and this is an example of my current code. What I'm looking for is an option to have a few different levels included by default. #load into RStudio viewer (not required) iris = iris #summary at the group level summary_grouped = iris %>% group_by(Species) %>% summarize(mean_s_length = mean(Sepal.Length), max_s_width = max(Sepal

Summarize with dplyr “other then” groups

痞子三分冷 submitted on 2020-01-04 09:06:57
Question: I need to summarize in a grouped data_frame (note: a solution with dplyr is very much appreciated but isn't mandatory) both something on each group (simple) and the same something on the "other" groups. Minimal example: if(!require(pacman)) install.packages(pacman) pacman::p_load(dplyr) df <- data_frame( group = c('a', 'a', 'b', 'b', 'c', 'c'), value = c(1, 2, 3, 4, 5, 6) ) res <- df %>% group_by(group) %>% summarize( median = median(value) # median_other = ... ??? ... # I need the median of all
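
The question text is truncated, so this is a hedged sketch rather than the thread's answer. Assuming "other" means every row whose group differs from the current one, the two medians can be computed in base R by subsetting on `==` and `!=`. The column name `median_other` comes from the commented-out line in the question; everything else is illustrative.

```r
# Toy data from the question, as a plain data.frame.
df <- data.frame(
  group = c("a", "a", "b", "b", "c", "c"),
  value = c(1, 2, 3, 4, 5, 6)
)

groups <- unique(df$group)

res <- data.frame(
  group = groups,
  # Median of the rows inside each group.
  median = vapply(groups, function(g) median(df$value[df$group == g]),
                  numeric(1), USE.NAMES = FALSE),
  # Median of all rows that belong to any other group.
  median_other = vapply(groups, function(g) median(df$value[df$group != g]),
                        numeric(1), USE.NAMES = FALSE)
)

print(res)
#   group median median_other
# 1     a    1.5          4.5
# 2     b    3.5          3.5
# 3     c    5.5          2.5
```

This rescans the full vector once per group, which is fine for a handful of groups but may be slow when there are many.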

Aggregate by multiple columns and reshape from long to wide

岁酱吖の submitted on 2019-12-29 09:36:23
Question: There are some questions similar to this topic on SO, but not exactly like my use case. I have a dataset where the columns are laid out as shown below.

Id  Description  Value
10  Cat          19
10  Cat          20
10  Cat          5
10  Cat          13
11  Cat          17
11  Cat          23
11  Cat          7
11  Cat          14
10  Dog          19
10  Dog          20
10  Dog          5
10  Dog          13
11  Dog          17
11  Dog          23
11  Dog          7
11  Dog          14

What I am trying to do is capture the mean of the Value column by Id, Description. The final dataset would look like this.

Id  Cat    Dog
10  14.25  28.5
11  15.25  15.25

I can do
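
The excerpt stops before the asker's own attempt, so the following is only a sketch of one way to get a mean per Id and Description in wide layout, using base R's `tapply()` with two grouping factors. The data frame is retyped from the table in the question. Note that with the values as listed, Cat and Dog have identical rows for each Id, so the computed Dog mean for Id 10 is 14.25 rather than the 28.5 shown in the question's expected output.

```r
# Data retyped from the question's long-format table.
df <- data.frame(
  Id          = rep(c(10, 11, 10, 11), each = 4),
  Description = rep(c("Cat", "Dog"), each = 8),
  Value       = rep(c(19, 20, 5, 13, 17, 23, 7, 14), times = 2)
)

# tapply with two grouping factors returns a matrix:
# one row per Id, one column per Description, cells hold the mean.
wide <- tapply(df$Value, list(Id = df$Id, Description = df$Description), mean)

print(wide)

# Convert to a data frame with Id as a regular column, if needed.
wide_df <- data.frame(Id = as.numeric(rownames(wide)), wide, row.names = NULL)
print(wide_df)
```

With tidyverse packages, the same result is usually obtained by summarising per `Id` and `Description` and then pivoting the `Description` values into columns.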

data.table: Using with=False and transforming function/summary function?

筅森魡賤 submitted on 2019-12-23 10:17:04
Question: I want to summarise several variables in data.table, with output in wide format, possibly as a list per variable. Since several other approaches did not work, I tried to do an outer lapply, giving the names of the variables as character vectors. I wanted to pass these in using with=FALSE. carsx=as.data.table(cars) lapply( list(speed="speed",dist= "dist"), #error object 'ansvals' not found function(x) carsx[,list(mean(x), min(x), max(x) ), with=FALSE ] ) Since this does not work, I tried

How to create summaries of subgroups based on factors in R

陌路散爱 submitted on 2019-12-11 08:35:38
Question: I want to calculate the mean for each numeric variable in the following example. These need to be grouped by each factor associated with "id" and by each factor associated with "status". set.seed(10) dfex <- data.frame(id=c("2","1","1","1","3","2","3"),status=c("hit","miss","miss","hit","miss","miss","miss"),var3=rnorm(7),var4=rnorm(7),var5=rnorm(7),var6=rnorm(7)) For the means of "id" groups, the first row of output would be labeled "mean-id-1". Rows labeled "mean-id-2" and "mean-id-3" would

Parallel wilcox.test using group_by and summarise

扶醉桌前 submitted on 2019-12-09 12:41:43
Question: There must be an R-ly way to call wilcox.test over multiple observations in parallel using group_by. I've spent a good deal of time reading up on this but still can't figure out a call to wilcox.test that does the job. Example data and code below, using magrittr pipes and summarize(). library(dplyr) library(magrittr) # create a data frame where x is the dependent variable, id1 is a category variable (here with five levels), and id2 is a binary category variable used for the two-sample

How to add a weighted average summary to a DevExpress XtraGrid?

做~自己de王妃 submitted on 2019-12-06 10:53:38
Question: The DevExpress Grid (XtraGrid) allows grids and their groups to have summary calculations. The available options are Count, Max, Min, Avg, Sum, None and Custom. Does anyone have sample code that shows how to calculate a weighted average column, based on the weightings provided as values in another column? Answer 1: I ended up working this out, and will post my solution here in case others find it useful. If a weighted average consists of both a value and a weight per row, then column that