aggregate | 易学教程

Calculating mean for every second value in a dataframe

Question: I would like to aggregate every two cell values by their mean and continue the same process down the column of the dataframe. To be more precise, see the following dataframe extract:

         X            Y     Z
    1  FRI 200101010000 -6.72
    2  FRI 200101010030 -6.30
    3  FRI 200101010100 -6.26
    4  FRI 200101010130 -5.82
    5  FRI 200101010200 -5.64
    6  FRI 200101010230 -5.29
    7  FRI 200101010300 -5.82
    8  FRI 200101010330 -5.83
    9  FRI 200101010400 -5.83
    10 FRI 200101010430 -6.04
    11 FRI 200101010500 -5.80
    12 FRI 200101010530 -6.09

I would
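The excerpt is cut off before the desired output is shown, so the following is only a minimal sketch in base R. It assumes the goal is the mean of `Z` over consecutive, non-overlapping pairs of rows (rows 1-2, 3-4, and so on). The names `z`, `pair`, and `pair_means` are illustrative and not from the question.

```r
# Z values copied from the dataframe extract in the question
z <- c(-6.72, -6.30, -6.26, -5.82, -5.64, -5.29,
       -5.82, -5.83, -5.83, -6.04, -5.80, -6.09)

# Pair index: rows 1-2 get 1, rows 3-4 get 2, and so on
pair <- (seq_along(z) + 1) %/% 2

# Mean of Z within each pair of rows
pair_means <- tapply(z, pair, mean)
print(pair_means)
```

If the column has an odd number of rows, the last group contains a single value and its mean is that value itself.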

how to integrate properties defined on multiple rows using a data.frame or data.table long format approach

Question: I recently started using the data.table package in R. I find it super-convenient for transforming and aggregating data. One thing I am missing is how to transform data that are defined across multiple rows. Do I need to reshape the data.frame/table into a wide format first? Say you have the following data table:

    dt=data.table(group=c("a","a","a","b","b","b"), subg=c("f1","f2","f3","f1","f2","f3"), counts=c(3,4,5,8,9,10))

and for each group you want to calculate the relative frequency
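The question is truncated, so this sketch assumes that "relative frequency" means each row's `counts` divided by the total `counts` of its `group`. It uses base R's `ave()` on a plain data.frame so that it runs without extra packages and without reshaping to wide format. The names `df` and `rel_freq` are illustrative.

```r
# Same data as in the question, held in a plain data.frame
df <- data.frame(group  = c("a", "a", "a", "b", "b", "b"),
                 subg   = c("f1", "f2", "f3", "f1", "f2", "f3"),
                 counts = c(3, 4, 5, 8, 9, 10))

# ave() applies FUN within each group and returns a vector aligned with the rows
df$rel_freq <- ave(df$counts, df$group, FUN = function(x) x / sum(x))
print(df)
```

With data.table loaded, the same grouped calculation is commonly written as `dt[, rel_freq := counts / sum(counts), by = group]`, which also keeps the data in long format.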

Multiple Aggregation in R [duplicate]

Question: This question already has answers here: Apply several summary functions on several variables by group in one call (6 answers). Closed 5 years ago.

I have three parameters (3 columns):

    x <- c(1, 1, 2, 2, 2, 2, 1, 1, 2)
    y <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
    z <- c(10, NA, 16, 25, 41, NA, 17, 53, 26)

For each value of y, I need to calculate the mean of column z where x == 1. How can I do it using the aggregate function in R?

    data <- data.frame(x=c(1, 1, 2, 2, 2, 2, 1, 1, 2), y=c(1, 1, 1, 2, 2, 2, 3, 3, 3), z
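One possible approach, sketched with base R's `aggregate()`: filter the rows first, then aggregate by `y`. The data frame name `dat` and the result name `res` are illustrative. Dropping rows where `z` is `NA` is an assumption about what the asker wants; it happens here because the formula interface of `aggregate()` uses `na.action = na.omit` by default.

```r
# Vectors copied from the question
dat <- data.frame(x = c(1, 1, 2, 2, 2, 2, 1, 1, 2),
                  y = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                  z = c(10, NA, 16, 25, 41, NA, 17, 53, 26))

# Keep only rows with x == 1, then take the mean of z for each y.
# Rows with NA in z are dropped by the default na.action (na.omit).
res <- aggregate(z ~ y, data = subset(dat, x == 1), FUN = mean)
print(res)
```

Only the `y` values that occur together with `x == 1` appear in the result, so `y = 2` is absent for this data.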

Do all groups have equal total power for given subgroup?

Question: I have a PostgreSQL table like this:

    CREATE TABLE foo (man_id, subgroup, power, grp) AS VALUES
      ( 1, 'Sub_A',  1, 'Group_A' ),
      ( 2, 'Sub_B', -1, 'Group_A' ),
      ( 3, 'Sub_A', -1, 'Group_B' ),
      ( 4, 'Sub_B',  1, 'Group_B' ),
      ( 5, 'Sub_A', -1, 'Group_A' ),
      ( 6, 'Sub_B',  1, 'Group_A' ),
      ( 7, 'Sub_A', -1, 'Group_B' ),
      ( 8, 'Sub_B',  1, 'Group_B' );

The power calculation works like this:

    Total Power of Subgroup Sub_A in the grp Group_A is (1 + (-1) ) = 0
    Total Power of Subgroup Sub_B in the grp Group_A is

Skip whole row if aggregated value is null

Question: This is my approach:

    select distinct (invoice_no) as no,
           sum(total),
           sum(case when department_id=2 then total end) as a2,
           sum(case when department_id=3 then total end) as a3,
           sum(case when department_id=4 then total end) as a4,
           sum(case when department_id=5 then total end) as a5,
           sum(case when department_id=6 then total end) as a6
    from article_sale
    where invoice_date = '2018-10-01'
    group by no
    order by no ASC

The query returns output like this:

    no     sum  a2    a3  a4   a5    a6
    68630  690  NULL  75  404  NULL

Aggregate function and other columns

Question: Is it possible for a SQL query to return some normal columns and some aggregate ones? Like:

    Col_A | Col_B | SUM
    ------+-------+------
        5 |     6 |    7

Answer 1: You should use the GROUP BY statement. The GROUP BY statement is used in conjunction with aggregate functions to group the result set by one or more columns. For example:

    SELECT column_name, aggregate_function(column_name)
    FROM table_name
    WHERE column_name operator value
    GROUP BY column_name

You can see a complete example here.

Answer 2: Yes of

How can you find the rows with equal columns?

Question: If I have a table with two important columns,

    CREATE TABLE foo (id INT, a INT, b INT, KEY a, KEY b);

how can I find all the rows whose a and b values both match those of another row? For example, in this data set

    id | a | b
    ----------
    1  | 1 | 2
    2  | 5 | 42
    3  | 1 | 42
    4  | 1 | 2
    5  | 1 | 2
    6  | 1 | 42

I want to get back all rows except for id=2, since it is unique in (a,b). Basically, I want to find all offending rows that would stop an

    ALTER TABLE foo ADD UNIQUE (a, b);

Something better than an n^2

existing function to combine standard deviations in R?

Question: I have 4 populations with known means and standard deviations. I would like to know the grand mean and grand sd. The grand mean is obviously simple to calculate, but R has a handy utility function, weighted.mean(). Does a similar function exist for combining standard deviations? The calculation is not complicated, but an existing function would make my code cleaner and easier to understand. Bonus question: what tools do you use to search for functions like this? I know it must be out there,
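For reference, the calculation the asker calls "not complicated" can be sketched as a small helper. This is not an existing R function: `combine_sd` is a hypothetical name, and the sketch assumes that the group sizes are known and that each standard deviation is a sample SD (denominator n - 1, as returned by `sd()`). The example groups are made-up values used only to check the formula against `sd()` on the pooled data.

```r
# Hypothetical helper: combine per-group sample SDs into one grand sample SD.
# n = group sizes, m = group means, s = group sample SDs (n - 1 denominator).
combine_sd <- function(n, m, s) {
  grand_mean <- weighted.mean(m, n)
  # Within-group sum of squares plus between-group sum of squares
  ss <- sum((n - 1) * s^2 + n * (m - grand_mean)^2)
  sqrt(ss / (sum(n) - 1))
}

# Made-up example data, used only to verify the formula
groups <- list(c(1, 2, 3, 4), c(10, 12), c(5, 5, 6), c(7, 9, 11, 13, 15))
n <- sapply(groups, length)
m <- sapply(groups, mean)
s <- sapply(groups, sd)

combined <- combine_sd(n, m, s)
direct   <- sd(unlist(groups))  # SD computed on the pooled raw values
print(c(combined = combined, direct = direct))
```

The two values agree because the total sum of squares decomposes exactly into the within-group and between-group parts.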

Multiple aggregation in R with 4 parameters

Question: I have four vectors (columns):

    x y z t
    1 1 1 10
    1 1 1 15
    2 4 1 14
    2 3 1 15
    2 2 1 17
    2 1 2 19
    1 4 2 18
    1 4 2 NA
    2 2 2 45
    3 3 2 NA
    3 1 3 59
    4 3 3 23
    1 4 3 45
    4 4 4 74
    2 1 4 86

How can I calculate the mean and median of vector t for each value of vector y (from 1 to 4) where x=1 and z=1, using the aggregate function in R? How to do it with 3 parameters has been discussed (Multiple Aggregation in R), but it's a little unclear how to do it with 4 parameters. Thank you.

Answer 1: You could try something like this in

How to convert a daily times series into an averaged weekly?

Question: I want to (arithmetically) average daily data and thus convert my daily time series into a weekly one. Following this thread: How does one compute the mean of weekly data by column using R?, I am using the xts library.

    # Averages daily time series into weekly time series
    # where my source is a zoo object
    source.w <- apply.weekly(source, colMeans)

The problem I am having is that it averages the series using Tuesday through the following Monday. I am searching for options to average my