aggregate-functions

Get the SUM for each person by PersonID

Submitted by 大兔子大兔子 on 2019-12-20 05:35:14
Question: I have the following columns in a table: SCORE_ID, SCORE_PERSON_ID, SCORE_VOTE. SCORE_PERSON_ID varies from row to row. I need to sum SCORE_VOTE per SCORE_PERSON_ID. Can you suggest a good way to do that?

Answer 1: You need a GROUP BY and an aggregate function such as COUNT or SUM:

    SELECT SCORE_PERSON_ID, SUM(SCORE_VOTE) AS score
    FROM scores
    GROUP BY SCORE_PERSON_ID;

Answer 2:

    SELECT SUM(SCORE_VOTE)
    FROM SCORES
    GROUP BY SCORE_PERSON_ID;

Answer 3: How about:

    select sum(SCORE_VOTE) as score
    from scores
    group by SCORE_PERSON_ID;  -- the source is truncated at "group by SCORE"
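All three answers use the same GROUP BY pattern. A minimal, self-contained sketch with hypothetical sample data (the table name scores and all values are assumptions for illustration):

    -- Hypothetical data: sum votes per person.
    CREATE TABLE scores (SCORE_ID INT, SCORE_PERSON_ID INT, SCORE_VOTE INT);
    INSERT INTO scores VALUES (1, 10, 2), (2, 10, 3), (3, 20, 1);

    SELECT SCORE_PERSON_ID, SUM(SCORE_VOTE) AS total_votes
    FROM scores
    GROUP BY SCORE_PERSON_ID;
    -- Expected result: person 10 -> 5, person 20 -> 1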

array_agg in Postgres selectively quotes

Submitted by ℡╲_俬逩灬. on 2019-12-20 04:23:47
Question: I have a complex database with keys and values stored in different tables. It is useful for me to aggregate them when pulling the values out for the application:

    SELECT array_agg(key_name), array_agg(vals)
    FROM (
      SELECT id, key_name, array_agg(value)::VARCHAR(255) AS vals
      FROM factor_key_values
      WHERE id = 20
      GROUP BY key_name, id
    ) f;

In my case, this particular query gives the following invalid JSON:

    -[ RECORD 1 ]-----------------------------------------------------------------
    array_agg …  (the psql output is truncated in the source)
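The "selective" quoting is array_agg's text representation: elements are double-quoted only when they contain spaces, commas, or other special characters, so the result is not valid JSON. A minimal sketch of the usual fix, assuming a simplified factor_key_values table: build the output with the json/jsonb aggregates instead:

    -- Assumed simplified schema for illustration.
    CREATE TABLE factor_key_values (id INT, key_name TEXT, value TEXT);
    INSERT INTO factor_key_values VALUES
      (20, 'color', 'dark red'),  -- contains a space: array_agg would quote it
      (20, 'size',  'XL');        -- no special chars: array_agg would not

    -- jsonb aggregates always produce valid JSON.
    SELECT jsonb_object_agg(key_name, vals) AS kv
    FROM (
      SELECT key_name, jsonb_agg(value) AS vals
      FROM factor_key_values
      WHERE id = 20
      GROUP BY key_name
    ) f;
    -- {"size": ["XL"], "color": ["dark red"]}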

Grouping based on sequence of rows

Submitted by 断了今生、忘了曾经 on 2019-12-20 03:27:27
Question: I have a table of orders with a column denoting whether each order is a buy or a sell, with the rows typically ordered by timestamp. What I'd like to do is operate on groups of consecutive buys plus their sell, e.g. B B S B S B B S -> (B B S) (B S) (B B S). Example:

    order_action | timestamp
    -------------+---------------------
    buy          | 2013-10-03 13:03:02
    buy          | 2013-10-08 13:03:02
    sell         | 2013-10-10 15:58:02
    buy          | 2013-11-01 09:30:02
    buy          | 2013-11-01 14:03:02
    sell         | 2013-11-07 10:34:02
    buy          | 2013-12-03 15…  (truncated in the source)
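A common gaps-and-islands approach, as a sketch: it assumes Postgres, a table named orders, and the columns shown above ("timestamp" is quoted because it is a reserved word). Each island ends at a sell, so the running count of earlier sells labels the groups:

    SELECT order_action,
           "timestamp",
           -- Count the sells strictly before this row; rows between two sells
           -- (plus the closing sell itself) share the same count.
           COALESCE(SUM(CASE WHEN order_action = 'sell' THEN 1 ELSE 0 END)
                    OVER (ORDER BY "timestamp"
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
                    0) AS grp
    FROM orders
    ORDER BY "timestamp";
    -- Rows sharing a grp value form one (B ... B S) group; GROUP BY grp
    -- then lets you aggregate over each group.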

MySQL: finding the most expensive product in each zip code

Submitted by 隐身守侯 on 2019-12-20 03:18:37
Question: I have a table called Products with the schema (name, city, state, zip_code, price). I want to find the name of the most expensive product for each zip_code in a given state. I wrote

    SELECT zip_code, MAX(price)
    FROM products
    WHERE products.state = 'NJ'
    GROUP BY zip_code

as a subquery, but I couldn't figure out how to display the product name and price per zip_code in 'NJ'. I would appreciate it if you could help me. Thanks.

Answer 1:

    SELECT t.name, t.city, t.zip_code, t.price
    FROM (
      SELECT zip_code, MAX(price) as …  (truncated in the source)
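The truncated answer joins the per-zip maximum back onto the base table; a sketch of the completed idea, using the names from the question:

    SELECT p.name, p.city, p.zip_code, p.price
    FROM products p
    JOIN (
      -- The questioner's subquery: highest price per zip in NJ.
      SELECT zip_code, MAX(price) AS max_price
      FROM products
      WHERE state = 'NJ'
      GROUP BY zip_code
    ) m ON m.zip_code = p.zip_code
       AND m.max_price = p.price
    WHERE p.state = 'NJ';
    -- Note on ties: every product matching the max price in its zip is returned.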

Find duplicated values in an array column

Submitted by 删除回忆录丶 on 2019-12-20 02:57:13
Question: I have a table with an array column, like this:

    my_table
    id | array
    ---+-----------------
     1 | {1, 3, 4, 5}
     2 | {19, 2, 4, 9}
     3 | {23, 46, 87, 6}
     4 | {199, 24, 93, 6}

And I want as a result the repeated values and where they occur, like this:

    value_repeated | is_repeated_on
    ---------------+---------------
    4              | {1,2}
    6              | {3,4}

Is it possible? I don't know how to do this or even how to start. I'm lost!

Answer 1: Use unnest to convert the array to rows, and then array_agg to build an array from the ids. It should look something like…  (truncated in the source)
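A sketch of what the truncated answer describes, assuming PostgreSQL (the column is quoted because array is a keyword):

    SELECT v AS value_repeated,
           array_agg(DISTINCT id) AS is_repeated_on
    FROM my_table, unnest("array") AS v   -- one row per (id, array element)
    GROUP BY v
    HAVING COUNT(DISTINCT id) > 1;        -- keep values seen in more than one row
    -- For the sample data: 4 -> {1,2}, 6 -> {3,4}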

Add up conditional counts on multiple columns of the same table

Submitted by 回眸只為那壹抹淺笑 on 2019-12-20 02:34:28
Question: I am looking for a "better" way to perform a query that shows, for a single player, whom he has played previously and the associated win-loss record against each such opponent. Here are the tables involved, stripped down to essentials:

    create table player (player_id int, username text);
    create table match (winner_id int, loser_id int);

    insert into player values (1, 'john'), (2, 'mary'), (3, 'bob'), (4, 'alice');
    insert into match values (1, 2), (1, 2), (1, 3), (1, 4), (1, 4), (1, 4), (2, 1)…  (truncated in the source)
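One common shape for the "add up conditional counts" part is a single pass with CASE inside the aggregates. A sketch using the tables above, with player 1 ('john') hard-coded as the player of interest:

    SELECT p.username AS opponent,
           SUM(CASE WHEN m.winner_id = 1 THEN 1 ELSE 0 END) AS wins,
           SUM(CASE WHEN m.loser_id  = 1 THEN 1 ELSE 0 END) AS losses
    FROM match m
    JOIN player p
      -- The opponent is whichever side of the match is not player 1.
      ON p.player_id = CASE WHEN m.winner_id = 1
                            THEN m.loser_id ELSE m.winner_id END
    WHERE 1 IN (m.winner_id, m.loser_id)   -- only matches involving player 1
    GROUP BY p.username;
    -- e.g. vs mary: 2 wins from (1,2),(1,2) and 1 loss from (2,1).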

Multiple aggregations in groupby in a Pandas DataFrame

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-20 01:53:07
Question: In SQL:

    Select Max(A), Min(B), C from Table group by C

I want to do the same operation in pandas on a DataFrame. The closest I got was

    DF2 = DF1.groupby(by=['C']).max()

where I end up getting the max of both columns. How do I apply more than one operation while grouping?

Answer 1: Try the agg() function:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.randint(0, 5, size=(20, 3)), columns=list('ABC'))
    print(df)
    print(df.groupby('C').agg({'A': max, 'B': min}))

Output:

       A  B
    C
    0  2  3
    …  (truncated in the source)

Percentile aggregate for SQL Server 2008 R2

Submitted by 不打扰是莪最后的温柔 on 2019-12-19 04:55:21
Question: I'm using SQL Server 2008 R2. I need to compute a percentile value per group, something like:

    SELECT id, PCTL(0.9, x)  -- pseudo-syntax for the 90th percentile
    FROM my_table
    GROUP BY id
    ORDER BY id

For example, given this DDL (fiddle):

    CREATE TABLE my_table (id INT, x REAL);
    INSERT INTO my_table VALUES
      (7, 0.164595), (5, 0.671311), (7, 0.0118385), (6, 0.704592),
      (3, 0.633521), (3, 0.337268), (0, 0.54739), (6, 0.312282),
      (0, 0.220618), (7, 0.214973), (6, 0.410768), (7, 0.151572),
      (7, 0.0639506), (5, 0…  (truncated in the source)
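PERCENTILE_CONT and PERCENTILE_DISC were only added in SQL Server 2012, so on 2008 R2 one workaround is a nearest-rank (discrete) percentile built from window functions. A sketch against the my_table above:

    -- Rank each row within its group and pick the row sitting at the
    -- 90th-percentile position (nearest-rank method, no interpolation).
    WITH ranked AS (
      SELECT id, x,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY x) AS rn,
             COUNT(*)     OVER (PARTITION BY id)            AS cnt
      FROM my_table
    )
    SELECT id, x AS pctl_90
    FROM ranked
    WHERE rn = CEILING(0.9 * cnt)
    ORDER BY id;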

Aggregate variables in list of data frames into single data frame

Submitted by 自作多情 on 2019-12-19 04:37:29
Question: I am performing a per-policy life insurance valuation in R. Monthly cash flow projections are performed per policy and return a data frame in the following format (for example):

    Policy1 = data.frame(ProjM = 1:200, Cashflow1 = rep(5, 200), Cashflow2 = rep(10, 200))

My model returns a list (using lapply and a function which performs the per-policy cashflow projection, based on various per-policy details, escalation assumptions and life contingencies). I want to aggregate the cash flows across…  (truncated in the source)

Spark UDAF - using generics as input type?

Submitted by 北城余情 on 2019-12-19 04:14:15
Question: I want to write a Spark UDAF where the column type could be any type that has a Scala Numeric defined on it. I've searched the Internet but found only examples with concrete types like DoubleType or LongType. Isn't this possible? And how else would such UDAFs be used with other numeric values?

Answer 1: For simplicity, let's assume you want to define a custom sum. You'll have to provide a TypeTag for the input type and use Scala reflection to define the schemas:

    import org.apache.spark.sql.expressions._
    import org…  (truncated in the source)