aggregate-functions | 易学教程

Apply multiple functions to multiple groupby columns

Question: The docs show how to apply multiple functions to a groupby object at once, using a dict with the output column names as the keys:

    In [563]: grouped['D'].agg({'result1' : np.sum,
       .....:                   'result2' : np.mean})
       .....:
    Out[563]:
          result2   result1
    A
    bar -0.579846 -1.739537
    foo -0.280588 -1.402938

However, this only works on a Series groupby object. When a dict is passed in the same way to a DataFrame groupby, it expects the keys to be the names of the columns that the function will be applied to. What I
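One way to get several named outputs from several columns in a single call is pandas named aggregation, available in pandas 0.25 and later. The sketch below is only an illustration: the frame, its values, and the output names `D_sum`, `D_mean` and `C_max` are made up, since the excerpt does not show the original data.

```python
import pandas as pd

# Hypothetical data; the frame used in the question is not shown in the excerpt.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [10.0, 20.0, 30.0, 40.0],
})

# Named aggregation: each keyword is an output column name,
# each value is a (source column, aggregate function) pair.
result = df.groupby("A").agg(
    D_sum=("D", "sum"),
    D_mean=("D", "mean"),
    C_max=("C", "max"),
)
print(result)
```

Each output column is named explicitly, so the same source column can appear more than once with different functions.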

SELECTING with multiple WHERE conditions on same column

Question: OK, I think I might be overlooking something obvious or simple here, but I need to write a query that returns only records that match multiple criteria on the same column. My table is a very simple linking setup for applying flags to a user:

    ID   contactid  flag        flag_type
    -------------------------------------
    118  99         Volunteer   1
    119  99         Uploaded    2
    120  100        Via Import  3
    121  100        Volunteer   1
    122  100        Uploaded    2
    etc...

In this case you'll see both contact 99 and 100 are flagged as both "Volunteer"
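A common pattern for "must match several values in the same column" is to filter with `IN`, group by the contact, and keep only groups that contain every wanted value. The sketch below runs that pattern against SQLite through Python's `sqlite3` module. The table name `contact_flags` and the extra contact 101 are hypothetical, and the second flag of interest is assumed to be "Uploaded" because the excerpt is cut off before naming it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact_flags "
    "(id INTEGER, contactid INTEGER, flag TEXT, flag_type INTEGER)"
)
rows = [
    (118, 99, "Volunteer", 1),
    (119, 99, "Uploaded", 2),
    (120, 100, "Via Import", 3),
    (121, 100, "Volunteer", 1),
    (122, 100, "Uploaded", 2),
    (123, 101, "Volunteer", 1),  # hypothetical contact with only one of the flags
]
conn.executemany("INSERT INTO contact_flags VALUES (?, ?, ?, ?)", rows)

wanted = ["Volunteer", "Uploaded"]  # assumed pair of flags
placeholders = ", ".join("?" for _ in wanted)
query = f"""
    SELECT contactid
    FROM contact_flags
    WHERE flag IN ({placeholders})
    GROUP BY contactid
    HAVING COUNT(DISTINCT flag) = ?   -- every wanted flag must be present
    ORDER BY contactid
"""
matches = [row[0] for row in conn.execute(query, (*wanted, len(wanted)))]
print(matches)
```

`COUNT(DISTINCT flag)` rather than `COUNT(*)` keeps the check correct even if a contact carries the same flag twice.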

Spark SQL: apply aggregate functions to a list of columns

Question: Is there a way to apply an aggregate function to all (or a list of) columns of a dataframe when doing a groupBy? In other words, is there a way to avoid doing this for every column:

    df.groupBy("col1")
      .agg(sum("col2").alias("col2"), sum("col3").alias("col3"), ...)

Answer 1: There are multiple ways of applying aggregate functions to multiple columns. The GroupedData class provides a number of methods for the most common functions, including count, max, min, mean and sum, which can be
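As one possible illustration in PySpark, `GroupedData.agg` also accepts a dict that maps column names to aggregate function names, so the per-column expressions can be generated instead of typed out. The helper names `build_agg_exprs` and `aggregate_all` below are hypothetical, and this is not necessarily the approach the truncated answer goes on to describe.

```python
def build_agg_exprs(columns, group_col, func="sum"):
    """Map every non-grouping column to the name of an aggregate function."""
    return {c: func for c in columns if c != group_col}


def aggregate_all(df, group_col, func="sum"):
    """Apply `func` to every column of `df` except the grouping column.

    `df` is expected to be a pyspark.sql.DataFrame. With the dict form,
    output columns are named like "sum(col2)" rather than "col2".
    """
    return df.groupBy(group_col).agg(build_agg_exprs(df.columns, group_col, func))


# The expression dict for the columns named in the question:
print(build_agg_exprs(["col1", "col2", "col3"], "col1"))
```

With a real Spark DataFrame, `aggregate_all(df, "col1")` would stand in for the hand-written chain of `sum(...)` calls, at the cost of losing the custom aliases.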

Two SQL LEFT JOINS produce incorrect result

Question: I have 3 tables:

    users(id, account_balance)
    grocery(user_id, date, amount_paid)
    fishmarket(user_id, date, amount_paid)

Both the fishmarket and grocery tables may have multiple rows for the same user_id, with different dates and amounts paid, or no rows at all for a given user. When I try the following query:

    SELECT t1."id" AS "User ID",
           t1.account_balance AS "Account Balance",
           count(t2.user_id) AS "# of grocery visits",
           count(t3.user_id) AS "# of fishmarket visits"
    FROM
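The excerpt is cut off before the rest of the query and its result, so the following is only a guess at the problem: joining two one-to-many tables to the same parent multiplies their rows, which inflates both counts. The sketch uses SQLite through Python's `sqlite3` with made-up data, and contrasts the naive double join with a version that aggregates each table before joining.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, account_balance REAL);
    CREATE TABLE grocery (user_id INTEGER, date TEXT, amount_paid REAL);
    CREATE TABLE fishmarket (user_id INTEGER, date TEXT, amount_paid REAL);

    -- Hypothetical data: user 1 has 2 grocery and 3 fishmarket visits, user 2 has none.
    INSERT INTO users VALUES (1, 100.0), (2, 50.0);
    INSERT INTO grocery VALUES (1, '2020-01-01', 5.0), (1, '2020-01-02', 7.0);
    INSERT INTO fishmarket VALUES
        (1, '2020-01-01', 9.0), (1, '2020-01-03', 4.0), (1, '2020-01-05', 6.0);
""")

# Naive version: each grocery row pairs with each fishmarket row (2 x 3 = 6).
naive = conn.execute("""
    SELECT u.id, COUNT(g.user_id), COUNT(f.user_id)
    FROM users u
    LEFT JOIN grocery g ON g.user_id = u.id
    LEFT JOIN fishmarket f ON f.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Aggregate each child table first, so every join is one-to-one per user.
fixed = conn.execute("""
    SELECT u.id, COALESCE(g.visits, 0), COALESCE(f.visits, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS visits FROM grocery GROUP BY user_id) g
           ON g.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS visits FROM fishmarket GROUP BY user_id) f
           ON f.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(naive)
print(fixed)
```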

SQL select only rows with max value on a column [duplicate]

Question: This question already has an answer here: Retrieving the last record in each group - MySQL (25 answers)

I have this table for documents (simplified version here):

    +------+-------+--------------------------------------+
    | id   | rev   | content                              |
    +------+-------+--------------------------------------+
    | 1    | 1     | ...                                  |
    | 2    | 1     | ...                                  |
    | 1    | 2     | ...                                  |
    | 1    | 3     | ...                                  |
    +------+-------+--------------------------------------+

How do I select one row per id and only the greatest rev? With the above data,
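One standard approach to this "greatest row per group" problem is to compute the maximum `rev` per `id` in a subquery and join it back to the table. The sketch below tries that on the rows from the question, using SQLite through Python's `sqlite3`. The table name `docs` is hypothetical, and the `content` values stay as the question's `...` placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, rev INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "..."), (2, 1, "..."), (1, 2, "..."), (1, 3, "...")],
)

# Join each row to the per-id maximum rev; only the latest revisions survive.
latest = conn.execute("""
    SELECT d.id, d.rev, d.content
    FROM docs d
    JOIN (SELECT id, MAX(rev) AS rev FROM docs GROUP BY id) m
      ON d.id = m.id AND d.rev = m.rev
    ORDER BY d.id
""").fetchall()

print(latest)
```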

Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause [duplicate]

Question: This question already has answers here (closed 6 years ago). Possible duplicate: GROUP BY / aggregate function confusion in SQL

I got an error: Column 'Employee.EmpID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    select loc.LocationID, emp.EmpID
    from Employee as emp
    full join Location as loc on emp.LocationID = loc.LocationID
    group by loc.LocationID

This situation fits into the answer given by Bill Karwin. correction for
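The error arises because a LocationID group can contain several EmpID values, so the database cannot pick a single one unless EmpID is either aggregated or added to the GROUP BY. The sketch below shows the aggregated variant with made-up data. It runs on SQLite through Python's `sqlite3`, which accepts such bare columns and so does not raise this error itself, and it uses a LEFT JOIN from Location because older SQLite versions lack FULL JOIN. Both choices are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Location (LocationID INTEGER);
    CREATE TABLE Employee (EmpID INTEGER, LocationID INTEGER);

    -- Hypothetical data: location 10 has two employees, 20 has one, 30 has none.
    INSERT INTO Location VALUES (10), (20), (30);
    INSERT INTO Employee VALUES (1, 10), (2, 10), (3, 20);
""")

# Every selected column is now either grouped (LocationID) or aggregated (EmpID).
per_location = conn.execute("""
    SELECT loc.LocationID, COUNT(emp.EmpID) AS employees
    FROM Location AS loc
    LEFT JOIN Employee AS emp ON emp.LocationID = loc.LocationID
    GROUP BY loc.LocationID
    ORDER BY loc.LocationID
""").fetchall()

print(per_location)
```

If one row per employee is wanted instead, adding `emp.EmpID` to the GROUP BY (or dropping the GROUP BY altogether) is the other way to satisfy the rule.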