How to find the mean of grouped Vector columns in Spark SQL?
Question

I have created a RelationalGroupedDataset by calling instances.groupBy(instances.col("property_name")):

    val x = instances.groupBy(instances.col("property_name"))

How do I compose a user-defined aggregate function to perform Statistics.colStats().mean on each group? Thanks!

Answer 1

Spark >= 2.4

You can use Summarizer. First convert the old mllib vectors to the new ml vector type:

    import org.apache.spark.ml.stat.Summarizer

    val dfNew = df.as[(Int, org.apache.spark.mllib.linalg.Vector)]
      .map { case (group, v) => (group, v.asML) }
      .toDF("group", "features")
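Once the vectors are in the ml format, Summarizer.mean can be used as a grouped aggregate to get the element-wise mean per group. Below is a minimal, self-contained sketch of that step; the SparkSession setup, sample data, and column names are illustrative assumptions rather than part of the original answer (Spark 2.4+):

    import org.apache.spark.ml.linalg.Vectors
    import org.apache.spark.ml.stat.Summarizer
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    val spark = SparkSession.builder()
      .appName("grouped-vector-means")   // hypothetical app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Illustrative sample data: (group, features) rows with ml Vectors
    val dfNew = Seq(
      (1, Vectors.dense(1.0, 2.0)),
      (1, Vectors.dense(3.0, 4.0)),
      (2, Vectors.dense(5.0, 6.0))
    ).toDF("group", "features")

    // Element-wise mean of the vector column within each group
    val means = dfNew
      .groupBy(col("group"))
      .agg(Summarizer.mean(col("features")).alias("mean_features"))

    means.show(truncate = false)
    // expected: group 1 -> [2.0, 3.0], group 2 -> [5.0, 6.0]

Summarizer avoids writing a custom UDAF: it aggregates the whole vector column natively, so there is no need to explode the vectors or fall back on Statistics.colStats per group.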