问题
USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN(43659,43664);
I read about that clause and I don't understand why I need it.
What does the function Over
do? What does Partitioning By
do?
Why can't I make a query with writing Group By SalesOrderID
?
回答1:
You can use GROUP BY SalesOrderID
. The difference is, with GROUP BY you can only have the aggregated values for the columns that are not included in GROUP BY.
In contrast, using windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, although you are not doing that in your example query, you could retrieve both individual OrderQty
values and their sums, counts, averages etc. over groups of same SalesOrderID
s.
Here's a practical example of why windowed aggregates are great. Suppose you need to calculate what percent of a total every value is. Without windowed aggregates you'd have to first derive a list of aggregated values and then join it back to the original rowset, i.e. like this:
SELECT
orig.[Partition],
orig.Value,
orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
INNER JOIN (
SELECT
[Partition],
SUM(Value) AS TotalValue
FROM OriginalRowset
GROUP BY [Partition]
) agg ON orig.[Partition] = agg.[Partition]
Now look how you can do the same with a windowed aggregate:
SELECT
[Partition],
Value,
Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset orig
Much easier and cleaner, isn't it?
回答2:
The OVER
clause is powerful in that you can have aggregates over different ranges ("windowing"), whether you use a GROUP BY
or not
Example: get count per SalesOrderID
and count of all
SELECT
SalesOrderID, ProductID, OrderQty
,COUNT(OrderQty) AS 'Count'
,COUNT(*) OVER () AS 'CountAll'
FROM Sales.SalesOrderDetail
WHERE
SalesOrderID IN(43659,43664)
GROUP BY
SalesOrderID, ProductID, OrderQty
Get different COUNT
s, no GROUP BY
SELECT
SalesOrderID, ProductID, OrderQty
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct',
,COUNT(*) OVER () AS 'CountAllAgain'
FROM Sales.SalesOrderDetail
WHERE
SalesOrderID IN(43659,43664)
回答3:
If you only wanted to GROUP BY the SalesOrderID then you wouldn't be able to include the ProductID and OrderQty columns in the SELECT clause.
The PARTITION BY clause let's you break up your aggregate functions. One obvious and useful example would be if you wanted to generate line numbers for order lines on an order:
SELECT
O.order_id,
O.order_date,
ROW_NUMBER() OVER(PARTITION BY O.order_id) AS line_item_no,
OL.product_id
FROM
Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id
(My syntax might be off slightly)
You would then get back something like:
order_id order_date line_item_no product_id
-------- ---------- ------------ ----------
1 2011-05-02 1 5
1 2011-05-02 2 4
1 2011-05-02 3 7
2 2011-05-12 1 8
2 2011-05-12 2 1
回答4:
Let me explain with an example and you would be able to see how it works.
Assuming you have the following table DIM_EQUIPMENT:
VIN MAKE MODEL YEAR COLOR
-----------------------------------------
1234ASDF Ford Taurus 2008 White
1234JKLM Chevy Truck 2005 Green
5678ASDF Ford Mustang 2008 Yellow
Run below SQL
SELECT VIN,
MAKE,
MODEL,
YEAR,
COLOR ,
COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT
The result would be as below
VIN MAKE MODEL YEAR COLOR COUNT2
----------------------------------------------
1234JKLM Chevy Truck 2005 Green 1
5678ASDF Ford Mustang 2008 Yellow 2
1234ASDF Ford Taurus 2008 White 2
See what happened.
You are able to count without Group By on YEAR and Match with ROW.
Another Interesting WAY to get same result if as below using WITH Clause, WITH works as in-line VIEW and can simplify the query especially complex ones, which is not the case here though since I am just trying to show usage
WITH EQ AS
( SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
)
SELECT VIN,
MAKE,
MODEL,
YEAR,
COLOR,
COUNT2
FROM DIM_EQUIPMENT,
EQ
WHERE EQ.YEAR2=DIM_EQUIPMENT.YEAR;
回答5:
The OVER clause when combined with PARTITION BY state that the preceding function call must be done analytically by evaluating the returned rows of the query. Think of it as an inline GROUP BY statement.
OVER (PARTITION BY SalesOrderID)
is stating that for SUM, AVG, etc... function, return the value OVER a subset of the returned records from the query, and PARTITION that subset BY the foreign key SalesOrderID.
So we will SUM every OrderQty record for EACH UNIQUE SalesOrderID, and that column name will be called 'Total'.
It is a MUCH more efficient means than using multiple inline views to find out the same information. You can put this query within an inline view and filter on Total then.
SELECT ...,
FROM (your query) inlineview
WHERE Total < 200
回答6:
- Also Called
Query Petition
Clause. Similar to the
Group By
Clause- break up data into chunks (or partitions)
- separate by partition bounds
- function performs within partitions
- re-initialised when crossing parting boundary
Syntax:
function (...) OVER (PARTITION BY col1 col3,...)
Functions
- Familiar functions such as
COUNT()
,SUM()
,MIN()
,MAX()
, etc - New Functions as well (eg
ROW_NUMBER()
,RATION_TO_REOIRT()
, etc.)
- Familiar functions such as
More info with example : http://msdn.microsoft.com/en-us/library/ms189461.aspx
回答7:
prkey whatsthat cash
890 "abb " 32 32
43 "abbz " 2 34
4 "bttu " 1 35
45 "gasstuff " 2 37
545 "gasz " 5 42
80009 "hoo " 9 51
2321 "ibm " 1 52
998 "krk " 2 54
42 "kx-5010 " 2 56
32 "lto " 4 60
543 "mp " 5 65
465 "multipower " 2 67
455 "O.N. " 1 68
7887 "prem " 7 75
434 "puma " 3 78
23 "retractble " 3 81
242 "Trujillo's stuff " 4 85
That's a result of query. Table used as source is the same exept that it has no last column. This column is a moving sum of third one.
Query:
SELECT prkey,whatsthat,cash,SUM(cash) over (order by whatsthat)
FROM public.iuk order by whatsthat,prkey
;
(table goes as public.iuk)
sql version: 2012
It's a little over dbase(1986) level, I don't know why 25+ years has been needed to finish it up.
来源:https://stackoverflow.com/questions/6218902/the-sql-over-clause-when-and-why-is-it-useful