Question
Is there a standard governing how SQL implementations handle multiple calls to the same aggregate function in the same query?
For example, consider the following query, based on a popular example schema:
SELECT Customer,SUM(OrderPrice) FROM Orders
GROUP BY Customer
HAVING SUM(OrderPrice)>1000
Presumably, it takes computation time to calculate the value of SUM(OrderPrice). Is this cost incurred for each reference to the aggregate function, or is the result stored for a particular query?
Or, is there no standard for SQL engine implementation for this case?
Answer 1:
Although I have worked with many different DBMSs, I will only show the result of testing this on SQL Server. Consider this query, which even includes a CAST in the expression. Looking at the query plan, the expression sum(cast(number as bigint)) is computed only once, defined as DEFINE:([Expr1005]=SUM([Expr1006])).
set showplan_text on
select type, sum(cast(number as bigint))
from master..spt_values
group by type
having sum(cast(number as bigint)) > 100000
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Filter(WHERE:([Expr1005]>(100000)))
       |--Hash Match(Aggregate, HASH:([Expr1004]), RESIDUAL:([Expr1004] = [Expr1004]) DEFINE:([Expr1005]=SUM([Expr1006])))
            |--Compute Scalar(DEFINE:([Expr1004]=CONVERT(nchar(3),[mssqlsystemresource].[sys].[spt_values].[type],0), [Expr1006]=CONVERT(bigint,[mssqlsystemresource].[sys].[spt_values].[number],0)))
                 |--Index Scan(OBJECT:([mssqlsystemresource].[sys].[spt_values].[ix2_spt_values_nu_nc]))
This may not be obvious above, since the plan doesn't show the SELECT result, so I have added a *10 to the query below. Notice that the plan now includes one extra step, DEFINE:([Expr1006]=[Expr1005]*(10)) (steps run bottom to top), which shows that the new expression required an extra calculation. Yet even this is optimized: it doesn't recalculate the entire expression, it merely takes Expr1005 and multiplies it by 10!
set showplan_text on
select type, sum(cast(number as bigint))*10
from master..spt_values
group by type
having sum(cast(number as bigint)) > 100000
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Compute Scalar(DEFINE:([Expr1006]=[Expr1005]*(10)))
       |--Filter(WHERE:([Expr1005]>(100000)))
            |--Hash Match(Aggregate, HASH:([Expr1004]), RESIDUAL:([Expr1004] = [Expr1004]) DEFINE:([Expr1005]=SUM([Expr1007])))
                 |--Compute Scalar(DEFINE:([Expr1004]=CONVERT(nchar(3),[mssqlsystemresource].[sys].[spt_values].[type],0), [Expr1007]=CONVERT(bigint,[mssqlsystemresource].[sys].[spt_values].[number],0)))
                      |--Index Scan(OBJECT:([mssqlsystemresource].[sys].[spt_values].[ix2_spt_values_nu_nc]))
This is very likely how the other major DBMSs work as well, e.g. PostgreSQL, Sybase, Oracle, DB2, Firebird, and MySQL.
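If you want to check another engine yourself rather than take this on faith, one rough approach is to count how often the aggregate is actually fed a row. The sketch below does that for SQLite through Python's built-in sqlite3 module. SQLite is not one of the engines discussed above, and the counting_sum aggregate, the CountingSum class, and the sample rows are all made up for illustration. If step() runs once per input row, the engine shared the aggregate between SELECT and HAVING; if it runs twice per row, it did not.

```python
import sqlite3


class CountingSum:
    """SUM-like aggregate that records how many times step() is called."""

    calls = 0  # class-level counter shared by every aggregate instance

    def __init__(self):
        self.total = 0

    def step(self, value):
        # Called by SQLite once per input row for each aggregate it evaluates.
        CountingSum.calls += 1
        if value is not None:
            self.total += value

    def finalize(self):
        return self.total


conn = sqlite3.connect(":memory:")
# Register the hypothetical aggregate under the SQL name counting_sum.
conn.create_aggregate("counting_sum", 1, CountingSum)

conn.execute("CREATE TABLE Orders (Customer TEXT, OrderPrice INTEGER)")
# Illustrative sample data only; not taken from the question.
sample_rows = [
    ("Hansen", 1000),
    ("Hansen", 1600),
    ("Nilsen", 700),
    ("Jensen", 2000),
]
conn.executemany("INSERT INTO Orders VALUES (?, ?)", sample_rows)

# Same shape as the query in the question: the aggregate appears twice.
rows = conn.execute(
    """
    SELECT Customer, counting_sum(OrderPrice)
    FROM Orders
    GROUP BY Customer
    HAVING counting_sum(OrderPrice) > 1000
    ORDER BY Customer
    """
).fetchall()

print(rows)
print("step() calls:", CountingSum.calls, "for", len(sample_rows), "input rows")
```

This only tells you what one engine does with one query; on a server DBMS, reading the execution plan as shown above is the more direct check.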
Source: https://stackoverflow.com/questions/12876873/is-there-a-standard-for-sql-aggregate-function-calculation