Confused about Itzik Ben-Gan's Logical Query Processing order in his SQL Server 2005 book and SQL Server 2008 book

筅森魡賤 提交于 2019-12-17 16:23:11

问题


In the book Inside Microsoft SQL Server™ 2005 T-SQL Querying, the author Itzik Ben-Gan tell us the order or SQL Server 2005's logical query processing is:

(8)  SELECT (9) DISTINCT (11) <TOP_specification> <select_list>
(1)  FROM <left_table>
(3)       <join_type> JOIN <right_table>
(2)       ON <join_condition>
(4)  WHERE <where_condition>
(5)  GROUP BY <group_by_list>
(6)  WITH {CUBE | ROLLUP}
(7)  HAVING <having_condition>
(10) ORDER BY <order_by_list>

1.  FROM
2.  ON
3.  OUTER (join)
4.  WHERE
5.  GROUP BY
6.  CUBE | ROLLUP
7.  HAVING
8.  SELECT
9.  DISTINCT
10. ORDER BY  <---------------------- NOTE
11. TOP       <---------------------- NOTE

In his book Inside Microsoft SQL Server 2008 : T-SQL Querying, he tell us the following logical query processing order:

(1) FROM
(1-J1) Cartesian Product
(1-J2) ON Filter
(1-J3) Add Outer Rows
(2) WHERE
(3) GROUP BY
(4) HAVING
(5) SELECT
(5-1) Evaluate Expressions
(5-2) DISTINCT
(5-3) TOP       <---------------------- NOTE
(6) ORDER BY    <---------------------- NOTE

Note the order of TOP and ORDER BY in the upper excerpts from these books. They are just opposite. I think these two step is very important, and will give a totally different result with different order. I want to know whether SQL Server 2008 changed something in it's storage engine from SQL Server 2005 or something else reason cause this?

Thanks.


回答1:


The logical processing order is also documented in this Books Online entry. Be careful to distinguish logical processing order from physical processing order. As the BOL entry notes:

The following steps show the logical processing order, or binding order, for a SELECT statement. This order determines when the objects defined in one step are made available to the clauses in subsequent steps. For example, if the query processor can bind to (access) the tables or views defined in the FROM clause, these objects and their columns are made available to all subsequent steps. Conversely, because the SELECT clause is step 8, any column aliases or derived columns defined in that clause cannot be referenced by preceding clauses. However, they can be referenced by subsequent clauses such as the ORDER BY clause. Note that the actual physical execution of the statement is determined by the query processor and the order may vary from this list.

The query optimizer is free to translate the logical requirement specified by the query into any physical execution plan that produces the correct results. Generally, there are many physical alternatives for a given logical query, so it is quite usual for a physical plan to differ fundamentally from the logical processing order (for binding purposes) described above.




回答2:


Check this out - it's an expose on this issue - and Itzik's book is mentioned. The second order above is correct.




回答3:


In his 2015 release of T-SQL Querying, Itzik Ben-Gan updated the logical query processing in "LISTING 1-1 Logical query-processing step numbers" as this:

(5) SELECT (5-2) DISTINCT (7) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate> 
       | (1-A) <left_table> <apply_type> APPLY <right_input_table> AS <alias> 
       | (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias> 
       | (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias> 
(2) WHERE <where_predicate> 
(3) GROUP BY <group_by_specification> 
(4) HAVING <having_predicate> 
(6) ORDER BY <order_by_list> 
(7) OFFSET <offset_specification> ROWS FETCH NEXT <fetch_specification> ROWS ONLY;

Unlike the previous 2008 version of the book, ORDER BY (step 6) now occurs before TOP (step 7). Of course, as stated elsewhere, the two are related. As Itzik explains on page 6 of T-SQL Querying:

[TOP] filters the specified number of rows based on the ordering in the ORDER BY clause, or based on arbitrary order if an ORDER BY clause is absent. With OFFSET-FETCH, this phase skips the specified number of rows, and then filters the next specified number of rows, based on the ordering in the ORDER BY clause.

Understanding the above logical query processing steps is important because it explains otherwise unintuitive quirks in SQL.

For example, you cannot use an alias from the SELECT clause (step 5-2) in the WHERE clause (step 2) because the expression engine evaluates the WHEREclause before the SELECT clause.

-- This won't work!
SELECT Cost = Quantity * Price
FROM Orders
WHERE Cost > 500.00;

As Itzik says in his book, "The phases in the logical processing of a query have a specific order. In contrast, the optimizer can often make shortcuts in the physical execution plan that it generates. Of course, it will make shortcuts only if the result set is guaranteed to be the correct one."



来源:https://stackoverflow.com/questions/3270338/confused-about-itzik-ben-gans-logical-query-processing-order-in-his-sql-server

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!