What's the difference between a Table Scan and a Clustered Index Scan?

后端未结

关注

 3  1529

离开以前 2020-12-04 09:37

Since both a Table Scan and a Clustered Index Scan essentially scan all records in the table, why is a Clustered Index Scan supposedly better?

3条回答

失恋的感觉 (楼主)

2020-12-04 10:06
In a table without a clustered index (a heap table), data pages are not linked together - so traversing pages requires a lookup into the Index Allocation Map.

A clustered table, however, has it's data pages linked in a doubly linked list - making sequential scans a bit faster. Of course, in exchange, you have the overhead of dealing with keeping the data pages in order on INSERT, UPDATE, and DELETE. A heap table, however, requires a second write to the IAM.

If your query has a RANGE operator (e.g.: SELECT * FROM TABLE WHERE Id BETWEEN 1 AND 100), then a clustered table (being in a guaranteed order) would be more efficient - as it could use the index pages to find the relevant data page(s). A heap would have to scan all rows, since it cannot rely on ordering.

And, of course, a clustered index lets you do a CLUSTERED INDEX SEEK, which is pretty much optimal for performance...a heap with no indexes would always result in a table scan.

So:
- For your example query where you select all rows, the only difference is the doubly linked list a clustered index maintains. This should make your clustered table just a tiny bit faster than a heap with a large number of rows.
- For a query with a WHERE clause that can be (at least partially) satisfied by the clustered index, you'll come out ahead because of the ordering - so you won't have to scan the entire table.
- For a query that is not satisified by the clustered index, you're pretty much even...again, the only difference being that doubly linked list for sequential scanning. In either case, you're suboptimal.
- For INSERT, UPDATE, and DELETE a heap may or may not win. The heap doesn't have to maintain order, but does require a second write to the IAM. I think the relative performance difference would be negligible, but also pretty data dependent.
Microsoft has a whitepaper which compares a clustered index to an equivalent non-clustered index on a heap (not exactly the same as I discussed above, but close). Their conclusion is basically to put a clustered index on all tables. I'll do my best to summarize their results (again, note that they're really comparing a non-clustered index to a clustered index here - but I think it's relatively comparable):
- INSERT performance: clustered index wins by about 3% due to the second write needed for a heap.
- UPDATE performance: clustered index wins by about 8% due to the second lookup needed for a heap.
- DELETE performance: clustered index wins by about 18% due to the second lookup needed and the second delete needed from the IAM for a heap.
- single SELECT performance: clustered index wins by about 16% due to the second lookup needed for a heap.
- range SELECT performance: clustered index wins by about 29% due to the random ordering for a heap.
- concurrent INSERT: heap table wins by 30% under load due to page splits for the clustered index.
0 讨论(0)

查看其它3个回答
发布评论:

提交评论
- 加载中...