Question
Database System Concepts introduces several ways to implement a join operation. Two of them are merge join and hash join.
- When does the optimizer decide to use a merge join, and when a hash join?
In particular, according to https://stackoverflow.com/a/1114288/156458:
hash joins can only be used for equi-joins, but merge joins are more flexible.
But Database System Concepts says both are used only for equi-joins and natural joins:
The merge-join algorithm (also called the sort-merge-join algorithm) can be used to compute natural joins and equi-joins.
...
Like the merge-join algorithm, the hash-join algorithm can be used to implement natural joins and equi-joins.
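To make the two quoted algorithms concrete, here is a minimal Python sketch of an equi-join done both ways. It is only an illustration under simplifying assumptions: everything fits in memory, rows are plain dicts, and the names `hash_join`, `merge_join`, `t1`, and `t2` are hypothetical, not taken from the textbook or from PostgreSQL.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Equi-join: hash one input on the key, then probe with the other."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    # Each probe row is matched only against build rows with an equal key.
    return [(b, p) for p in probe_rows for b in table.get(p[key], [])]


def merge_join(left_rows, right_rows, key):
    """Equi-join: sort both inputs on the key, then advance two cursors."""
    left = sorted(left_rows, key=lambda r: r[key])
    right = sorted(right_rows, key=lambda r: r[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair that run with every left row carrying the same key.
            while i < len(left) and left[i][key] == lk:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical sample rows, joined on "unique2".
t1 = [{"unique2": 3, "v": "a"}, {"unique2": 1, "v": "b"}, {"unique2": 3, "v": "c"}]
t2 = [{"unique2": 3, "w": "x"}, {"unique2": 2, "w": "y"}, {"unique2": 1, "w": "z"}]
```

Both functions return the same set of matching pairs. The difference is the preparation each needs: the hash join builds a hash table on one input, while the merge join needs both inputs ordered on the join key, which is free if an input already arrives sorted (for example, from an index scan).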
Thanks.
My question comes from the PostgreSQL documentation, which gives two examples. I am not sure why one uses a merge join and the other a hash join:
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------
Hash Join  (cost=230.47..713.98 rows=101 width=488)
  Hash Cond: (t2.unique2 = t1.unique2)
  ->  Seq Scan on tenk2 t2  (cost=0.00..445.00 rows=10000 width=244)
  ->  Hash  (cost=229.20..229.20 rows=101 width=244)
        ->  Bitmap Heap Scan on tenk1 t1  (cost=5.07..229.20 rows=101 width=244)
              Recheck Cond: (unique1 < 100)
              ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
                    Index Cond: (unique1 < 100)
and
EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------
Merge Join  (cost=198.11..268.19 rows=10 width=488)
  Merge Cond: (t1.unique2 = t2.unique2)
  ->  Index Scan using tenk1_unique2 on tenk1 t1  (cost=0.29..656.28 rows=101 width=244)
        Filter: (unique1 < 100)
  ->  Sort  (cost=197.83..200.33 rows=1000 width=244)
        Sort Key: t2.unique2
        ->  Seq Scan on onek t2  (cost=0.00..148.00 rows=1000 width=244)
Source: https://stackoverflow.com/questions/50987379/how-does-the-optimizer-decide-between-merge-join-and-hash-join