Perform Joins in O(n) time?

Posted by 大憨熊 on 2020-01-16 06:13:08

Question


Is there a way to join two tables in linear time? I have heard this can be done with an auxiliary data structure (a hash table), but I'm not sure how. I had always assumed that a join involves a cross product and is therefore O(n^2).


Answer 1:


Algorithm:

Loop through table A. Hash each item on its join key and add it to a hash table.
Loop through table B and check whether each item's key is in the hash table (the check is O(1) on average). If it is, add the matching pair to the join result.
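
The two loops above can be sketched in Python roughly as follows. This is a minimal illustration, not a database implementation: it assumes each table is a list of dicts, and the names `hash_join`, `users`, `orders`, `user_id` and `buyer` are made up for the example.

```python
from collections import defaultdict


def hash_join(table_a, table_b, key_a, key_b):
    """Equi-join two lists of dicts on table_a[key_a] == table_b[key_b]."""
    # Build phase: one pass over table A, grouping its rows by join key.
    buckets = defaultdict(list)
    for row_a in table_a:
        buckets[row_a[key_a]].append(row_a)

    # Probe phase: one pass over table B; each lookup is O(1) on average.
    joined = []
    for row_b in table_b:
        for row_a in buckets.get(row_b[key_b], []):
            joined.append({**row_a, **row_b})
    return joined


# Hypothetical sample data.
users = [{"user_id": 1, "name": "Ann"}, {"user_id": 2, "name": "Bob"}]
orders = [{"order_id": 10, "buyer": 2}, {"order_id": 11, "buyer": 3}]
result = hash_join(users, orders, "user_id", "buyer")
```

Storing a list per key means duplicate keys in table A are handled too, at the cost of the output (and therefore the running time) growing with the number of matching pairs.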




Answer 2:


If indexes are available on the columns used in the join, the join is linear, because the indexes allow an in-order traversal of both tables. (That's not counting the amortized cost of maintaining the indexes, of course.)

A hash join is roughly linear, though the hashing itself isn't free, and the cost goes up when the keys involved are long.
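
The in-order traversal can be pictured as a merge join. The sketch below assumes both inputs are lists of `(key, value)` pairs already sorted by key, standing in for what an index scan would deliver; the function name `merge_join` and the sample data are hypothetical.

```python
def merge_join(sorted_a, sorted_b):
    """Join two lists of (key, value) pairs that are already sorted by key."""
    joined = []
    i = j = 0
    while i < len(sorted_a) and j < len(sorted_b):
        key_a, key_b = sorted_a[i][0], sorted_b[j][0]
        if key_a < key_b:
            i += 1  # A is behind, advance it
        elif key_a > key_b:
            j += 1  # B is behind, advance it
        else:
            # Find the run of rows in B that share this key.
            j_end = j
            while j_end < len(sorted_b) and sorted_b[j_end][0] == key_a:
                j_end += 1
            # Pair every A row with this key against that run.
            while i < len(sorted_a) and sorted_a[i][0] == key_a:
                for k in range(j, j_end):
                    joined.append((key_a, sorted_a[i][1], sorted_b[k][1]))
                i += 1
            j = j_end
    return joined


# Hypothetical sample data, sorted by key.
table_a = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
table_b = [(2, "x"), (3, "y"), (4, "z")]
pairs = merge_join(table_a, table_b)
```

When keys are unique on at least one side, each cursor only moves forward, so the work is proportional to the combined length of the two inputs.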




Answer 3:


It depends on the type of join. A cross join is always going to be O(n^2), since it has to produce O(n^2) records. An equi-join can be done with better complexity (O(n log n), or perhaps even amortized O(n)), provided the right data structures are used.
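
A tiny illustration of why a cross join cannot beat quadratic time: the output alone has one row per pair of input rows. The lists `left` and `right` are made-up sample data.

```python
from itertools import product

# Hypothetical sample tables, one value per row.
left = ["a1", "a2", "a3"]
right = ["b1", "b2", "b3", "b4"]

# A cross join pairs every row of one table with every row of the other.
cross = list(product(left, right))
row_count = len(cross)  # always len(left) * len(right)
```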




Answer 4:


You can join two tables in close to O(n) by using a hash table to look up records in one table based on the id of the other table.

Well, actually the operation will be close to O(n+m), where n and m are the number of items in the two tables. You would first loop through the records in one table to build a hash table from the key in that table, then you would loop through the other table to look up a match in the hash table for each of the records.

Looking up an item in a hash table is not strictly an O(1) operation, but on average it's close. With more data you will have a few more hash collisions, so some of the lookups need to do more than one comparison.
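
The effect of collisions can be made visible by counting key comparisons. The sketch below is an artificial experiment, not a benchmark: `CountingKey` and `count_comparisons` are hypothetical names, the "colliding" case forces every key into the same hash value, and the exact counts depend on how the Python implementation's `dict` probes for keys.

```python
class CountingKey:
    """Key wrapper that counts how often the dict has to call __eq__."""

    eq_calls = 0

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # With collide=True every key gets the same hash on purpose.
        return 0 if self.collide else hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return self.value == other.value


def count_comparisons(n, collide):
    """Build a dict of n keys, then count __eq__ calls over n lookups."""
    table = {CountingKey(i, collide): i for i in range(n)}
    CountingKey.eq_calls = 0  # ignore comparisons made while building
    for i in range(n):
        assert table[CountingKey(i, collide)] == i
    return CountingKey.eq_calls


well_spread = count_comparisons(200, collide=False)
colliding = count_comparisons(200, collide=True)
```

With well-spread hashes the number of comparisons stays close to one per lookup, while the all-colliding case needs far more, which is the degradation the answer describes taken to its extreme.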




Answer 5:


Major DB vendors deprecated hash indexes a long time ago, so joining two tables in O(max(n,m)) time is something that really doesn't matter in practice. With standard B-tree indexes, join complexity is O(min(n,m)*log(max(n,m))).
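
The O(min(n,m)*log(max(n,m))) bound corresponds to scanning the smaller table and doing one logarithmic index lookup per row in the larger one. A rough sketch, assuming a sorted Python list with binary search stands in for the B-tree index and that keys in the larger table are unique; `index_join` and the sample keys are hypothetical.

```python
from bisect import bisect_left


def index_join(small_keys, large_sorted_keys):
    """Return the keys from small_keys that also occur in large_sorted_keys."""
    matches = []
    for key in small_keys:  # min(n, m) iterations
        # Binary search plays the role of the B-tree lookup: O(log max(n, m)).
        pos = bisect_left(large_sorted_keys, key)
        if pos < len(large_sorted_keys) and large_sorted_keys[pos] == key:
            matches.append(key)
    return matches


# Hypothetical sample keys; the larger side must already be sorted.
found = index_join([5, 1, 9], [1, 2, 3, 5, 8])
```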



Source: https://stackoverflow.com/questions/5557964/perform-joins-in-on-time
