What's the asymptotic complexity of the GroupBy operation?

Posted by 橙三吉。 on 2019-12-04 01:09:10

Question


I am interested in the asymptotic complexity (big O) of the GroupBy operation on unindexed datasets. What is the complexity of the best known algorithm, and what is the complexity of the algorithms that SQL servers and LINQ actually use?


Answer 1:


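Setting aside the cost of the base query that produces its input, the GROUP BY operation itself is O(n): the rows are scanned one by one and aggregated in a single pass, so the cost scales linearly with n, the size of the dataset. Strictly speaking this is an expected bound, since it assumes the running aggregates are kept in a hash table with constant-time lookups on average.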
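As a rough illustration of the single-pass aggregation described above, here is a minimal Python sketch of something like `SELECT key, SUM(value) ... GROUP BY key`. The function name `hash_aggregate` and the `(key, value)` row shape are assumptions made for this example; it is not how any particular SQL engine is implemented.

```python
from collections import defaultdict

def hash_aggregate(rows):
    """Sum values per key in a single pass over (key, value) rows."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value  # one hash lookup and update per row, O(1) on average
    return dict(totals)

rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(hash_aggregate(rows))  # {'a': 4, 'b': 7, 'c': 4}
```

Each row is touched exactly once, so the total work is proportional to the number of rows.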

When GROUP BY is added to a more complex query, the picture changes: O(n) becomes the upper bound on what the grouping step contributes to the overall cost. It can contribute less if resolving the inner query already yields the data in sorted order.
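The already-sorted case can be sketched as follows, assuming the rows arrive ordered by the grouping key. Because equal keys are adjacent, only the current group has to be kept in memory and no hash table is needed. The name `stream_aggregate` is hypothetical, and the sketch relies on Python's `itertools.groupby`, which groups consecutive equal keys only.

```python
from itertools import groupby
from operator import itemgetter

def stream_aggregate(sorted_rows):
    """Sum values per key, assuming rows are already sorted by key."""
    result = []
    # groupby only merges *adjacent* equal keys, hence the sortedness assumption
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        result.append((key, sum(value for _, value in group)))
    return result

sorted_rows = [("a", 1), ("a", 3), ("b", 2), ("b", 5), ("c", 4)]
print(stream_aggregate(sorted_rows))  # [('a', 4), ('b', 7), ('c', 4)]
```

If the input were not sorted, sorting it first would cost O(n log n), which would then dominate the linear pass.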




Answer 2:


As for LINQ, I assume you want to know the complexity of the LINQ to Objects group by (Enumerable.GroupBy).

Having checked the implementation with ILSpy (.NET Framework 4 series), it appears to me to be O(n).

It enumerates the source collection once. For each element, it computes the grouping key and checks whether that key is already present in a hash table that maps keys to lists of elements, adding the key if it is missing. It then appends the element to the list stored under that key.
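The steps above can be sketched in Python as follows. This is only an illustration of the described approach, not the actual .NET source; the names `group_by` and `key_selector` are chosen for this example.

```python
def group_by(source, key_selector):
    """Group elements by key in one pass, using a hash table of lists."""
    groups = {}  # key -> list of elements; dicts keep insertion order (Python 3.7+)
    for element in source:
        key = key_selector(element)   # compute the grouping key
        if key not in groups:         # add the key if it is missing
            groups[key] = []
        groups[key].append(element)   # append to the list for that key
    return groups

words = ["apple", "avocado", "banana", "blueberry", "cherry"]
print(group_by(words, lambda w: w[0]))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

With hash lookups and list appends both taking constant time on average, n elements cost O(n) overall, in line with the answer's estimate.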



Source: https://stackoverflow.com/questions/4889669/whats-the-asymptotic-complexity-of-groupby-operation
