Question
Based on the MongoDB documentation:
The ensureIndex() function only creates the index if it does not exist.
Once a collection is indexed on a key, random access on query expressions which match the specified key is fast. Without the index, MongoDB has to go through each document, checking the value of the specified key in the query:
db.things.find({j:2}); // fast - uses index
db.things.find({x:3}); // slow - has to check all because 'x' isn't indexed
Does that mean the first query runs in Θ(1) time and the second in Θ(n)?
Answer 1:
MongoDB uses a B-tree for indexing, as can be seen in the source code for index.cpp. This means that a lookup is O(log N), where N is the number of documents. Equivalently, it is O(D), where D is the depth of the tree (assuming the tree is reasonably balanced). D is usually very small because each node has many children.
The number of children per node in MongoDB is about 8192 (btree.h), so an index over a few billion documents may fit in a tree with only 3 levels. The base of the logarithm is therefore not 2 (as in a binary tree) but 8192, and log_8192 N grows extremely slowly.
Because of this, B-tree lookups are usually regarded as constant time, O(1), in practice.
Another good reason for keeping many children in each node is that the index is stored on disk. You want to try to utilize all the space in a disk block for one node to improve cache performance and reduce disk seeks. B-trees have very good disk performance because you only need to visit very few nodes to find what you are looking for.
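The depth arithmetic in this answer can be checked with a short sketch. It assumes an idealized tree in which every node has exactly 8192 children (the figure quoted above); `levelsNeeded` is a hypothetical helper written for this illustration, not part of MongoDB.

```javascript
// Smallest number of levels whose combined fanout can address n keys,
// assuming every node has exactly `fanout` children (an idealized model).
function levelsNeeded(n, fanout) {
  let levels = 1;
  let capacity = fanout; // keys addressable with a single level
  while (capacity < n) {
    capacity *= fanout;
    levels += 1;
  }
  return levels;
}

const fanout = 8192; // figure quoted in the answer; treated as an assumption here
for (const n of [1e6, 1e9, 1e12]) {
  // Columns: document count, levels with fanout 8192, levels with fanout 2
  console.log(n, levelsNeeded(n, fanout), levelsNeeded(n, 2));
}
```

Under these assumptions, a billion keys need 3 levels at fanout 8192, compared with 30 levels for a binary tree.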
Answer 2:
Mongo indexes are B-trees, so an indexed lookup is O(log n). Unindexed lookups are O(n).
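To make the difference concrete, here is a minimal sketch that counts how many entries each strategy examines. It does not use MongoDB at all: the "index" is modeled as a sorted array searched by binary search, which is a simplification of a B-tree, and `scanLookup` and `indexLookup` are hypothetical names chosen for this example.

```javascript
// Unindexed lookup: examine documents one by one until a match is found.
function scanLookup(docs, key, value) {
  let examined = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc[key] === value) return { found: doc, examined };
  }
  return { found: null, examined };
}

// Indexed lookup, modeled as binary search over [value, doc] pairs sorted by value.
function indexLookup(index, value) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined += 1;
    const [k, doc] = index[mid];
    if (k === value) return { found: doc, examined };
    if (k < value) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: null, examined };
}

const docs = Array.from({ length: 100000 }, (_, i) => ({ j: i, x: i }));
const indexOnJ = docs.map((d) => [d.j, d]); // already sorted because j increases with i

console.log(scanLookup(docs, "x", 99999).examined); // examines all 100000 documents
console.log(indexLookup(indexOnJ, 99999).examined); // at most 17 entries
```

The scan grows linearly with the collection size, while the modeled index lookup grows with its logarithm.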
Answer 3:
A B-tree lookup is O(log N); Emil Vikström's answer of O(1) is simply incorrect. Even under his own assumption it is wrong: it ignores the time needed to search among the 8192 children of each node. In other words, if K is the number of children per node and D is the depth of the tree, the time complexity can be re-expressed as O(D · log K), provided the children are organized as a BST or a similar logarithmic structure. Since D = log_K N, this is equivalent to O(log N).
Source: https://stackoverflow.com/questions/13239896/runtime-of-using-indexing-in-mongodb
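The point about searching within each node can be illustrated by counting key comparisons. This is a sketch under stated assumptions: an idealized tree with exactly K = 8192 keys per node, and one binary search per level. `ceilLog` and `lookupComparisons` are hypothetical helpers, not MongoDB functions.

```javascript
// Smallest integer e such that base^e >= n.
// An integer loop is used to avoid floating-point rounding in Math.log.
function ceilLog(n, base) {
  let e = 0;
  let power = 1;
  while (power < n) {
    power *= base;
    e += 1;
  }
  return e;
}

// Comparisons for one lookup in an idealized B-tree with k keys per node:
// one binary search (about log2 k comparisons) at each of the D levels.
function lookupComparisons(n, k) {
  const depth = ceilLog(n, k);
  const perNode = ceilLog(k, 2);
  return { depth, perNode, total: depth * perNode };
}

const n = Math.pow(8192, 3); // about 5.5e11 keys, exactly three full levels
console.log(lookupComparisons(n, 8192)); // { depth: 3, perNode: 13, total: 39 }
console.log(ceilLog(n, 2)); // 39, the same count as a plain binary search over n keys
```

In this model the depth is small, but the total comparison count still matches log2 N; the large fanout reduces the number of nodes visited, not the asymptotic number of comparisons.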