Optimized SQL for tree structures

耶瑟儿～ 2020-11-28 21:39

How would you get tree-structured data from a database with the best performance? For example, say you have a folder-hierarchy in a database. Where the folder-database-row h

11 answers

误落风尘 (original poster)

2020-11-28 22:46
There are several common kinds of queries against a hierarchy. Most other kinds of queries are variations on these.
1. From a parent, find all children.
  
  a. To a specific depth. For example, given my immediate parent, all children to a depth of 1 will be me and my siblings.
  
  b. To the bottom of the tree.
2. From a child, find all parents.
  
  a. To a specific depth. For example, parents to a depth of 1 is just my immediate parent.
  
  b. To an unlimited depth.
The (a) cases (a specific depth) are easier in SQL. The special case (depth = 1) is trivial in SQL. A fixed depth greater than 1 is harder, but can still be done with a fixed number of joins, one per level. The (b) cases, with indefinite depth (to the top or to the bottom of the tree), are really hard.
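As a rough illustration of the fixed-depth cases, here is a minimal sketch using Python's built-in `sqlite3` module. The `folder` table and its `id`, `parent_id`, and `name` columns are assumptions made for the example (a plain adjacency list), not something the question specifies.

```python
import sqlite3

# Hypothetical adjacency-list schema: each folder row stores its parent's id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folder (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES folder(id),
        name      TEXT NOT NULL
    );
    INSERT INTO folder (id, parent_id, name) VALUES
        (1, NULL, 'root'),
        (2, 1,    'docs'),
        (3, 1,    'src'),
        (4, 2,    'drafts'),
        (5, 4,    'old');
""")


def children(conn, folder_id):
    """Case 1a at depth 1: direct children, a single filtered SELECT."""
    rows = conn.execute(
        "SELECT name FROM folder WHERE parent_id = ? ORDER BY id", (folder_id,)
    )
    return [name for (name,) in rows]


def grandchildren(conn, folder_id):
    """Case 1a at depth 2: one self-join for the extra level."""
    rows = conn.execute(
        """
        SELECT gc.name
        FROM folder AS c
        JOIN folder AS gc ON gc.parent_id = c.id
        WHERE c.parent_id = ?
        ORDER BY gc.id
        """,
        (folder_id,),
    )
    return [name for (name,) in rows]


def grandparent(conn, folder_id):
    """Case 2a at depth 2: walk up two levels with two joins."""
    row = conn.execute(
        """
        SELECT gp.name
        FROM folder AS n
        JOIN folder AS p  ON p.id  = n.parent_id
        JOIN folder AS gp ON gp.id = p.parent_id
        WHERE n.id = ?
        """,
        (folder_id,),
    ).fetchone()
    return row[0] if row else None
```

Each additional level costs one more join, which is why this only works when the depth is known in advance.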

If your tree is HUGE (millions of nodes), then you're in a world of hurt no matter what you try to do.

If your tree is under a million nodes, just fetch it all into memory and work on it there. Life is much simpler in an OO world. Simply fetch the rows and build the tree as the rows are returned.
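A minimal sketch of the fetch-everything approach, again with Python's `sqlite3`. The `folder` table, its columns, and the `Node` class are hypothetical names chosen for the example; the point is only that one flat SELECT plus a dictionary lookup is enough to link the tree.

```python
import sqlite3
from dataclasses import dataclass, field

# Hypothetical adjacency-list table, populated with a tiny sample tree.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE folder (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO folder (id, parent_id, name) VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "docs"), (3, 1, "src"), (4, 2, "drafts"), (5, 4, "old")],
)


@dataclass
class Node:
    id: int
    name: str
    children: list = field(default_factory=list)


def load_tree(conn):
    """Fetch every row once, then link children to parents in memory.

    Returns (list of root nodes, dict of all nodes keyed by id).
    """
    nodes = {}
    links = []  # (child_id, parent_id) pairs, resolved after all nodes exist
    for node_id, parent_id, name in conn.execute(
        "SELECT id, parent_id, name FROM folder ORDER BY id"
    ):
        nodes[node_id] = Node(node_id, name)
        links.append((node_id, parent_id))

    roots = []
    for node_id, parent_id in links:
        if parent_id is None:
            roots.append(nodes[node_id])
        else:
            nodes[parent_id].children.append(nodes[node_id])
    return roots, nodes


def descendants(node):
    """Case 1b: every node below `node`, depth-first."""
    for child in node.children:
        yield child
        yield from descendants(child)


roots, nodes = load_tree(conn)
```

Once the tree is linked, the unlimited-depth queries become ordinary traversals, for example `[n.name for n in descendants(nodes[2])]`.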

If you have a huge tree, you have two choices.
- Recursive cursors to handle the unlimited fetching. This means the maintenance of the structure is O(1): just update a few nodes and you're done. However, fetching is O(n*log(n)) because you have to open a cursor for each node with children.
- Clever "heap numbering" algorithms can encode the parentage of each node. Once each node is properly numbered, a trivial SQL SELECT can be used for all four types of queries. Changes to the tree structure, however, require renumbering the nodes, making the cost of a change fairly high compared to the cost of retrieval.
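The answer does not say which numbering scheme it has in mind. One well-known encoding of this kind is the nested set model, where a depth-first walk gives every node a left and right bound and a node's descendants are exactly the rows whose bounds fall inside its own. The sketch below assumes that model; the `folder` table and the `lft` and `rgt` column names are hypothetical.

```python
import sqlite3

# Hypothetical table: adjacency list plus nested-set bounds (lft, rgt).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE folder (id INTEGER PRIMARY KEY, parent_id INTEGER,"
    " name TEXT, lft INTEGER, rgt INTEGER)"
)
conn.executemany(
    "INSERT INTO folder (id, parent_id, name) VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "docs"), (3, 1, "src"), (4, 2, "drafts"), (5, 4, "old")],
)


def renumber(conn):
    """Assign lft/rgt with a depth-first walk.

    This rewrites every row, which is the expensive step needed after a
    structural change.
    """
    kids = {}
    for node_id, parent_id in conn.execute(
        "SELECT id, parent_id FROM folder ORDER BY id"
    ):
        kids.setdefault(parent_id, []).append(node_id)

    counter = 0

    def visit(node_id):
        nonlocal counter
        counter += 1
        lft = counter
        for child in kids.get(node_id, []):
            visit(child)
        counter += 1
        conn.execute(
            "UPDATE folder SET lft = ?, rgt = ? WHERE id = ?",
            (lft, counter, node_id),
        )

    for root in kids.get(None, []):
        visit(root)


def descendants(conn, node_id):
    """Case 1b: all rows whose interval lies strictly inside the node's."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM folder AS a
        JOIN folder AS d ON d.lft > a.lft AND d.rgt < a.rgt
        WHERE a.id = ?
        ORDER BY d.lft
        """,
        (node_id,),
    )
    return [name for (name,) in rows]


def ancestors(conn, node_id):
    """Case 2b: all rows whose interval strictly encloses the node's."""
    rows = conn.execute(
        """
        SELECT a.name
        FROM folder AS d
        JOIN folder AS a ON a.lft < d.lft AND a.rgt > d.rgt
        WHERE d.id = ?
        ORDER BY a.lft
        """,
        (node_id,),
    )
    return [name for (name,) in rows]


renumber(conn)
```

Both unlimited-depth queries are a single SELECT with no recursion. The depth-limited variants would additionally need a stored depth column (or a count of enclosing intervals), which is left out here to keep the sketch short.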