mysql hierarchy storage with large trees

青春壹個敷衍的年華 提交于 2019-11-29 15:25:21

问题


I don't know how to store my hierarchical data in my innoDB-table.

I've read a lot about the disadvantages of the method of storing the parent_id in each row. But now the problem is, that I have a very large database (~50 million rows). The hierarchy is mostly not very deep (3-6 levels).

Many websites advise taking the "Nested Set Model" as a better alternative to the parent-id-storing-method. But there are always changes being made (UPDATE, INSERT etc.) by the users of the website and because of the size of my table, this would take too much time (since changes in the "Nested Set Model" have a very low performance).

So my question is: How do you store efficiently large hierarchical data with many update/insert commands? (Also blocking the whole table is not an option [-> innoDB-table])


回答1:


The Nested Sets design is definitely difficult when you need to make frequent updates to the tree. You end up having to renumber large parts of the tree.

One suggestion for mitigating this is to use floating-point numbers instead of integers. If you insert a new node in the tree, it's relatively easy to find some FLOAT numbers in between the nested set numbers of the parent of the new node. You may eventually get to the limits of the precision of a floating-point number, but since your tree isn't very deep that won't happen for a long time.

Another technique which I have written about I call Closure Table. This method of storing hierarchies makes it much easier to insert/update/delete nodes in a large tree without needing to update a lot of your tree. And you can still query the whole tree or any subtree in a single non-recursive SQL query.

To read more about Closure Table, see:

  • What is the most efficient/elegant way to parse a flat table into a tree?
  • Models for Hierarchical Data with SQL and PHP
  • Moving Subtrees in Closure Table Hierarchies
  • SQL Antipatterns: Avoiding the Pitfalls of Database Programming

Re your comment:

Adjacency List is simple, has a minimum of redundancy, and it supports FK relationships, which Nested Sets does not.  Adjacency List supports querying a whole tree of arbitrary depth if you use recursive queries. But MySQL does not support recursive queries.

If you need to query only immediate parent-child relationships (i.e. one level of depth), or otherwise query only trees of fixed depth, then Adjacency List is fine.




回答2:


For hierarchical data I like to keep the hierarchy separate. For instance, if we were dealing with an employee hierarchy, I generally do something like this -

create table employee (
    id serial primary key,
    name varchar(50));

create table roster (
    id serial primary key,
    employee_id int references employee (id),
    supervisor_id int references employee (id));

This can be extended to provide historical hierarchies by adding either a row_date or start_date and stop_date fields to the roster table.

Be sure that you have unique constraints and triggers applied where applicable to enforce business rules.



来源:https://stackoverflow.com/questions/14114280/mysql-hierarchy-storage-with-large-trees

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!