问题
I need to implement Categorization and Sub-Categorization on something which is a bit similar to golden pages.
Assume I have the following table:
Category Table
CategoryId, Title
10, Home
20, Business
30, Hobbies
I have two options to code the sub-categorization.
OPTION 1 - Subcategory Id is unique within Category ONLY:
Sub Category Table
CategoryId, SubCategoryId, Title
10, 100, Gardening
10, 110, Kitchen
10, 120, ...
20, 100, Development
20, 110, Marketing
20, 120, ...
30, 100, Soccer
30, 110, Reading
30, 120, ...
OPTION 2 - Subcategory Id is unique OVERALL:
Sub Category Table
CategoryId, SubCategoryId, Title
10, 100, Gardening
10, 110, Kitchen
10, 120, ...
20, 130, Development
20, 140, Marketing
20, 150, ...
30, 160, Soccer
30, 170, Reading
30, 180, ...
Option 2 sounds like it is easier to fetch rows from table
For example: SELECT BizTitle FROM tblBiz WHERE SubCatId = 170
whereas using Option 1 I'd have to write something like this:
SELECT BizTitle FROM tblBiz WHERE CatId = 30 AND SubCatId = 170
i.e., containing an extra AND
However, Option 1 is easier to maintain manually (when I need to update and insert new subcategories etc. and it is more pleasant to the eye in my opinion.
Any thoughts about it? Does Option 2 worth the trouble in terms of efficiency? Is there any design patters related with this common issue?
回答1:
I would use this structure:
ParentId, CategoryId, Title
null, 1, Home
null, 2, Business
null, 3, Hobbies
1, 4, Gardening
1, 5, Kitchen
1, 6, ...
2, 7, Development
2, 8, Marketing
2, 9, ...
3, 10, Soccer
3, 11, Reading
3, 12, ...
In detail:
- only use one table, which references itself, so that you can have unlimited depth of categories
- use technical ids (using
IDENTITY
, or similar), so that you can have more than 10 subcategories - if required add a human readable column for category-numbers as separate field
As long as you are only using two levels of categories you can still select like this:
SELECT BizTitle FROM tblBiz WHERE ParentId = 3 AND CategoryId = 11
The new hierarchyid
feature of SQL server also looks quite promising: https://msdn.microsoft.com/en-us/library/bb677173.aspx
What I don't like about the Nested Set Model:
- Inserting and deleting items in the Nested Set Model is a quite comlicated thing and requires expensive locks.
- One can easily create inconsistencies which is prohibited, if you use the
parent
field in combination with a foreign key constraint.- Inconsistencies can appear, if
rght
is lower thanlft
- Inconsistencies can appear, if a value apprears in several
rght
orlft
fields - Inconsistencies can appear, if you create gaps
- Inconsistencies can appear, if you create overlaps
- Inconsistencies can appear, if
- The Nested Set Model is in my opinion more complex and therefore not as easy to understand. This is absolutely subjective, of course.
- The Nested Set Model requires two fields, instead of one - and so uses more disk space.
回答2:
Managing hierarchical data has some ways. One of the most important one is Nested Set Model
.
See here for implementation.
Even some content management systems like joomla, use this structure.
回答3:
I'd recommend going with option 1 - keep sub-category unique within category. Let's say I have two categories of unrelated items:
- Fruits
- Colors
For each I want a subcategory
Fruits
- Apple
- Orange
Colors
- White
- Orange
Notice that Orange is a sub-category within each category. Although the name is the same, it's function is very different. (Let's not get into the possibility that Orange fruit is orange in color)
With this design, if someone changes their mind and wants to rename Orange to Oranges, fine. It's easy to change without affecting Orange sub-category under Colors.
If your UI is built in such a way that Marketing can control subcategories of Colors whereas Production can control subcategories of Fruits, this design will allow Marketing to work with their subcategories without stepping over Production's subcategories.
来源:https://stackoverflow.com/questions/31623575/database-design-categories-and-sub-categories