How to implement a tag system

悲&欢浪女 2020-11-28 00:19

I was wondering what the best way is to implement a tag system, like the one used on SO. I have been thinking about this, but I can't come up with a good, scalable solution.

7 answers

遥遥无期 (original poster)

2020-11-28 00:54

I would like to suggest an optimised MySQLicious layout for better performance. First, though, here is the drawback of the Toxi (3-table) solution.

If you have a million questions with 5 tags each, there will be 5 million entries in the tagmap table. A tag search first has to filter out, say, 10 thousand tagmap entries for the tag, and then find the questions matching those 10 thousand entries. If the article id is a simple numeric value, that is fine, but if it is something like a UUID (a 32-character varchar), every comparison is larger, even though the column is indexed.
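For reference, the Toxi layout described above can be sketched roughly like this, using Python's built-in sqlite3 as a stand-in for MySQL. The table and column names (question, tag, tagmap), the sample rows, and the helper questions_with_all_tags are my own illustrative assumptions, not taken from any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tag      (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- One row per (question, tag) pair: 5 tags per question means 5 rows here.
    CREATE TABLE tagmap (
        question_id INTEGER NOT NULL REFERENCES question(id),
        tag_id      INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (question_id, tag_id)
    );
    CREATE INDEX tagmap_by_tag ON tagmap (tag_id, question_id);
""")
conn.executemany("INSERT INTO question VALUES (?, ?)",
                 [(1, "Join is slow"), (2, "Index on UUID column")])
conn.executemany("INSERT INTO tag VALUES (?, ?)",
                 [(1, "sql"), (2, "mysql"), (3, "indexing")])
conn.executemany("INSERT INTO tagmap VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (2, 3)])


def questions_with_all_tags(conn, names):
    """Return ids of questions that carry every tag in `names`."""
    names = sorted(set(names))
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT q.id
        FROM question q
        JOIN tagmap m ON m.question_id = q.id
        JOIN tag t    ON t.id = m.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY q.id
        HAVING COUNT(DISTINCT t.id) = ?
        ORDER BY q.id
    """
    return [row[0] for row in conn.execute(sql, (*names, len(names)))]


print(questions_with_all_tags(conn, ["sql", "mysql"]))
```

The two joins in the query are the part the rest of this answer tries to avoid.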

My solution:

Whenever a new tag is created, increment a counter (base 10) and convert that counter to base 64. Each tag name now has a base-64 id; pass this id to the UI along with the name. This way ids are at most two characters long until 4095 tags have been created in the system. Then concatenate the ids of a question's tags into a tag column on the question table, adding a delimiter and keeping the ids sorted.

The question table then carries one extra column holding the sorted, delimited tag ids.
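A minimal sketch of the id scheme, assuming "base64" here means writing the counter as a base-64 number (not Base64-encoding its bytes). The alphabet, the function names encode_tag_id and build_tag_column, and the choice to wrap every id in its own pair of "|" delimiters are my assumptions for illustration.

```python
import string

# Assumed alphabet: the standard Base64 character set. It contains neither the
# "|" delimiter nor the LIKE wildcards "%" and "_".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_tag_id(counter):
    """Write a non-negative counter as a base-64 number using ALPHABET."""
    if counter < 0:
        raise ValueError("counter must be non-negative")
    if counter == 0:
        return ALPHABET[0]
    digits = []
    while counter:
        counter, remainder = divmod(counter, 64)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def build_tag_column(tag_ids, delimiter="|"):
    """Sort the ids and wrap each one in its own pair of delimiters."""
    return "".join(f"{delimiter}{tag_id}{delimiter}"
                   for tag_id in sorted(set(tag_ids)))


print(encode_tag_id(4095))                # last counter that fits in two characters
print(build_tag_column(["f", "a", "c"]))  # value for the tag column
```

Counters 0 through 4095 fit in two characters because 64 * 64 = 4096; counter 4096 is the first to need three.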

When querying, match on the id rather than the real tag name. Because the ids are stored sorted, an AND condition over several tags can be written as a single, more efficient pattern (LIKE '%|a|%|c|%|f|%').

Note that a delimiter on one side only (such as a single space) is not enough: each tag needs a delimiter on both sides to tell apart tags like sql and mysql, because LIKE "%sql%" returns mysql results as well. The pattern should be LIKE "%|sql|%".

I know this search cannot use an index, but you may still have indexes on other article columns, such as author or dateTime, to narrow the rows first; otherwise the query leads to a full table scan.

Finally, with this solution no inner join is required, so a million records never have to be compared against 5 million records on a join condition.
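Here is a small sketch of the lookup, again with Python's sqlite3 standing in for MySQL. The table layout, the sample rows, and the helpers like_pattern and find are assumptions for illustration. One detail worth checking in your own setup: the combined pattern only matches adjacent tags if every tag is wrapped in its own pair of delimiters (|a||c||f|). If neighbours share one delimiter (|a|c|f|), the "|" after a is consumed by the first match and |c| is never found.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, tags TEXT NOT NULL)")
conn.executemany("INSERT INTO question VALUES (?, ?)", [
    (1, "|a||c||f|"),
    (2, "|a||f|"),
    (3, "|ab||c||f|"),   # "ab" must not be mistaken for "a"
])


def like_pattern(tag_ids):
    """Build one LIKE pattern that requires every id, in sorted order."""
    return "%" + "%".join(f"|{tag_id}|" for tag_id in sorted(set(tag_ids))) + "%"


def find(conn, tag_ids):
    """Return ids of questions whose tag column contains all given ids."""
    rows = conn.execute(
        "SELECT id FROM question WHERE tags LIKE ? ORDER BY id",
        (like_pattern(tag_ids),),
    )
    return [row[0] for row in rows]


def sql_like(conn, value, pattern):
    """Evaluate `value LIKE pattern` in the database and return a bool."""
    return bool(conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0])


print(like_pattern(["f", "a", "c"]))
print(find(conn, ["a", "c", "f"]))
# The same pattern against a column whose tags share a single delimiter:
print(sql_like(conn, "|a|c|f|", like_pattern(["a", "c", "f"])))
```

Also keep in mind that whether LIKE distinguishes "a" from "A" depends on the database and collation (SQLite's LIKE is case-insensitive for ASCII by default), which matters if the ids mix upper and lower case.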
