MySQL full-text indexing limitations?

Posted by 情到浓时终转凉″ on 2019-12-19 09:03:03

Question


What are the limitations, gotchas, antipatterns, or pitfalls?

It seems pretty attractive: apparently you can create a search engine with almost no work. But it cannot be without its problems...

What are your experiences?


Answer 1:


In my opinion, the greatest drawback is that MySQL full-text indexing is limited to MyISAM tables (this applies to versions before MySQL 5.6, which added full-text indexes for InnoDB). Unlike InnoDB tables, MyISAM tables lack a lot of important features, e.g. transactions.




Answer 2:


it cannot be without its problems...

It certainly isn't!

Any search term composed purely of blocked words will silently fail. Words can be blocked due to min/max length restrictions and/or the stopword file.

I found the default stopword file far too aggressive; it was preventing many valid searches. The default minimum word length of 4 was also kicking in very often for acronyms people might want to search for. I reduced ft_min_word_len to 3 and removed the stoplist completely (ft_stopword_file=''). Doc: http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html

You could also examine the search query to see if it contains only <4-letter words, and fall back to a LIKE search in that case. There's no such easy way to get around the stoplist at an application level.
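The fallback check could be sketched roughly like this at the application level. This is only a minimal sketch under assumptions: the table and column names (`articles`, `body`) are hypothetical, `FT_MIN_WORD_LEN` is assumed to match the server's ft_min_word_len, the word splitting only approximates what MySQL does, and `%s` is the placeholder style used by common MySQL drivers for Python.

```python
import re

# Assumed to mirror the server's ft_min_word_len setting (default 4).
FT_MIN_WORD_LEN = 4


def split_words(query):
    """Roughly split a search string into words (approximation of MySQL's parser)."""
    return re.findall(r"[\w']+", query)


def needs_like_fallback(query, min_len=FT_MIN_WORD_LEN):
    """True if the query has words, and every one is too short to be indexed."""
    words = split_words(query)
    return bool(words) and all(len(w) < min_len for w in words)


def build_search(query, column="body", table="articles"):
    """Return (sql, params); table and column names are hypothetical."""
    if needs_like_fallback(query):
        # Escape LIKE wildcards so the user's text is matched literally.
        escaped = (
            query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        )
        sql = f"SELECT * FROM {table} WHERE {column} LIKE %s"
        return sql, ["%" + escaped + "%"]
    sql = f"SELECT * FROM {table} WHERE MATCH({column}) AGAINST (%s)"
    return sql, [query]
```

With this, a query such as "SQL API" would go through LIKE, while "SQL indexing" would still use the full-text index (and the short word "SQL" would simply be ignored by it).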

The selection of ‘word characters’ may not meet your needs, and it's tricky to change. For example searching for “Terry” won't match “Terry's”. In general there is no support for any kind of stemming, so “biscuit” won't match “biscuits” either.

Finally, as cg mentioned, there is no support for InnoDB. In this day and age, you don't want to be putting all your data in a MyISAM table.

If you have the storage to spare, what you can do is put the main, canonical version of the data in an InnoDB table, and then create a separate MyISAM table that contains a copy of the freetext content, purely for use as searchbait. You do have to update both tables on a change, but if the MyISAM table loses integrity then at least you only lose the ability to search over the rows concerned, instead of bumming up the real live data and getting application errors.
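As a rough sketch of that two-table arrangement from the application side: the names `articles` (InnoDB, canonical) and `articles_search` (MyISAM, with a FULLTEXT index on `content`) are hypothetical, and `cursor` is assumed to be a DB-API cursor from a MySQL driver that uses `%s` placeholders.

```python
def update_article(cursor, article_id, title, body):
    """Write a change to both the canonical table and its search copy."""
    # Canonical data lives in the (hypothetical) InnoDB table.
    cursor.execute(
        "UPDATE articles SET title = %s, body = %s WHERE id = %s",
        (title, body, article_id),
    )
    # The MyISAM copy holds only the free text, purely for searching.
    cursor.execute(
        "UPDATE articles_search SET content = %s WHERE id = %s",
        (title + "\n" + body, article_id),
    )


def search_articles(cursor, terms):
    """Search the MyISAM copy, but return rows from the canonical table."""
    cursor.execute(
        "SELECT a.id, a.title FROM articles_search AS s "
        "JOIN articles AS a ON a.id = s.id "
        "WHERE MATCH(s.content) AGAINST (%s)",
        (terms,),
    )
    return cursor.fetchall()
```

Note that the second UPDATE is not covered by the InnoDB transaction, which is exactly the trade-off described above: if it fails, only search results for that row go stale.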

You can then, if you have the cycles to spare, implement your own text processing on the searchbait and query words to get around some of the above limitations. For example you can escape characters you want to be word-characters, remove characters you don't want to be word-characters, and perform simple manual English stemming.
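A minimal sketch of that kind of preprocessing, to be applied to both the searchbait text and the query words. Everything here is an assumption for illustration: the `ESCAPES` table is hypothetical, the stemming rules are deliberately naive, and the word splitting relies on letters, digits, and underscores being treated as word characters.

```python
import re

# Hypothetical escapes: turn characters that are not word characters
# into sequences that are, so that e.g. "C++" survives as one word.
ESCAPES = {"+": "_plus_", "#": "_sharp_"}


def stem(word):
    """Very naive English stemming: drop possessives and simple plurals."""
    word = re.sub(r"['’]s?$", "", word)
    if len(word) > 3 and word.endswith("ies"):
        return word[:-3] + "y"
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(text):
    """Lowercase, escape chosen characters, split into words, stem each word."""
    text = text.lower()
    for char, replacement in ESCAPES.items():
        text = text.replace(char, replacement)
    words = re.findall(r"[\w'’]+", text)
    return " ".join(stem(w) for w in words)
```

Because the same function runs on both sides, "Terry's" and "Terry" end up as the same token, as do "biscuits" and "biscuit". A rule set this simple will also mangle some words (for example "this"), so a real stemmer may be worth the extra cycles.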




Answer 3:


For large tables, you will need to increase your buffer size and cache limit in your MySQL config file.

Also the MATCH() columns you use in the search need to be the same as the columns in the index.
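One way to keep the two in sync is to generate both the index definition and the query from a single column list. This is just a sketch: the table name, the `ft_` index naming, and the column names are hypothetical, and `%s` is assumed to be the driver's placeholder.

```python
# Hypothetical column list shared by the index definition and the query.
FULLTEXT_COLUMNS = ("title", "body")


def create_index_sql(table, columns=FULLTEXT_COLUMNS):
    """Build the statement that creates the FULLTEXT index."""
    cols = ", ".join(columns)
    return f"ALTER TABLE {table} ADD FULLTEXT INDEX ft_{table} ({cols})"


def match_sql(table, columns=FULLTEXT_COLUMNS):
    """Build a search query whose MATCH() list equals the indexed columns."""
    cols = ", ".join(columns)
    return f"SELECT * FROM {table} WHERE MATCH({cols}) AGAINST (%s)"
```

If you also need to search `title` alone, that needs its own FULLTEXT index on just that column.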




Answer 4:


In addition to bobince's very good answer, there is an article in MySQL documentation which talks about restrictions of full-text. Hope this helps. http://dev.mysql.com/doc/refman/5.0/en/fulltext-restrictions.html (Olafur Waage talked about one of these already)



Source: https://stackoverflow.com/questions/609935/mysql-full-text-indexing-limitations
