Algorithm for autocomplete?

暖寄归人 2020-11-28 00:45

I am referring to the algorithm that is used to give query suggestions when a user types a search term in Google.

I am mainly interested in: 1. Most important resul

9 answers
  •  予麋鹿 (OP)
    2020-11-28 01:12

    A good solution to this problem needs more than a ternary search tree. N-grams and shingles (phrases) are needed as well, and word-boundary errors have to be detected: "hell o" should become "hello", and "whitesocks" should become "white socks". These are preprocessing steps; if you don't preprocess the data properly, you won't get valuable search results. Ternary search trees are a useful component for deciding what counts as a word, and for related-word guessing when a typed word isn't a valid word in the index.
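
    To make the ternary search tree and the word-boundary step concrete, here is a minimal Python sketch. The names `TernarySearchTree`, `complete`, and `repair_boundaries` are my own for illustration, the five-word lexicon is a toy, and the join/split rules are only one simple way to do the preprocessing described above.

```python
class Node:
    __slots__ = ("ch", "lo", "eq", "hi", "is_word")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_word = False


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, word):
        if word:
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, i):
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_word = True
        return node

    def _find(self, prefix):
        """Return the node matching the last character of prefix, or None."""
        node, i = self.root, 0
        while node is not None:
            ch = prefix[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(prefix):
                    return node
                node = node.eq
        return None

    def __contains__(self, word):
        node = self._find(word) if word else None
        return node is not None and node.is_word

    def complete(self, prefix):
        """All stored words starting with prefix, in sorted order."""
        node = self._find(prefix) if prefix else None
        if node is None:
            return []
        out = [prefix] if node.is_word else []
        self._collect(node.eq, prefix, out)
        return out

    def _collect(self, node, prefix, out):
        if node is None:
            return
        self._collect(node.lo, prefix, out)
        if node.is_word:
            out.append(prefix + node.ch)
        self._collect(node.eq, prefix + node.ch, out)
        self._collect(node.hi, prefix, out)


def repair_boundaries(tokens, lexicon):
    """Join or split tokens so that the pieces are known words, where possible."""
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # Join: "hell" + "o" -> "hello" when the merged form is a word
        # and at least one of the two pieces is not.
        if nxt and tok + nxt in lexicon and not (tok in lexicon and nxt in lexicon):
            out.append(tok + nxt)
            i += 2
            continue
        # Split: "whitesocks" -> "white", "socks" at the first cut that works.
        pieces = [tok]
        if tok not in lexicon:
            for cut in range(1, len(tok)):
                if tok[:cut] in lexicon and tok[cut:] in lexicon:
                    pieces = [tok[:cut], tok[cut:]]
                    break
        out.extend(pieces)
        i += 1
    return out


tst = TernarySearchTree()
for w in ["hello", "hell", "help", "white", "socks"]:
    tst.insert(w)

print(tst.complete("hel"))                      # ['hell', 'hello', 'help']
print(repair_boundaries(["hell", "o"], tst))    # ['hello']
print(repair_boundaries(["whitesocks"], tst))   # ['white', 'socks']
```

    The same tree serves both purposes here: `complete` answers prefix queries, while `in` checks tell the preprocessing step whether a piece is a word.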

    The Google algorithm performs phrase suggestion and correction. It also seems to have some concept of context. Compare "weatherforcst", "monsoonfrcst", and "deskfrcst": my guess is that, behind the scenes, the suggestion rankings are adjusted based on the first word encountered. "Forecast" and "weather" are related words, so "forecast" gets a high rank in the Did-You-Mean guess.
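
    As an illustration of that guess only (not Google's actual method), a context-aware ranking could combine string similarity with a relatedness score for the first word. In this sketch the `RELATED` table, its values, the `weight` parameter, and the `rank` function are all hypothetical; string similarity comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical relatedness scores (context word, candidate) -> score in [0, 1].
# A real system would derive these from query logs or co-occurrence counts.
RELATED = {
    ("weather", "forecast"): 0.9,
    ("monsoon", "forecast"): 0.8,
}


def rank(context, typed, vocabulary, weight=0.5):
    """Order candidate words by spelling similarity plus a context bonus."""
    scored = []
    for word in vocabulary:
        similarity = SequenceMatcher(None, typed, word).ratio()
        bonus = weight * RELATED.get((context, word), 0.0)
        scored.append((similarity + bonus, word))
    # Highest score first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored]


vocabulary = ["forecast", "first", "frost"]
print(rank("weather", "frcst", vocabulary))
print(rank("desk", "frcst", vocabulary))
```

    With these toy values, "forecast" comes first after "weather" because of the context bonus, while after "desk" the purely spelling-based candidates outrank it.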

    In summary, the components are: word partials (n-grams), phrase terms (shingles), word proximity (a word-clustering index), and a ternary search tree (word lookup).
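
    For the first two components, a minimal sketch of character n-grams and word shingles follows. The function names, the `$` padding character, and the default sizes are assumptions for illustration; candidate lookup by counting shared n-grams is one common way to use such an index, not necessarily what any particular search engine does.

```python
from collections import Counter, defaultdict


def char_ngrams(word, n=3):
    """Character n-grams of a word, padded with '$' to mark its boundaries."""
    padded = f"${word}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def shingles(text, k=2):
    """Overlapping k-word phrases from a piece of text."""
    words = text.lower().split()
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def build_ngram_index(vocabulary, n=3):
    """Map each n-gram to the set of vocabulary words containing it."""
    index = defaultdict(set)
    for word in vocabulary:
        for gram in char_ngrams(word, n):
            index[gram].add(word)
    return index


def candidates(typed, index, n=3):
    """Vocabulary words ordered by how many n-grams they share with typed."""
    counts = Counter()
    for gram in set(char_ngrams(typed, n)):
        for word in index.get(gram, ()):
            counts[word] += 1
    return [word for word, _ in counts.most_common()]


index = build_ngram_index(["forecast", "frost", "hello"])
print(char_ngrams("hello"))             # ['$he', 'hel', 'ell', 'llo', 'lo$']
print(shingles("white socks on sale"))  # ['white socks', 'socks on', 'on sale']
print(candidates("forcast", index))     # ['forecast', 'frost']
```

    The n-gram index tolerates misspellings inside a word, while shingles give phrase-level units that can be suggested or ranked as a whole.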
