How to correct user input (a Google-style “did you mean?”)

粉色の甜心 2021-01-30 23:50

I have the following requirement:

I have many (say 1 million) values (names). The user will type a search string.

I don't expect the user to spell the names c

8 answers
  •  萌比男神i
    2021-01-31 00:21

    Just use Solr or a similar search server, and you won't have to become an expert in the subject. Take the list of spelling suggestions it returns, run a search for each suggestion, and offer a suggestion as a "did you mean" result only if it returns more hits than the user's original query. (This filters out bogus spelling suggestions that don't actually return more relevant hits.) This way, you don't need to collect a lot of data before making an initial "did you mean" offering, and Solr also has mechanisms for hand-tuning the results of certain queries.
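
    The filtering step described above can be sketched roughly as follows. This is only an illustration, not Solr code: `did_you_mean` and `count_hits` are hypothetical names, and `count_hits` stands in for whatever call asks your search server how many documents match a query.

```python
from typing import Callable, List


def did_you_mean(
    query: str,
    suggestions: List[str],
    count_hits: Callable[[str], int],
) -> List[str]:
    """Keep only suggestions that return more hits than the original query.

    count_hits is an assumed callback: given a query string, it returns the
    number of matching documents reported by the search server.
    """
    baseline = count_hits(query)

    # Look up the hit count of every suggestion that differs from the query.
    scored = [(count_hits(s), s) for s in suggestions if s != query]

    # Discard suggestions that do not beat the user's own query.
    better = [(hits, s) for hits, s in scored if hits > baseline]

    # Most productive suggestion first; the sort is stable for ties.
    better.sort(key=lambda pair: -pair[0])
    return [s for _, s in better]
```

    If the original query already returns at least as many hits as every suggestion, the list comes back empty and no "did you mean" line needs to be shown.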

    Generally, you wouldn't use an RDBMS for this type of searching; instead you'd depend on read-only, slightly stale databases intended for this purpose. (Solr adds a friendly programming interface and configuration to an underlying Lucene engine and database.) On the Web site of the company I work for, a nightly service selects altered records from the RDBMS and pushes them as documents into Solr. With very little effort, we have a system where the search box can search products, customer reviews, Web site pages, and blog entries very efficiently and offer spelling suggestions in the search results, as well as faceted browsing such as you see at NewEgg, Netflix, or Home Depot, with very little added strain on the server (particularly the RDBMS). (I believe both Zappos [the new site] and Netflix use Solr internally, but don't quote me on that.)
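
    A nightly sync of that kind might look something like the sketch below. Everything specific in it is an assumption made for illustration: the `products` table, its `id`, `name`, and `updated_at` columns, and the `push` callback, which stands in for whatever client call sends documents to the search server. SQLite is used only so the example is self-contained.

```python
import sqlite3
from typing import Callable, Dict, List


def export_changed_rows(
    conn: sqlite3.Connection,
    since: str,
    push: Callable[[List[Dict[str, str]]], None],
) -> int:
    """Send rows altered after `since` to the search index as documents.

    Assumes a hypothetical `products` table whose `updated_at` column holds
    ISO-8601 timestamps, so plain string comparison orders them correctly.
    """
    rows = conn.execute(
        "SELECT id, name FROM products WHERE updated_at > ? ORDER BY id",
        (since,),
    ).fetchall()

    # One flat document per altered record.
    docs = [{"id": str(row[0]), "name": row[1]} for row in rows]

    # Skip the network round trip when nothing changed.
    if docs:
        push(docs)
    return len(docs)
```

    Because the index is rebuilt from the RDBMS on a schedule, searches never touch the RDBMS directly, which is where the low added strain comes from.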

    In your scenario, you'd populate the Solr index with the list of names and select an appropriate matching algorithm in the configuration file.
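
    To get a feel for what approximate matching against a list of names does, here is a small stand-in that uses Python's standard `difflib` module instead of Solr. It is not how Solr matches internally, and `closest_names` is a hypothetical helper; it only shows the idea of ranking stored names by similarity to a misspelled input.

```python
import difflib
from typing import Dict, List


def closest_names(
    typed: str,
    names: List[str],
    limit: int = 3,
    cutoff: float = 0.6,
) -> List[str]:
    """Return up to `limit` stored names that resemble `typed`, best first."""
    # Compare case-insensitively but return names in their stored spelling.
    by_lower: Dict[str, str] = {}
    for name in names:
        by_lower.setdefault(name.lower(), name)

    # get_close_matches ranks candidates by SequenceMatcher similarity and
    # drops anything scoring below `cutoff` (0.0 to 1.0).
    matches = difflib.get_close_matches(
        typed.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[m] for m in matches]
```

    This scans every name on each call, so it is only practical for small lists. With a million names you would want a purpose-built index, which is what a search server gives you.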
