Full-text search relevance is measured in?

后端 未结 3 1222
感动是毒
感动是毒 2020-12-30 05:14

I am making a quiz system, and when quizmakers insert questions into the Question Bank, I am to check the DB for duplicate / very highly similar questions.

Testing M

3条回答
  •  旧时难觅i
    2020-12-30 06:02

    andygeers is on the right track: Those numbers have no empirical meaning other than their relations to each other and cannot be used on their own to determine what is or is not an "exact match". You need to determine that yourself. Even aside from the limitations of fulltext search ranking, there's also the open question of just what you consider to consitiute an "exact match". (Actual text only or do soundex matches count? Do synonyms (e.g., "couch" vs. "sofa") count as matching or as distinct? Should an attempt be made to compensate for misspellings? Etc.)

    If I had the need to perform such a check, I would grab only the highest-ranked entry returned by the fulltext search, remove any designated stopwords, normalize whitespace, convert to lowercase, do the comparison, and leave it at that until I encountered a case that called for it to be refined further. It's not really all that much extra work - if you specify the language you're using for your application, you could probably find someone around here who could write the normalization function within a dozen or so lines of code.

提交回复
热议问题