Get matched terms from Lucene query

南笙 2020-12-03 12:49

Given a Lucene search query like: +(letter:A letter:B letter:C) +(style:Capital), how can I tell which of the three letters actually matched any given document?

5 answers
  •  误落风尘
    2020-12-03 13:48

    I used essentially the same approach as @L.B, but updated it for the newest Lucene version, 7.4.0. Note: FuzzyQuery now supports .setRewriteMethod, which is why I removed the if.

    I also added handling for BoostQuery, and I store the words that Lucene found (rather than the Term objects) in a HashSet to avoid duplicates.

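    import java.io.IOException;
    import java.util.Set;

    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.BoostQuery;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MultiTermQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    // Recursively collects the text of every term in the query that matches docId.
    private void saveHitWordInList(Query query, IndexSearcher indexSearcher,
        int docId, Set<String> hitWords) throws IOException {
      if (query instanceof TermQuery) {
        // explain() reports whether this single term matches the document.
        if (indexSearcher.explain(query, docId).isMatch()) {
          // text() is the term value without the field name, so it also
          // works when the value itself contains a ':'.
          hitWords.add(((TermQuery) query).getTerm().text());
        }
      } else if (query instanceof BooleanQuery) {
        // Descend into every clause, whatever its Occur value.
        for (BooleanClause clause : (BooleanQuery) query) {
          saveHitWordInList(clause.getQuery(), indexSearcher, docId, hitWords);
        }
      } else if (query instanceof MultiTermQuery) {
        // Expand fuzzy/wildcard/prefix queries into a BooleanQuery of concrete
        // terms. Note that this changes the rewrite method on the caller's query.
        ((MultiTermQuery) query)
            .setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_REWRITE);
        saveHitWordInList(query.rewrite(indexSearcher.getIndexReader()),
            indexSearcher, docId, hitWords);
      } else if (query instanceof BoostQuery) {
        // A BoostQuery only wraps another query; unwrap it and recurse.
        saveHitWordInList(((BoostQuery) query).getQuery(), indexSearcher, docId,
            hitWords);
      }
    }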
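
    To see the recursion on its own, without needing an index, here is a minimal sketch that models the query from the question with stand-in types. Node, Term, Bool and Boost are hypothetical classes made up for this illustration, not Lucene classes, and the document is assumed to be a plain map from field name to indexed values. The membership check stands in for indexSearcher.explain(query, docId).isMatch().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MatchedTermsSketch {

    // Hypothetical stand-ins for Lucene's Query hierarchy (not Lucene classes).
    interface Node {}

    static final class Term implements Node {
        final String field;
        final String text;
        Term(String field, String text) { this.field = field; this.text = text; }
    }

    static final class Bool implements Node {
        final List<Node> clauses;
        Bool(Node... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    static final class Boost implements Node {
        final Node inner;
        Boost(Node inner) { this.inner = inner; }
    }

    // Walks the query tree and records the text of every leaf term found in doc.
    static void collect(Node query, Map<String, Set<String>> doc, Set<String> hits) {
        if (query instanceof Term) {
            Term t = (Term) query;
            Set<String> values = doc.get(t.field);
            // Stand-in for the explain(...).isMatch() check on a single term.
            if (values != null && values.contains(t.text)) {
                hits.add(t.text);
            }
        } else if (query instanceof Bool) {
            for (Node clause : ((Bool) query).clauses) {
                collect(clause, doc, hits);
            }
        } else if (query instanceof Boost) {
            collect(((Boost) query).inner, doc, hits);
        }
    }

    // Models +(letter:A letter:B letter:C) +(style:Capital), with the style
    // clause wrapped in a Boost, against a toy document holding A and C.
    static Set<String> demo() {
        Node query = new Bool(
            new Bool(new Term("letter", "A"), new Term("letter", "B"),
                     new Term("letter", "C")),
            new Boost(new Term("style", "Capital")));

        Map<String, Set<String>> doc = new HashMap<>();
        doc.put("letter", new LinkedHashSet<>(Arrays.asList("A", "C")));
        doc.put("style", new LinkedHashSet<>(Arrays.asList("Capital")));

        Set<String> hits = new LinkedHashSet<>();
        collect(query, doc, hits);
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A, C, Capital]
    }
}
```

    As with the Lucene version, every matching leaf term is collected, so the style term shows up next to the letters. If only the letters are wanted, one option is to filter on the term's field before adding it to the set.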
    
