How do I use ASCIIFoldingFilter in my Lucene app?

伪装坚强ぢ 2020-12-21 08:36

I have a standard Lucene app that searches an index. My index contains a lot of French terms, and I'd like to use the ASCIIFoldingFilter.

I've done a lot o

2 answers
  •  时光取名叫无心
    2020-12-21 09:07

    The structure of the Analyzer abstract class seems to have changed over the years. In the current release (v4.9.0) the tokenStream method is final, so a custom analyzer overrides createComponents instead. The following class should do the job:

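    import java.io.Reader;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;
    import org.apache.lucene.analysis.core.StopFilter;
    import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.analysis.standard.StandardFilter;
    import org.apache.lucene.analysis.standard.StandardTokenizer;
    import org.apache.lucene.analysis.util.StopwordAnalyzerBase;
    import org.apache.lucene.util.Version;

    // Accent-insensitive analyzer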
    public class AccentInsensitiveAnalyzer extends StopwordAnalyzerBase {
        public AccentInsensitiveAnalyzer(Version matchVersion){
            super(matchVersion, StandardAnalyzer.STOP_WORDS_SET);
        }
    
        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            final Tokenizer source = new StandardTokenizer(matchVersion, reader);
    
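            // Same chain as StandardAnalyzer, with accent folding appended as the last step
            TokenStream tokenStream = source;
            tokenStream = new StandardFilter(matchVersion, tokenStream);
            tokenStream = new LowerCaseFilter(matchVersion, tokenStream);
            tokenStream = new StopFilter(matchVersion, tokenStream, getStopwordSet());
            // Replaces accented characters with their ASCII equivalents where one exists
            tokenStream = new ASCIIFoldingFilter(tokenStream);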
            return new TokenStreamComponents(source, tokenStream);
        }
    }
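
To see why the analyzer has to be applied to both the indexed text and the query, here is a minimal JDK-only sketch of the folding idea. It is not ASCIIFoldingFilter and does not use Lucene: `AccentFoldSketch` and `fold` are hypothetical names, and the approach (Unicode NFD decomposition followed by stripping combining marks) only covers letters that decompose into a base letter plus an accent, which is a smaller set than the Lucene filter handles.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, NOT part of Lucene: approximates accent folding with the JDK only.
final class AccentFoldSketch {

    // Decompose accented letters (NFD), drop the combining marks, then lower-case.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "\u00C9l\u00E8ve" is the French word "Eleve" written with its accents
        // (unicode escapes keep this source file pure ASCII).
        String indexedTerm = fold("\u00C9l\u00E8ve");
        String queryTerm = fold("eleve");

        // Both sides reduce to the same term, so an unaccented query can match.
        System.out.println(indexedTerm.equals(queryTerm)); // prints "true"
    }
}
```

In a Lucene 4.x app the equivalent step would presumably be to pass the same AccentInsensitiveAnalyzer instance to both the IndexWriterConfig used for indexing and the QueryParser used for searching, and to rebuild any index that was created with a different analyzer.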
    
