How do I use ASCIIFoldingFilter in my Lucene app?

伪装坚强ぢ 2020-12-21 08:36

I have a standard Lucene app that searches an index. My index contains a lot of French terms, and I'd like to use the ASCIIFoldingFilter.

I've done a lot o

2 answers

时光取名叫无心 (original poster)

2020-12-21 09:07

The structure of the Analyzer abstract class has changed over the years. In the current release (v4.9.0) the tokenStream method is declared final, so a custom analyzer overrides createComponents instead. The following class should do the work:

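import java.io.Reader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.util.StopwordAnalyzerBase;
import org.apache.lucene.util.Version;

// Accent insensitive analyzer
public class AccentInsensitiveAnalyzer extends StopwordAnalyzerBase {
    public AccentInsensitiveAnalyzer(Version matchVersion){
        super(matchVersion, StandardAnalyzer.STOP_WORDS_SET);
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        final Tokenizer source = new StandardTokenizer(matchVersion, reader);

        TokenStream tokenStream = source;
        tokenStream = new StandardFilter(matchVersion, tokenStream);
        // In 4.9 LowerCaseFilter also takes the Version as its first argument
        tokenStream = new LowerCaseFilter(matchVersion, tokenStream);
        tokenStream = new StopFilter(matchVersion, tokenStream, getStopwordSet());
        // Fold accented characters to their ASCII equivalents as the last step
        tokenStream = new ASCIIFoldingFilter(tokenStream);
        return new TokenStreamComponents(source, tokenStream);
    }
}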
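
To get a feel for what the lowercase and folding steps in the chain above do to French terms, here is a rough sketch that needs only the JDK. It is not Lucene's implementation: AccentFoldSketch and fold are hypothetical names, and the approach (Unicode NFD decomposition, then dropping combining marks) is only an approximation. ASCIIFoldingFilter uses its own mapping table and handles cases this sketch does not, such as the ligature œ.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: approximates accent folding with the JDK only.
final class AccentFoldSketch {
    // Lowercase the term, decompose accented letters (NFD), then drop the combining marks.
    static String fold(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Print each sample term next to its folded form.
        for (String term : new String[] {"Élève", "français", "Noël"}) {
            System.out.println(term + " -> " + fold(term));
        }
    }
}
```

For accented and unaccented queries to match the same documents, the same analyzer should be used on both sides: pass an AccentInsensitiveAnalyzer instance to the IndexWriterConfig when building the index and to the QueryParser when searching, then reindex so the stored terms are folded too.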
