I have a standard Lucene app that searches an index. My index contains a lot of French terms and I'd like to use the ASCIIFoldingFilter.
I've done a lot o
The structure of the Analyzer abstract class seems to have changed over the years. The method tokenStream is set to final in the current release (v4.9.0), so the filter chain has to be built by overriding createComponents instead. The following class should do the work:
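import java.io.Reader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.util.StopwordAnalyzerBase;
import org.apache.lucene.util.Version;

// Accent insensitive analyzer
public class AccentInsensitiveAnalyzer extends StopwordAnalyzerBase {
    public AccentInsensitiveAnalyzer(Version matchVersion) {
        super(matchVersion, StandardAnalyzer.STOP_WORDS_SET);
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        final Tokenizer source = new StandardTokenizer(matchVersion, reader);
        TokenStream tokenStream = source;
        tokenStream = new StandardFilter(matchVersion, tokenStream);
        // In 4.9.0 LowerCaseFilter takes the match version as well
        tokenStream = new LowerCaseFilter(matchVersion, tokenStream);
        tokenStream = new StopFilter(matchVersion, tokenStream, getStopwordSet());
        // Fold accented characters to their ASCII equivalents (e.g. é becomes e)
        tokenStream = new ASCIIFoldingFilter(tokenStream);
        return new TokenStreamComponents(source, tokenStream);
    }
}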
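
If it helps to see what the folding step does to French terms, here is a rough sketch that uses only the JDK, with no Lucene on the classpath. It is not ASCIIFoldingFilter itself: the class name AccentFoldingSketch and its fold method are made up for illustration, and the approach (Unicode decomposition, then dropping combining marks) only approximates the filter. For instance, it will not expand ligatures such as "œ" into "oe", which I believe the real filter does.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: approximates accent folding with the JDK only.
final class AccentFoldingSketch {

    // Decompose accented characters (NFD), drop the combining marks, then lower-case.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only, whatever the compiler encoding is.
        System.out.println(fold("\u00c9l\u00e8ve"));   // prints "eleve"
        System.out.println(fold("fran\u00e7ais"));     // prints "francais"
    }
}
```

With the analyzer above used at both index time and query time, an accented term and its unaccented spelling should end up as the same token in much the same way, which is what makes the search accent insensitive.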