Word normalization using RDD
Question

Maybe this question is a little strange, but I'll try to ask it anyway. Everyone who has written applications using the Lucene API has seen something like this:

    public static String removeStopWordsAndGetNorm(String text, String[] stopWords, Normalizer normalizer) throws IOException {
        TokenStream tokenStream = new ClassicTokenizer(Version.LUCENE_44, new StringReader(text));
        tokenStream = new StopFilter(Version.LUCENE_44, tokenStream, StopFilter.makeStopSet(Version.LUCENE_44, stopWords, true));