How to use a Lucene Analyzer to tokenize a String?

抹茶落季 2020-12-04 16:38

Is there a simple way to use any subclass of Lucene's Analyzer to parse/tokenize a String?

Something like:

String to_         


        
4 Answers
  •  忘掉有多难
    2020-12-04 17:25

    Based on the answer above, this version is slightly modified to work with Lucene 4.0.

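    import java.io.IOException;
    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public final class LuceneUtil {

      private LuceneUtil() {}

      /** Runs the given analyzer over the string and returns the resulting terms in order. */
      public static List<String> tokenizeString(Analyzer analyzer, String string) {
        List<String> result = new ArrayList<String>();
        // try-with-resources closes the stream so the analyzer can be reused afterwards
        try (TokenStream stream = analyzer.tokenStream(null, new StringReader(string))) {
          // fetch the term attribute once instead of looking it up for every token
          CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
          stream.reset(); // must be called before the first incrementToken()
          while (stream.incrementToken()) {
            result.add(term.toString());
          }
          stream.end(); // signals that the end of the stream has been reached
        } catch (IOException e) {
          // not thrown b/c we're using a string reader...
          throw new RuntimeException(e);
        }
        return result;
      }

    }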
    
