Using a Combination of Wildcards and Stemming

闹比i 2020-12-30 09:38

I'm using a Snowball analyzer to stem the titles of multiple documents. Everything works well, but there are some quirks.

Example:

A search for "valv", ...

4 Answers

无人及你 (original poster)

2020-12-30 10:28
I have used two different approaches to solve this before:
1. Use two fields: one that contains stemmed terms, and another that contains terms generated by, say, the StandardAnalyzer. When you parse the search query, send wildcard searches to the "standard" field and everything else to the field with stemmed terms. This may be harder to apply if users enter their queries directly into Lucene's QueryParser.
2. Write a custom analyzer and index overlapping tokens. This basically consists of indexing the original term and the stem at the same position in the index using the PositionIncrementAttribute. You can look at SynonymFilter for an example of how to use the PositionIncrementAttribute correctly.
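The field routing in approach 1 can be sketched roughly as follows. This is a language-neutral illustration in Python, not Lucene code: the field names `title_stemmed` and `title_exact` and the helpers `route_term` and `route_query` are hypothetical, and the sketch assumes the unstemmed field is lowercased at index time.

```python
import re

# Hypothetical field names: one field indexed with a stemming analyzer,
# one indexed without stemming (e.g. by something like StandardAnalyzer).
STEMMED_FIELD = "title_stemmed"
EXACT_FIELD = "title_exact"

WILDCARD_CHARS = re.compile(r"[*?]")


def route_term(term: str) -> str:
    """Return a field-qualified clause for a single user term."""
    if WILDCARD_CHARS.search(term):
        # Wildcard terms go to the unstemmed field, so the pattern is
        # compared against the words as they were written.
        # Assumption: that field is lowercased at index time.
        return f"{EXACT_FIELD}:{term.lower()}"
    # Plain terms go to the stemmed field; its analyzer is expected to
    # stem them when the query is parsed.
    return f"{STEMMED_FIELD}:{term}"


def route_query(user_input: str) -> str:
    """Route every whitespace-separated term of a simple query."""
    return " ".join(route_term(term) for term in user_input.split())


print(route_query("Valve* engines"))
```

The routing only looks at whitespace-separated terms, which is why it becomes awkward once users type full query syntax (phrases, boolean operators, explicit fields) straight into the parser.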
I prefer solution #2.
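To make the overlapping-token idea in approach 2 concrete, here is a small toy model in Python. It is not Lucene code: `toy_stem` is a crude stand-in for a Snowball stemmer, and `overlapping_tokens`, `build_index`, and `search` are hypothetical helpers. The position increment of 0 plays the role that PositionIncrementAttribute plays in a real token filter.

```python
from collections import defaultdict


def toy_stem(term):
    """Stand-in for a real stemmer: strips a few English suffixes only."""
    for suffix in ("es", "e", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def overlapping_tokens(text):
    """Yield (term, position_increment) pairs: the original, then its stem.

    An increment of 1 moves to the next position; an increment of 0
    stacks the stem on the same position as the original term.
    """
    for word in text.lower().split():
        yield word, 1
        stem = toy_stem(word)
        if stem != word:
            yield stem, 0


def build_index(docs):
    """Build term -> {doc_id: [positions]} from the token stream."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        position = -1
        for term, increment in overlapping_tokens(text):
            position += increment
            index[term][doc_id].append(position)
    return index


def search(index, query):
    """A trailing '*' is a prefix match; anything else is stemmed first."""
    query = query.lower()
    if query.endswith("*"):
        prefix = query[:-1]
        hits = set()
        for term, postings in index.items():
            if term.startswith(prefix):
                hits.update(postings)
        return hits
    return set(index.get(toy_stem(query), {}))


docs = {1: "Valve replacement", 2: "Valves and pumps"}
index = build_index(docs)
print(search(index, "valve*"))  # prefix match against the unstemmed terms
print(search(index, "valves"))  # stemmed match against the stacked stems
```

Because the original term and its stem share a position, phrase and proximity queries still line up, and a single field serves both wildcard and stemmed lookups. The cost is a larger index, since most words are stored twice.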