Question
I'm using Nutch with Solr to develop a search engine for Arabic texts. I need to apply a stemmer to my Arabic texts, and while searching for Solr stemmers I found that it provides these two filters:
<filter class="solr.ArabicNormalizationFilterFactory"/>
<filter class="solr.ArabicStemFilterFactory"/>
I tried them but did not understand what they do. Can anyone help me with some examples?
Also, do these two filters produce the following:
العملات Stemmed to عملة
البسَاتِين ، بساتينكم Stemmed to بستان
Thank you.
Answer 1:
You can find some details here: http://lucene.apache.org/core/3_6_0/api/contrib-analyzers/org/apache/lucene/analysis/ar/ArabicStemmer.html
That says:
Stemming is defined as:
- Removal of attached definite article, conjunction, and prepositions.
- Stemming of common suffixes.
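To make the two steps above concrete, here is a rough Python sketch of normalization followed by light stemming. It is not the Lucene source: the function names, the affix lists in PREFIXES and SUFFIXES, and the minimum-length checks are assumptions chosen for illustration, so the real filters may differ in detail.

```python
import re

# Assumed affix lists, for illustration only (not copied from Lucene).
PREFIXES = ["ال", "وال", "بال", "كال", "فال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]

DIACRITICS = re.compile("[\u064B-\u0652]")  # fathatan through sukun
TATWEEL = "\u0640"                          # elongation character


def normalize(word):
    """Strip diacritics and tatweel, then unify common letter variants."""
    word = DIACRITICS.sub("", word).replace(TATWEEL, "")
    word = re.sub("[\u0622\u0623\u0625]", "\u0627", word)  # alef variants -> bare alef
    word = word.replace("\u0649", "\u064A")                # alef maksura -> yeh
    word = word.replace("\u0629", "\u0647")                # teh marbuta -> heh
    return word


def stem(word):
    """Remove at most one prefix, then any matching suffixes in list order."""
    for prefix in PREFIXES:
        # Assumption: keep at least 2 letters (3 after a one-letter prefix).
        min_rest = 3 if len(prefix) == 1 else 2
        if word.startswith(prefix) and len(word) - len(prefix) >= min_rest:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        # Assumption: never cut a word down to fewer than 2 letters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            word = word[:-len(suffix)]
    return word


if __name__ == "__main__":
    for w in ["العملات", "عملة", "البسَاتِين", "بساتينكم"]:
        print(w, "->", stem(normalize(w)))
```

Under these assumptions, العملات and عملة both reduce to عمل, so the two forms would match each other at search time even though the stem is not the dictionary form عملة. البسَاتِين loses its diacritics and the definite article but ends up as بسات, not بستان, and بساتينكم is left unchanged, because this kind of light stemming only trims affixes and does not resolve broken plurals. To see what the actual filters emit for a given word, the Analysis screen in the Solr admin UI shows the token after each filter.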
Source: https://stackoverflow.com/questions/10681281/need-explanation-on-language-stemmer-of-solr