Need explanation on Language Stemmer of Solr

Submitted by 余生长醉 on 2019-12-12 03:12:29

Question


I'm using Nutch with Solr to develop a search engine for Arabic texts. I need to apply a stemmer to my Arabic texts, and while searching for Solr stemmers I found that it provides these two filters:

<filter class="solr.ArabicNormalizationFilterFactory"/>

<filter class="solr.ArabicStemFilterFactory"/>

I tried them but did not understand what they do. Can anyone help me with some examples?

And do these two filters produce the following?

العملات Stemmed to عملة

البسَاتِين ، بساتينكم Stemmed to بستان

Thank you.


Answer 1:


You can find some details here: http://lucene.apache.org/core/3_6_0/api/contrib-analyzers/org/apache/lucene/analysis/ar/ArabicStemmer.html

That says:

Stemming is defined as:

  • Removal of attached definite article, conjunction, and prepositions.
  • Stemming of common suffixes.
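To make those two rules concrete, here is a minimal Python sketch of a normalize-then-light-stem pipeline. It is not Solr's or Lucene's code: the function names `normalize` and `stem`, the affix lists, and the minimum-length checks are assumptions chosen for illustration, loosely following the description above (strip diacritics and unify letter variants, then drop one attached prefix and any common suffixes).

```python
import re

# Illustrative affix lists (assumptions for this sketch, not taken from Solr).
# Prefixes: definite article, optionally preceded by a conjunction/preposition.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
# Common suffixes, tried in this order.
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]

DIACRITICS = re.compile("[\u064B-\u0652]")  # fathatan through sukun
TATWEEL = "\u0640"                          # elongation character


def normalize(word):
    """Remove diacritics and tatweel, and unify some letter variants."""
    word = DIACRITICS.sub("", word).replace(TATWEEL, "")
    word = re.sub("[\u0622\u0623\u0625]", "\u0627", word)  # alef variants -> bare alef
    word = word.replace("\u0649", "\u064A")                # alef maksura -> yeh
    word = word.replace("\u0629", "\u0647")                # teh marbuta -> heh
    return word


def stem(word):
    """Strip at most one prefix, then every matching suffix, keeping a short core."""
    for prefix in PREFIXES:
        # A one-letter prefix must leave at least 3 letters, longer ones at least 2.
        min_rest = 3 if len(prefix) == 1 else 2
        if word.startswith(prefix) and len(word) - len(prefix) >= min_rest:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            word = word[:-len(suffix)]
    return word


def analyze(word):
    """Normalize first, then stem (the same order as the two filters in the question)."""
    return stem(normalize(word))
```

With these assumed rules, `analyze("العملات")` and `analyze("عملة")` both give `عمل`, so the two forms would match each other in a search even though neither is turned into the dictionary form عملة. A rule-based light stemmer like this only strips affixes, so it does not resolve broken plurals: `analyze("البسَاتِين")` gives `بسات` rather than بستان, and `بساتينكم` is left unchanged because كم is not in the suffix list.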


Source: https://stackoverflow.com/questions/10681281/need-explanation-on-language-stemmer-of-solr
