Indexing multi-lingual content with Lucene.net

爱⌒轻易说出口 提交于 2019-12-05 13:14:00

I do [2], but one problem I have is that I cannot use different analyzers depending on the language. I've combined the stopwords of the languages I want, but I lose the capability of more advanced stuff that the analyzer will offer such as stemming etc.

You can eliminate option 1 and 2.
You can use one index and the fields that contains arabic words create two fileds for each: If you have field "Text" might contain arabic or english contents ==>

  • Create 2 fields for "Text" : 1 field, "Text", indexed/searched with your standard analyzer and another one, "Text_AR" , with the arabicAnalyzer. In order to achieve that you can use PreFieldAnalyzerWrapper
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!