Stop words in Sitecore

Submitted by 流过昼夜 on 2019-12-12 08:37:32

Question


We are using Lucene for text search as part of Sitecore. Is there any way to ignore stop words (such as "a", "an", "the", ...) in the Sitecore search?


Answer 1:


By default, Sitecore uses the Lucene standard analyzer, Lucene.Net.Analysis.Standard.StandardAnalyzer. You can see that it is defined in the /configuration/sitecore/search/analyzer element of the web.config file. One of the constructors of the StandardAnalyzer class accepts the array of strings it will treat as stop words. By default it uses a hardcoded list of stop words, which includes:

"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"

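If you'd like to override this behavior, I think you should inherit from StandardAnalyzer and give your class a parameterless constructor that passes stop words from another source to the base constructor, instead of relying on the hardcoded array. (Constructors cannot be overridden in C#, so the subclass simply chains to the base constructor that accepts a stop word list.) You have various options for the source, even reading the list from a text file. Don't forget to replace the standard class with yours in web.config.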

See other constructors of StandardAnalyzer class for more details. .NET Reflector is your friend here.
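As a minimal sketch of the text-file option mentioned above: the class name FileStopWordAnalyzer and the file path are hypothetical, and the code assumes a Lucene.Net 2.9-era build in which StandardAnalyzer has a constructor taking a Version and a Hashtable of stop words. Check the constructors available in the Lucene.Net assembly that ships with your Sitecore version before relying on it.

```csharp
using System.Collections;
using System.IO;

public class FileStopWordAnalyzer : Lucene.Net.Analysis.Standard.StandardAnalyzer
{
    // Hypothetical location of the stop word list, one word per line.
    private const string StopWordsPath = @"C:\inetpub\wwwroot\App_Data\stopwords.txt";

    // Parameterless constructor so the analyzer can be created from configuration.
    // Assumes the (Version, Hashtable) base constructor exists in your Lucene.Net build.
    public FileStopWordAnalyzer()
        : base(Lucene.Net.Util.Version.LUCENE_29, LoadStopWords(StopWordsPath))
    {
    }

    private static Hashtable LoadStopWords(string path)
    {
        var stopWords = new Hashtable();

        // A missing file yields an empty table, so no terms are filtered out.
        if (!File.Exists(path))
        {
            return stopWords;
        }

        foreach (string line in File.ReadAllLines(path))
        {
            // The standard analyzer lowercases tokens before the stop filter runs,
            // so the list is stored in lowercase as well.
            string word = line.Trim().ToLowerInvariant();
            if (word.Length > 0)
            {
                stopWords[word] = word;
            }
        }

        return stopWords;
    }
}
```

Because the stop word list is applied when documents are indexed as well as when queries are parsed, the index would most likely need to be rebuilt after the list changes.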




Answer 2:


An example for Yan's post:

using System.Collections;

public class CaseAnalyzer : Lucene.Net.Analysis.Standard.StandardAnalyzer
{
   // An empty table means no word is treated as a stop word, so nothing is dropped.
   // To make "by" a stop word that the analyzer will not match, use:
   // new Hashtable { { "by", "by" } }
   private static readonly Hashtable stopWords = new Hashtable();

   public CaseAnalyzer() : base(Lucene.Net.Util.Version.LUCENE_29, stopWords)
   {
   }
}

This should be registered in web.config under

/configuration/sitecore/search/analyzer

An example of the analyzer registration:

<caseanalyzer type="EBF.Business.Search.Analyzers.CaseAnalyzer, EBF.Business, Version=1.0.0.0, Culture=neutral"/>

Lastly, you just need to reference your analyzer in the search configuration like this:

<Analyzer ref="search/caseanalyzer" />


Source: https://stackoverflow.com/questions/4871709/stop-words-in-sitecore
