Does the Sitecore 7 ContentSearch API remove stop words from queries?

I've found that searches containing 'of', 'and', 'the', etc. return no results, apparently because Lucene removes stop words. So if I search for an item with the title "Aftermath of the first world war", I get zero results.

But if I strip 'of' and 'the', so that I am searching for "aftermath first world war", I get the expected document back.

Does the ContentSearch API remove stop words from queries? Is this something one can configure Lucene to remove? Or should I remove these stop words before building my query?

Thanks Adam


You can configure the Sitecore standard analyzer to accept your own custom set of stop words. Create a text file with the stop words (one per line), then make the following changes in the Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config file:

<param desc="defaultAnalyzer" type="Sitecore.ContentSearch.LuceneProvider.Analyzers.DefaultPerFieldAnalyzer, Sitecore.ContentSearch.LuceneProvider">
  <param desc="defaultAnalyzer" type="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net">
    <param hint="version">Lucene_30</param>
    <param desc="stopWords" type="System.IO.FileInfo, mscorlib">
      <param hint="fileName">[FULL_PATH_TO_SITECORE_ROOT_FOLDER]\Data\indexes\stopwords.txt</param>
    </param>
  </param>
</param>

Further reading: I have written a blog post about this issue that might be of help: http://blog.horizontalintegration.com/2014/03/19/sitecore-standard-analyzer-managing-you-own-stop-words-filter/


I think this is the same problem as the one described in this blog post.

Can you try to follow the steps from the blog post?

Another option is to create a custom analyzer and pass your stop word list to its constructor. Something like:

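using System.Collections.Generic;

public class CustomAnalyzer : Lucene.Net.Analysis.Standard.StandardAnalyzer
{
    // Words the analyzer should drop; replace with your own list.
    // Assumes Lucene.Net 3.0.x, where StandardAnalyzer takes an ISet<string> of stop words.
    private static readonly ISet<string> stopWords = new HashSet<string>
    {
        "of",
        "stopword2"
    };

    public CustomAnalyzer() : base(Lucene.Net.Util.Version.LUCENE_30, stopWords)
    {
    }
}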

After adding the class, you need to change your config file so that it uses the new analyzer. You can find a nice blog post about analyzers here. PS: I haven't tested whether this code really works.
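
The config change mentioned above might look roughly like the sketch below. It assumes the class has a parameterless constructor and is compiled into a hypothetical assembly named MyProject under the namespace MyProject.Search; neither name comes from the original post, and the fragment has not been verified against a running Sitecore instance.

```xml
<param desc="defaultAnalyzer" type="Sitecore.ContentSearch.LuceneProvider.Analyzers.DefaultPerFieldAnalyzer, Sitecore.ContentSearch.LuceneProvider">
  <!-- MyProject.Search.CustomAnalyzer and MyProject are placeholder names; use your own namespace and assembly -->
  <param desc="defaultAnalyzer" type="MyProject.Search.CustomAnalyzer, MyProject" />
</param>
```

Since the analyzer is applied when documents are indexed as well as when queries are parsed, the index most likely needs to be rebuilt after the change before results reflect the new stop word list.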
