Lucene encoding problem in Zend Framework

Submitted by 不羁的心 on 2019-12-13 17:43:09

Question


I am using the Lucene search indexer.

It works well for English, but my site uses Persian, and it cannot index text in that language.

For example: "سلام"

I use this code to create the document:

    public function __construct($class, $key, $title, $contents, $summary, $createdBy, $dateCreated)
    {
        $this->addField(Zend_Search_Lucene_Field::keyword('docRef', "$class:$key"));
        $this->addField(Zend_Search_Lucene_Field::unIndexed('class', $class));
        $this->addField(Zend_Search_Lucene_Field::unIndexed('key', $key));
        $this->addField(Zend_Search_Lucene_Field::keyword('title', $title, 'UTF-8'));
        $this->addField(Zend_Search_Lucene_Field::unStored('contents', $contents, 'UTF-8'));
        $this->addField(Zend_Search_Lucene_Field::text('summary', $summary, 'UTF-8'));
        $this->addField(Zend_Search_Lucene_Field::keyword('dateCreated', $dateCreated));
    }

Answer 1:


Add this, ideally in your bootstrap:

    Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');
    Zend_Search_Lucene_Analysis_Analyzer::setDefault(
        new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive()
    );
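
To see why the choice of analyzer matters, here is a simplified illustration in plain PHP. It is not Zend_Search_Lucene's actual code: the function names asciiTokens and unicodeTokens are made up for this sketch, and it assumes (as far as I know) that the default analyzer only recognises ASCII letters as token characters, while the Utf8 analyzers match Unicode letters.

```php
<?php
// Simplified illustration only; NOT the library's real tokenizer.
// asciiTokens() and unicodeTokens() are hypothetical helper names.

// Tokenizer that only treats ASCII letters as word characters.
function asciiTokens($text)
{
    preg_match_all('/[a-zA-Z]+/', $text, $matches);
    return $matches[0];
}

// Tokenizer that treats any Unicode letter as a word character
// (the /u modifier makes PCRE read the input as UTF-8),
// then lower-cases each token for case-insensitive matching.
function unicodeTokens($text)
{
    preg_match_all('/\p{L}+/u', $text, $matches);
    $tokens = array();
    foreach ($matches[0] as $token) {
        $tokens[] = mb_strtolower($token, 'UTF-8');
    }
    return $tokens;
}

$text = 'Hello سلام';
var_dump(asciiTokens($text));   // only the English word survives
var_dump(unicodeTokens($text)); // both words become tokens
```

With an ASCII-only tokenizer the Persian word never produces a term, so there is nothing in the index to match against, no matter which encoding is passed to the field.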



Answer 2:


I had the same problem reported by @afsane, so I tried the solution provided by @ArneRie. It did solve my problem, although after some testing I realized that the first line (the setDefaultEncoding call) was not needed, at least in my current setup.

So the solution that worked for me was to explicitly set the default analyzer before creating my index:

Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive());
$index = Zend_Search_Lucene::create('/path/to/my/index');

However, I did not need to explicitly set the default analyzer before opening the index for querying.
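
Putting both answers together, a minimal end-to-end sketch could look like the following. It assumes Zend Framework 1 is on the include_path; the function name buildPersianIndex, the index path you pass in, and the sample text are all hypothetical and only there for illustration.

```php
<?php
// Sketch only: assumes Zend Framework 1 is on the include_path.
// buildPersianIndex() and the sample document are hypothetical.
function buildPersianIndex($indexPath)
{
    require_once 'Zend/Search/Lucene.php';
    require_once 'Zend/Search/Lucene/Document.php';
    require_once 'Zend/Search/Lucene/Search/QueryParser.php';
    require_once 'Zend/Search/Lucene/Analysis/Analyzer/Common/Utf8/CaseInsensitive.php';

    // Set the UTF-8 analyzer BEFORE the index is created (answer 2).
    Zend_Search_Lucene_Analysis_Analyzer::setDefault(
        new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive()
    );

    // Treat query strings as UTF-8 (answer 1; answer 2 found this optional).
    Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');

    $index = Zend_Search_Lucene::create($indexPath);

    // Index one document with a Persian summary, passing the encoding explicitly.
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::text('summary', 'سلام دنیا', 'UTF-8'));
    $index->addDocument($doc);
    $index->commit();

    return $index;
}
```

After calling buildPersianIndex('/path/to/my/index'), running $index->find('سلام') on the returned index is a quick way to check whether Persian terms are now being matched in your setup.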



Source: https://stackoverflow.com/questions/5834861/lucene-encoding-problem-in-zend-framework
