lucene | 易学教程

Zend Lucene and range search on a field with multiple values

Question: Say my index contains documents with a field called "ages". Example entries for the "ages" field: 25 24, 28 25, 31. How would I query this so that I get all the documents whose fields contain ages between 20 and 30? I'm using Zend Lucene.

Answer 1: lmgtfy :-

    $from = new Zend_Search_Lucene_Index_Term(20, 'ages');
    $to = new Zend_Search_Lucene_Index_Term(30, 'ages');
    $query = new Zend_Search_Lucene_Search_Query_Range(
        $from, $to, true // inclusive
    );
    $hits = $index->find($query);

Docs:- http://framework
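The matching rule the question asks for can be sketched in plain Python: a document qualifies when at least one of its ages falls inside the inclusive range. This is only an illustration of the intended semantics, not Zend Lucene code; the function name and the sample documents are hypothetical.

```python
def ages_in_range(field_value, low, high):
    """Return True if any whitespace-separated age lies in [low, high]."""
    return any(low <= int(token) <= high for token in field_value.split())


# Hypothetical documents; these "ages" strings are made up for illustration.
docs = {"doc1": "25 24", "doc2": "31 45", "doc3": "18 28"}

matches = sorted(
    name for name, ages in docs.items() if ages_in_range(ages, 20, 30)
)
print(matches)
```

If the index compares terms as strings rather than as numbers (worth checking against the Zend documentation), a value such as "5" would sort after "30". Zero-padding the stored values is a common workaround.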

Zend Lucene: Fatal Error, Maximum Execution Time

Question: I've written a basic indexing script for my site and it seems to be working... somewhat. It gets through about 3/4 of the pages it needs to index and then gives this error: Fatal error: Maximum execution time of 0 seconds exceeded in /Zend/Search/Lucene/Analysis/Analyzer.php on line 166. It seems to hang in a different spot each time, too. I ran it a minute later and got this: Fatal error: Maximum execution time of 0 seconds exceeded in /Zend/Search/Lucene/Storage/Directory/Filesystem.php on

SOLR Dropping Emoji Miscellaneous characters

Question: It looks like SOLR is treating what should be valid Unicode characters as invalid and dropping them. I "proved" this by turning on query debug to see what the parser was doing with my query. Here's an example: Query = 'ァ☀' (\u30a1\u2600). Here's what SOLR did with it: 'debug':{ 'rawquerystring':u'\u30a1\u2600', 'querystring':u'\u30a1\u2600', 'parsedquery':u'(+DisjunctionMaxQuery((text:\u30a1)))/no_coord', 'parsedquery_toString':u'+(text:\u30a1)', As you can see, it was OK with 'ァ', but it ATE
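One way to see how the two characters in the query differ is to inspect their Unicode properties. The sketch below uses only Python's standard `unicodedata` module. Whether a given Solr analyzer drops characters of the symbol category is an assumption to verify against the field type's tokenizer configuration.

```python
import unicodedata


def describe(ch):
    """Return the code point, Unicode name, and general category of a character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# The two characters from the example query above.
for ch in "\u30a1\u2600":
    print(describe(ch))
```

The first character is a letter (category `Lo`) and the second is a symbol (category `So`), which matches the observation that only the first survived parsing.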

How to search only inside one string of a Collection in Azure Search?

Question: I have a collection field with values like: ["city of god"] ["god of war", "city of war"]. I want to search the field for 'city' AND 'god' and have only 'city of god' returned. Yet the second document is also returned, even though the terms appear in two different strings within the collection. Is there any way to make the search strict to individual strings rather than the entire collection? Answer 1: Each searchable field in the index is treated as a bag of terms, so for “city AND god” you’re matching on
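The difference between the two behaviours can be sketched in plain Python. This is a toy model of "bag of terms" matching versus per-string matching, not Azure Search code; both function names are hypothetical.

```python
def matches_as_bag(collection, terms):
    """All terms appear somewhere in the collection, in any string."""
    bag = {word for text in collection for word in text.lower().split()}
    return all(term in bag for term in terms)


def matches_within_one_string(collection, terms):
    """All terms appear together inside a single string of the collection."""
    return any(
        all(term in text.lower().split() for term in terms)
        for text in collection
    )


first = ["city of god"]
second = ["god of war", "city of war"]
terms = ["city", "god"]

print(matches_as_bag(first, terms), matches_as_bag(second, terms))
print(matches_within_one_string(first, terms), matches_within_one_string(second, terms))
```

Under the bag model both collections match, which is the behaviour described in the question. Only the per-string check excludes the second collection.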

Eclipse-Pydev cannot find Lucene Library

Question: I have been developing a Python program using the PyDev (2.5.0) plugin in Eclipse Helios on Ubuntu OS 11.4. The program uses the lucene (core 3.6) library, which was installed using jcc. Previously I developed it with a text editor and ran it on the command line using python xxx.py, and there was no problem with the lucene libraries. Then I imported the project into the Eclipse IDE. The other source files still run as-is, but the program cannot locate the basic classes of the lucene library. import lucene #
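A quick way to compare the command-line and IDE environments is to print which interpreter is running and whether it can locate the module. The sketch below is a generic diagnostic using only the standard library. The helper name `module_report` is hypothetical, and a differing interpreter or search path is only one possible cause of the problem.

```python
import importlib.util
import sys


def module_report(name):
    """Report the running interpreter and whether it can locate a module."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


# Run this once from the shell and once from the IDE, then compare the output.
print(module_report("lucene"))
print(sys.path)
```

If the two runs report different interpreters or search paths, the IDE's interpreter configuration is a reasonable place to look first.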

Finding sum of average sub aggregations

Question: I'd like to get the sum of a sub-aggregation. For example, I'm grouping by smartphone, then by carrier, and then finding the average price per carrier for that particular smartphone. I'd like to get the sum of the average prices across all carriers for each smartphone. So essentially, I want something like this: { "aggs": { "group_by_smartphones": { "terms": { "field": "smartphone", "order": { "_term": "asc" }, "size": 200 }, "aggs": { "group_by_carrier": { "terms": { "field": "carrier"
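The quantity being asked for can be pinned down with a small plain-Python calculation: average the prices per carrier, then sum those averages per smartphone. The rows and the function name below are hypothetical, and this is not an Elasticsearch query.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (smartphone, carrier, price) rows, made up for illustration.
rows = [
    ("phoneA", "carrierX", 100.0),
    ("phoneA", "carrierX", 200.0),
    ("phoneA", "carrierY", 300.0),
    ("phoneB", "carrierX", 50.0),
]


def sum_of_carrier_averages(rows):
    """For each smartphone, sum the per-carrier average prices."""
    prices = defaultdict(lambda: defaultdict(list))
    for phone, carrier, price in rows:
        prices[phone][carrier].append(price)
    return {
        phone: sum(mean(values) for values in carriers.values())
        for phone, carriers in prices.items()
    }


print(sum_of_carrier_averages(rows))
```

Elasticsearch provides pipeline aggregations such as `sum_bucket` for rolling up sibling buckets. Whether that fits the truncated query above is an assumption.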

Term frequency scoring in lucene 5.3

Question: I want to use only the term frequency to rank the results in Apache Lucene 5.3. I tried overriding the DefaultSimilarity class, but it does not seem to work in Lucene 5.3. I am using the following code: import org.apache.lucene.search.similarities.DefaultSimilarity; public class TfSimilarity extends DefaultSimilarity { public TfSimilarity(){} public float idf(int docFreq, int numDocs) { return(float)1.0; } public float coord(int overlap, int maxOverlap) { return 1.0f; } public float
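As a reference point for what "term frequency only" ranking means, here is a toy Python sketch that scores each document by the raw count of query terms it contains. It is not Lucene code, and it ignores the other factors a Lucene similarity may apply. The documents and function name are hypothetical.

```python
def tf_score(doc_text, query_terms):
    """Score a document by the raw number of query-term occurrences."""
    tokens = doc_text.lower().split()
    return sum(tokens.count(term) for term in query_terms)


# Hypothetical documents, made up for illustration.
docs = {
    "d1": "lucene scoring lucene lucene",
    "d2": "scoring in lucene",
    "d3": "unrelated text",
}

ranked = sorted(docs, key=lambda name: tf_score(docs[name], ["lucene"]), reverse=True)
print(ranked)
```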

Lucene term boosting with sunspot-rails

Question: I'm having an issue with Lucene's Term Boosting query syntax, specifically in Ruby on Rails via the sunspot_rails gem. This is where you can specify the weight of a specific term during a query; it is not related to the weighting of a particular field. The HTML query generated by sunspot uses the qf parameter to specify the fields to be searched as configured, and the q parameter for the query itself. When the caret is added to a search term to specify a boost (i.e. q=searchterm^5) it
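To make the two parameters concrete, the sketch below builds a query string with a boosted term in `q` and a field list in `qf`, using only Python's standard library. The field name `title_text` is hypothetical, and this shows only how such a request could be encoded, not what sunspot actually sends.

```python
from urllib.parse import parse_qs, urlencode

# "title_text" is a hypothetical field name used only for illustration.
params = {"q": "searchterm^5", "qf": "title_text"}

query_string = urlencode(params)
print(query_string)

# Decoding recovers the original caret-boosted term.
print(parse_qs(query_string)["q"])
```

The caret is percent-encoded in transit, so it is worth checking whether the boost survives both the encoding and the query parser that receives it.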

Installing Solr with Docker

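Solr is a standalone enterprise search server built on the Lucene core. Applications can submit and query indexes over HTTP. It offers a richer query language than Lucene and is a high-performance, highly available full-text search engine.

List the available Solr images:

    docker search solr

Pull Solr (mind the version):

    docker pull solr:5.5.5

Once the image has downloaded, move on to installing Solr. The image's page shows a run command that uses port mapping, but I want host networking, so I enter:

    docker run --name my_solr -idt --net host solr:5.5.5

When it finishes, list the containers:

    docker ps -a

The output shows the container running in the background. By default it opens port 8983.

Create a core:

    docker exec -it --user=solr my_solr bin/solr create_core -c mycore

Command breakdown: --user=solr runs the command as the solr user that is created automatically when the container starts; in -c mycore, -c sets the name and mycore is the name. A core can also be created through this HTTP endpoint (a lower-level approach): http://localhost:8983/solr/admin/cores?action=CREATE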

assigning different weights to different query terms in lucene

Question: I'm very new to Lucene and want to do the following. Suppose my query is query = "apple growers fruit ipad mac", but I want to give different weights to these query terms, like query = "apple (0.2) growers (0.7) fruit (0.9) ipad (0.05) mac (0.06)". The intuition is that I want to rank documents that talk about apple in the agricultural sense higher than those about tech. I have seen here (How to assign a weight to a term query in Lucene/Solr) that you can use Query.setBoost
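One way to express per-term weights is the caret boost syntax of Lucene's classic query parser (`term^weight`). The Python sketch below only assembles such a query string from the weights in the question. The helper name is hypothetical, it does not escape special characters, and how the boosts affect ranking depends on the similarity in use.

```python
def boosted_query(weights):
    """Join terms into a query string using the term^weight boost syntax."""
    return " ".join(f"{term}^{weight}" for term, weight in weights.items())


# Weights taken from the example in the question.
weights = {"apple": 0.2, "growers": 0.7, "fruit": 0.9, "ipad": 0.05, "mac": 0.06}

print(boosted_query(weights))
```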