lucene

Using Solr and Zend's Lucene port together

Submitted by ﹥>﹥吖頭↗ on 2019-12-19 21:18:40
Question: Afternoon chaps, after my adventures with Zend-Lucene-Search, and discovering it isn't all it's cracked up to be when indexing large datasets, I've turned to Solr (thanks to Bill Karwin for that :) ). I've got Solr indexing the db far, far quicker now, taking just over 8 minutes to index a table of just over 1.7 million rows, which I'm very pleased with. However, when I come to try and search the index with the Zend port, I run into the following error: Fatal error: Uncaught exception 'Zend…
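The truncated exception is typically a format problem: Zend_Search_Lucene reads the Lucene index files directly and only understands older index formats, so it generally cannot open an index written by the much newer Lucene version bundled with Solr. The usual workaround is to leave the index to Solr and query it over Solr's HTTP interface instead. A minimal SolrJ sketch of that approach, assuming a single default core at http://localhost:8983/solr and an "id" field in the schema:

```java
// Minimal SolrJ (Solr 1.4-era) sketch: search the Solr index over HTTP
// instead of opening the index files with the Zend port.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class SolrSearch {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("title:lucene");   // hypothetical field and term
        query.setRows(10);
        QueryResponse response = solr.query(query);
        for (SolrDocument doc : response.getResults()) {
            System.out.println(doc.getFieldValue("id"));    // assumes an "id" field in the schema
        }
    }
}
```

From PHP, the same request can be sent with a plain HTTP client against /select?q=...&wt=json rather than going through the Zend port at all.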

Adding Boost to Score According to Payload of Multivalued Field in Solr

Submitted by 混江龙づ霸主 on 2019-12-19 19:53:17
Question: Here is my case: I have a field in my schema named elmo_field. I want elmo_field to hold payloaded values, e.g. dorothy|0.46 sesame|0.37 big bird|0.19 bird|0.22. When a user searches for a keyword, e.g. dorothy, I want to add 0.46 to the usual score. If the user searches for big bird, 0.19 should be added, and if the user searches for bird, 0.22 should be added (the payloads are added, or payloads * a normalization coefficient will be added). I mean I will make a search on my index at my other fields of…
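For context, attaching payloads such as dorothy|0.46 is usually done at analysis time with a delimited-payload filter; whether the payload then contributes to the score depends on using a payload-aware query. A hedged schema.xml sketch (the field type name is illustrative, not from the question):

```xml
<!-- Illustrative field type: each token carries the float after '|' as its payload -->
<fieldType name="payloaded_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float" delimiter="|"/>
  </analyzer>
</fieldType>
<field name="elmo_field" type="payloaded_text" indexed="true" stored="true" multiValued="true"/>
```

On the query side, newer Solr releases (6.x and later) ship a {!payload_score} query parser and a payload() function for exactly this kind of boosting; on older versions the common route is a small query-parser plugin built around Lucene's payload term query with an average payload function.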

Create new core directories in Solr on the fly

Submitted by ╄→гoц情女王★ on 2019-12-19 14:49:20
Question: I am using Solr 1.4.1 to build a distributed search engine, but I don't want to use only one index file. I want to create new core "index" directories on the fly from my Java code. I found the following REST API to create new cores using an EXISTING core directory (http://wiki.apache.org/solr/CoreAdmin): http://localhost:8983/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instance_directory&config=config_file_name.xml&schema=schem_file_name.xml&dataDir=data Is there a way to…
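One hedged way to do this from Java: the CREATE action needs conf/solrconfig.xml and schema.xml to already exist under the new instance directory, so the sketch below first copies a template directory and then registers the core through SolrJ's CoreAdminRequest. The paths and the template layout are assumptions, not from the question.

```java
// Hypothetical sketch: create the new core's instance directory from a template,
// then register it with Solr's CoreAdmin API via SolrJ.
import java.io.File;
import org.apache.commons.io.FileUtils;                        // assumed available for the directory copy
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class CreateCoreOnTheFly {
    public static void main(String[] args) throws Exception {
        File template = new File("/opt/solr/template_core");    // contains conf/solrconfig.xml + schema.xml
        File instanceDir = new File("/opt/solr/coreX");
        FileUtils.copyDirectory(template, instanceDir);          // CREATE requires the conf to exist on disk

        SolrServer admin = new CommonsHttpSolrServer("http://localhost:8983/solr");
        CoreAdminRequest.createCore("coreX", instanceDir.getAbsolutePath(), admin);
    }
}
```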

Example using WikipediaTokenizer in Lucene

Submitted by ε祈祈猫儿з on 2019-12-19 11:42:31
问题 I want to use WikipediaTokenizer in lucene project - http://lucene.apache.org/java/3_0_2/api/contrib-wikipedia/org/apache/lucene/wikipedia/analysis/WikipediaTokenizer.html But I never used lucene. I just want to convert a wikipedia string into a list of tokens. But, I see that there are only four methods available in this class, end, incrementToken, reset, reset(reader). Can someone point me to an example to use it. Thank you. 回答1: In Lucene 3.0, next() method is removed. Now you should use
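A hedged sketch for Lucene 3.0.x, where the old next() loop is replaced by incrementToken() plus attributes (in 3.1+ TermAttribute was superseded by CharTermAttribute):

```java
// Sketch for Lucene 3.0.x: tokenize wiki markup with WikipediaTokenizer
// using the incrementToken()/attribute API.
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;
import org.apache.lucene.wikipedia.analysis.WikipediaTokenizer;

public class WikiTokens {
    public static void main(String[] args) throws Exception {
        String wikiText = "[[Apache Lucene]] is a '''search''' library.";
        WikipediaTokenizer tokenizer = new WikipediaTokenizer(new StringReader(wikiText));
        TermAttribute term = tokenizer.addAttribute(TermAttribute.class);

        List<String> tokens = new ArrayList<String>();
        while (tokenizer.incrementToken()) {   // advance to the next token
            tokens.add(term.term());           // the token text
        }
        tokenizer.end();
        tokenizer.close();
        System.out.println(tokens);
    }
}
```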

Solr suggester in SolrCloud mode

Submitted by 我的梦境 on 2019-12-19 11:42:19
Question: I am running Solr in SolrCloud mode with three shards. The data is already indexed into Solr. Now I have configured the Solr suggester in solrconfig.xml. This is the configuration from the solrconfig file. I am using Solr 4.10. <searchComponent name="suggest" class="solr.SuggestComponent"> <lst name="suggester"> <str name="name">mysuggest</str> <str name="lookupImpl">FuzzyLookupFactory</str> <str name="storeDir">suggester_fuzzy_dir</str> <str name="dictionaryImpl"…
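For reference, a suggester defined as a search component also needs a request handler that invokes it; a hedged solrconfig.xml sketch matching the names above:

```xml
<!-- Illustrative handler wiring for the "suggest" component defined above -->
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mysuggest</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

In SolrCloud the suggester index lives on each replica, so suggest.build=true has to reach every shard; a common pattern is to query /suggest with suggest.q=... and pass shards and shards.qt=/suggest so the request is distributed (exact behavior varies by Solr version).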

Ant Financial's ZSearch: explorations in vector retrieval

Submitted by 不羁的心 on 2019-12-19 11:26:19
[Photo: 十倍, head of ZSearch infrastructure, presenting at Elastic Dev Day 2019.] Introduction: Elasticsearch (ES) is a very popular distributed full-text search system, commonly used for data analysis, search, multi-dimensional filtering, and similar scenarios. Ant Financial has offered Elasticsearch as a service to internal business teams since 2017, and we have accumulated a good deal of experience in Ant Financial's financial-grade scenarios; here we mainly share our exploration of vector retrieval. Elasticsearch pain points: Elasticsearch is widely used within Ant Financial for log analysis, multi-dimensional analysis, search, and more. As our Elasticsearch clusters multiply and user scenarios become richer, we face a growing number of pain points: how to manage the clusters; how to make onboarding easy and manage users; how to support users' differing customization needs; ... To address these pain points we built ZSearch, a general-purpose search platform: built on a Kubernetes base, so ZSearch components can be created quickly, operated easily, and failed machines replaced automatically; cross-datacenter replication for high availability of critical business; a plugin platform with hot loading of user-defined plugins; SmartSearch to simplify user search, working out of the box; a Router that works with ES's internal multi-tenancy plugin to improve resource utilization. The need for vector retrieval: the Elasticsearch-based general search platform ZSearch keeps maturing, with ever more users and richer scenarios.

Lucene phrase query not working

Submitted by 前提是你 on 2019-12-19 10:28:15
Question: I am trying to write a simple program using Lucene 2.9.4 which searches for a phrase query, but I am getting 0 hits. public class HelloLucene { public static void main(String[] args) throws IOException, ParseException{ // TODO Auto-generated method stub StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_29); Directory index = new RAMDirectory(); IndexWriter w = new IndexWriter(index,analyzer,true,IndexWriter.MaxFieldLength.UNLIMITED); addDoc(w, "Lucene in Action"); addDoc(w,…
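Two things commonly cause 0 hits in this kind of setup: the IndexWriter is not closed or committed before searching, and the phrase is built from terms that do not match what StandardAnalyzer actually indexed. A hedged sketch of the search side for Lucene 2.9.x, letting QueryParser build the phrase with the same analyzer ("title" is an assumed field name, not taken from the truncated code):

```java
// Sketch for Lucene 2.9.x: a quoted phrase parsed with the same analyzer
// used at index time, searched against an already-committed index.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

public class PhraseSearch {
    // 'index' is the Directory the documents were written to; close the writer before calling this
    static TopDocs searchPhrase(Directory index, String phrase) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);
        QueryParser parser = new QueryParser(Version.LUCENE_29, "title", analyzer);
        Query q = parser.parse("\"" + phrase + "\"");   // quotes make it a phrase query
        IndexSearcher searcher = new IndexSearcher(index, true);
        try {
            return searcher.search(q, 10);
        } finally {
            searcher.close();
        }
    }
}
```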

How to match exact text in Lucene search?

Submitted by 眉间皱痕 on 2019-12-19 10:22:18
Question: I'm trying to match the text Config migration from ASA5505 8.2 to ASA5516 in the column TITLE. My program looks like this: Directory directory = FSDirectory.open(indexDir); MultiFieldQueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_35,new String[] {"TITLE"}, new StandardAnalyzer(Version.LUCENE_35)); IndexReader reader = IndexReader.open(directory); IndexSearcher searcher = new IndexSearcher(reader); queryParser.setPhraseSlop(0); queryParser.setLowercaseExpandedTerms(true); Query…
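A hedged sketch of the two usual options for Lucene 3.5: quote the text so the parser builds a phrase query over the analyzed TITLE field, or index an additional un-analyzed copy of the title and match it with a TermQuery (the TITLE_RAW field is an assumption introduced for illustration):

```java
// Sketch for Lucene 3.5: two common ways to require the exact text.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.util.Version;

public class ExactMatch {
    // 1) Phrase query on the analyzed TITLE field: the terms must appear in this exact order.
    static Query phraseOnTitle(String text) throws Exception {
        QueryParser parser = new QueryParser(Version.LUCENE_35, "TITLE",
                new StandardAnalyzer(Version.LUCENE_35));
        return parser.parse("\"" + text + "\"");
    }

    // 2) True exact match: index a second, un-analyzed copy of the title
    //    (e.g. field "TITLE_RAW" with Field.Index.NOT_ANALYZED) and query it directly.
    static Query exactOnRawTitle(String text) {
        return new TermQuery(new Term("TITLE_RAW", text));
    }
}
```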

Java threads slow down towards the end of processing

Submitted by 女生的网名这么多〃 on 2019-12-19 09:06:27
Question: I have a Java program that takes in a text file containing a list of text files and processes each line separately. To speed up the processing, I make use of threads via an ExecutorService with a FixedThreadPool of 24 threads. The machine has 24 cores and 48 GB of RAM. The text file that I'm processing has 2.5 million lines. I find that for the first 2.3 million lines or so things run very well with high CPU utilization. However, beyond some point (at around 2.3 million lines), the performance…
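For reference, a minimal sketch of the setup the question describes, with hypothetical names; the slowdown near the end of such a run is often a straggler effect, where the pool is drained down to a few long-running tasks, or memory pressure as results accumulate.

```java
// Hypothetical sketch of the described setup: one task per input line,
// run on a fixed pool of 24 worker threads.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LineProcessor {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(24);
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                final String path = line;              // each line names a text file to process
                pool.submit(() -> processFile(path));
            }
        }
        pool.shutdown();                                // accept no new tasks; let the queue drain
        pool.awaitTermination(1, TimeUnit.DAYS);
    }

    static void processFile(String path) {
        // ... per-file work goes here ...
    }
}
```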