lucene

Java, Lucene: Set lock timeout for IndexWriter

馋奶兔 submitted on 2020-01-24 08:59:25
Question: I am working on integrating Lucene with our Spring MVC based application. It currently works, but occasionally we get a "cannot obtain lock" error, after which I have to delete the lock file manually before it works normally again. How can I set a timeout for locking the index in Java? I don't have any XML configuration for Lucene; I added the library to the project through Maven's pom.xml and instantiated the required classes. Code: public void saveIndexes(String text, String tagFileName, String
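As far as I know, older Lucene releases (3.x and 4.x) expose IndexWriterConfig.setWriteLockTimeout(long) for this, and later releases moved the waiting behaviour into SleepingLockWrapper; which one applies depends on the Lucene version on your classpath, so check its Javadoc. The underlying idea is simply to poll for the lock until a deadline passes. Below is a minimal stdlib-only sketch of that idea. It is not Lucene's own code, and the class name, method names and lock file path are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.TimeUnit;

public class LockWithTimeout {

    // Hypothetical helper: polls for an exclusive file lock until timeoutMs elapses.
    // Returns the lock, or null if it could not be obtained in time.
    static FileLock acquire(Path lockFile, long timeoutMs, long pollMs) {
        FileChannel channel = null;
        try {
            channel = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (true) {
                try {
                    FileLock lock = channel.tryLock(); // null if another process holds it
                    if (lock != null) {
                        return lock;
                    }
                } catch (OverlappingFileLockException e) {
                    // The lock is already held inside this JVM; keep polling.
                }
                if (System.nanoTime() - deadline >= 0) {
                    // Caveat: on some systems closing a channel drops every lock this
                    // JVM holds on the file, so avoid competing for it within one process.
                    channel.close();
                    return null;
                }
                Thread.sleep(pollMs);
            }
        } catch (IOException e) {
            closeQuietly(channel);
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            closeQuietly(channel);
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    // Releases the lock and closes the channel it was acquired on.
    static void release(FileLock lock) {
        try {
            lock.release();
            lock.channel().close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void closeQuietly(FileChannel channel) {
        if (channel != null) {
            try {
                channel.close();
            } catch (IOException ignored) {
                // Nothing useful to do while cleaning up.
            }
        }
    }

    public static void main(String[] args) {
        Path lockFile = Paths.get(System.getProperty("java.io.tmpdir"), "demo-write.lock");
        FileLock lock = acquire(lockFile, 5_000, 100); // wait up to 5 s, poll every 100 ms
        System.out.println(lock != null ? "lock obtained" : "timed out");
        if (lock != null) {
            release(lock);
        }
    }
}
```

A timeout only helps when the other holder eventually releases the lock. If an IndexWriter was never closed, for example after a crash or a redeploy, waiting longer will not fix it, so it is also worth checking that every writer is closed in a finally block.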

How to sort by Lucene.Net field and ignore common stop words such as 'a' and 'the'?

≯℡__Kan透↙ submitted on 2020-01-23 19:26:27
Question: I've found how to sort query results by a given field in a Lucene.Net index instead of by score; all it takes is a field that is indexed but not tokenized. What I haven't been able to figure out is how to sort on that field while ignoring stop words such as "a" and "the", so that the following book titles, for example, would sort in ascending order like so: "The Cat in the Hat", "Horton Hears a Who". Is such a thing possible, and if so, how? I'm using Lucene.Net 2.3.1.2. Answer 1: I wrap the
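One way to approach this, independent of the truncated answer above, is to compute a separate sort key with the stop words removed and store that key in its own untokenized field, then sort on that field. The sketch below shows only the key computation in plain Java (the question uses Lucene.Net, but the logic carries over). The stop-word list, class name and method name are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSortKey {

    // Hypothetical stop-word list; extend it to match the analyzer you use.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    // Lower-cases the title and drops every stop word, keeping the other words in order.
    static String sortKey(String title) {
        List<String> kept = new ArrayList<>();
        for (String token : title.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>(
                Arrays.asList("Horton Hears a Who", "The Cat in the Hat"));
        // Sort by the derived key instead of the raw title.
        titles.sort(Comparator.comparing(StopWordSortKey::sortKey));
        System.out.println(titles); // [The Cat in the Hat, Horton Hears a Who]
    }
}
```

The displayed title stays in its original stored field; only the extra sort field carries the stripped key.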

How to define a primary key field in a Lucene document to get the best lookup performance?

余生长醉 submitted on 2020-01-23 07:52:40
Question: When creating a document in my Lucene index (v7.2), I add a uid field to it which contains a unique id/key (a string): doc.add(new StringField("uid",uid,Field.Store.YES)) To retrieve that document later on, I create a TermQuery for the given unique id and search for it with an IndexSearcher: searcher.search(new TermQuery(new Term("uid",uid)),1) Being a Lucene novice, I would like to know the following: How should I improve this approach to get the best lookup performance? Would it, for

How to define a primary key field in a Lucene document to get the best lookup performance?

那年仲夏 submitted on 2020-01-23 07:52:36
Question: When creating a document in my Lucene index (v7.2), I add a uid field to it which contains a unique id/key (a string): doc.add(new StringField("uid",uid,Field.Store.YES)) To retrieve that document later on, I create a TermQuery for the given unique id and search for it with an IndexSearcher: searcher.search(new TermQuery(new Term("uid",uid)),1) Being a Lucene novice, I would like to know the following: How should I improve this approach to get the best lookup performance? Would it, for

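Lucene 3.0: Principles and Code Analysis

最后都变了- submitted on 2020-01-23 07:43:58
This series of articles describes in detail the basic principles of a nearly current version of Lucene, together with an analysis of its code. The overall architecture and index file format articles are based on Lucene 2.9; the indexing process analysis is based on Lucene 3.0. Because the index file format has not changed much, those articles have not been updated. The articles on principles and architecture reuse some figures from earlier authors, which may depict older Lucene versions, but this does not affect understanding of the principles and architecture. The series is still being written; chapters on analyzers, segment merging, QueryParser, query syntax and query objects, the search process, and the derivation of the scoring formula are planned. I am sharing it early and welcome criticism and corrections.
Lucene Study Notes, Part 1: Basic Principles of Full-Text Retrieval http://www.cnblogs.com/forfuture1978/archive/2009/12/14/1623594.html
Lucene Study Notes, Part 2: Lucene's Overall Architecture http://www.cnblogs.com/forfuture1978/archive/2009/12/14/1623596.html
Lucene Study Notes, Part 3: Lucene's Index File Format (1) http://www.cnblogs.com/forfuture1978/archive/2009/12/14/1623597.html
Lucene Study Notes, Part 3: Lucene's Index File Format (2) http://www.cnblogs.com/forfuture1978/archive/2009/12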

Elasticsearch/Lucene highlight

依然范特西╮ submitted on 2020-01-23 03:31:06
Question: How can I highlight query results with fuzzyLikeThisFieldQuery in Elasticsearch? Highlighting works for me with fuzzyQuery but not with fuzzyLikeThisFieldQuery. For example, in the code below I used fuzzyQuery: QueryBuilder allquery = QueryBuilders.fuzzyQuery("name", "fooobar").minSimilarity(0.4f); SearchRequestBuilder builder = ds.getElasticClient() .prepareSearch("data") .setQuery(allquery) .setFrom(0) .setSize(10) .setTypes("entity") .setSearchType(SearchType.DEFAULT) .addHighlightedField("name") .addField("name

Getting maximum value of field in solr

旧街凉风 submitted on 2020-01-22 19:43:35
Question: I'd like to boost my query by the item's view count. I'd like to use something like view_count / max_view_count for this purpose, so that I can measure how the item's view count relates to the biggest view count in the index. I know how to boost the results with a function query, but how can I easily get the maximum view count? If anybody could provide an example it would be very helpful... Answer 1: There aren't any aggregate functions in Solr in the way you might be thinking about them from
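One common way to obtain the maximum is to ask Solr for the single top document sorted by that field, for example q=*:*&sort=view_count desc&rows=1&fl=view_count, and then reuse the returned value when building the boost. The arithmetic itself can be sketched in plain Java as below. The field name view_count comes from the question; the class name, method names and sample numbers are invented for illustration.

```java
public class ViewCountBoost {

    // Returns viewCount / maxViewCount as a value in [0, 1];
    // falls back to 0 when the index has no positive view counts.
    static double normalized(long viewCount, long maxViewCount) {
        if (maxViewCount <= 0) {
            return 0.0;
        }
        return (double) viewCount / maxViewCount;
    }

    // Finds the largest value, standing in for the "top document" lookup.
    static long max(long[] counts) {
        long best = 0;
        for (long c : counts) {
            best = Math.max(best, c);
        }
        return best;
    }

    public static void main(String[] args) {
        long[] viewCounts = {50, 200, 120}; // invented sample data
        long maxViewCount = max(viewCounts);
        for (long c : viewCounts) {
            System.out.println(c + " -> " + normalized(c, maxViewCount));
        }
    }
}
```

Because the maximum changes as documents are viewed, it has to be refreshed periodically rather than hard-coded into the query.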

Getting maximum value of field in solr

冷暖自知 submitted on 2020-01-22 19:42:06
Question: I'd like to boost my query by the item's view count. I'd like to use something like view_count / max_view_count for this purpose, so that I can measure how the item's view count relates to the biggest view count in the index. I know how to boost the results with a function query, but how can I easily get the maximum view count? If anybody could provide an example it would be very helpful... Answer 1: There aren't any aggregate functions in Solr in the way you might be thinking about them from

Efficient substring search in a large text file containing 100 million strings (no duplicates)

两盒软妹~` submitted on 2020-01-22 12:53:49
Question: I have a large text file (1.5 GB) containing 100 million strings (no duplicates), one string per line. I want to build a web application in Java so that when a user enters a keyword (a substring), they get the count of all strings in the file that contain that keyword. I already know one technique, Lucene. Is there any other way to do this? I want the result within 3-4 seconds. My system has 4 GB of RAM and a dual-core CPU. I need to do this in
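Before reaching for an index, it can help to measure the simplest baseline: stream the file line by line and count the lines that contain the keyword. The sketch below is only that baseline. Whether it meets the 3-4 second target on the stated hardware depends on disk throughput and caching, and would have to be measured. The class name, method names and the file path argument are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class SubstringCount {

    // Counts the strings in the stream that contain the keyword.
    static long countContaining(Stream<String> lines, String keyword) {
        return lines.filter(line -> line.contains(keyword)).count();
    }

    // Streams the file lazily so the whole 1.5 GB is never held in memory at once.
    static long countContaining(Path file, String keyword) {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return countContaining(lines, keyword);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: SubstringCount <file> <keyword>");
            return;
        }
        long start = System.nanoTime();
        long count = countContaining(Paths.get(args[0]), args[1]);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(count + " matching lines in " + elapsedMs + " ms");
    }
}
```

If the scan turns out to be too slow, the measured time gives a concrete number to compare against an index-based approach such as Lucene.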

Efficient substring search in a large text file containing 100 million strings (no duplicates)

十年热恋 submitted on 2020-01-22 12:53:25
Question: I have a large text file (1.5 GB) containing 100 million strings (no duplicates), one string per line. I want to build a web application in Java so that when a user enters a keyword (a substring), they get the count of all strings in the file that contain that keyword. I already know one technique, Lucene. Is there any other way to do this? I want the result within 3-4 seconds. My system has 4 GB of RAM and a dual-core CPU. I need to do this in