lucene

Can we create Lucene indexes only once, for the initial set-up?

独自空忆成欢 submitted on 2019-12-25 03:24:22

Question: I am a newbie to Hibernate Search. According to the documentation, by default, every time an object is inserted, updated or deleted through Hibernate, Hibernate Search updates the corresponding Lucene index. As far as I have learned so far, the index can be built programmatically like this (correct me if I am wrong):

FullTextSession fullTextSession = Search.getFullTextSession(session);
fullTextSession.createIndexer().startAndWait();

But what surprises me is whether there is…
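If the goal is to build the index once at set-up time and skip the automatic per-transaction updates, Hibernate Search (3.x through 5.x) exposes a configuration switch for exactly that. A minimal sketch, assuming a standard hibernate.properties set-up:

```properties
# Turn off event-based (automatic) index updates; the index is then only
# written when you run the MassIndexer or index entities manually.
hibernate.search.indexing_strategy = manual
```

With this set, the fullTextSession.createIndexer().startAndWait() call from the question remains the way to (re)build the index on demand.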

Lucene encoding, Java

不羁的心 submitted on 2019-12-25 03:16:47

Question: I have questions about encoding in Lucene (Java). How does Lucene handle character encoding? What is the default, and how can I set it? Or does the encoding not matter to Lucene, and is it just a question of how a string is added to a document (Java code below) during the indexing phase, and then during search over the index? In other words, do I have to worry about whether the input text is in UTF-8 and the queries are also in UTF-8?

Document doc = new Document();
doc.add(new TextField(tagName, object.getName() Field…
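Lucene (Java) indexes java.lang.String values, which are already Unicode internally, so encoding only matters at the point where external bytes are decoded into a String, before Lucene ever sees them. A minimal stdlib sketch of that boundary (decodeUtf8 is a hypothetical helper, not a Lucene API):

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Decode raw bytes with an explicit charset before handing the text to Lucene.
    public static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Müller & Søren";                       // non-ASCII text
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);  // e.g. bytes read from a file
        // Round-trips cleanly because both sides agree on UTF-8.
        if (!decodeUtf8(utf8).equals(original)) throw new AssertionError();
        // Decoding the same bytes as ISO-8859-1 mangles the umlauts; this
        // mismatch, not Lucene itself, is the classic "encoding problem".
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        if (wrong.equals(original)) throw new AssertionError();
    }
}
```

So as long as files are decoded with the correct charset when building the Document, index and query text will agree.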

Search in Single-Token-Field using Lucene.NET

。_饼干妹妹 submitted on 2019-12-25 03:15:30

Question: I'm using Lucene.NET 3.0.3 to index the content of Word, Excel, etc. documents, plus some custom fields for each document. If I index a field named "title" as Field.Index.NOT_ANALYZED, the Lucene index stores the field in the correct form: the whole title is stored as a single token, which is what I want. For example, if the title of the document is "Lorem ipsum dolor", the field in the Lucene index is "Lorem ipsum dolor". But if I run an exact search on this field, I get no results. My search term looks like: title:"Lorem…
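A likely cause is that the query parser analyzes the quoted phrase (lowercasing and splitting it into three tokens) while the field was indexed as one unanalyzed token, so nothing matches. A plain-Java sketch of the mismatch, with the analyzer step simulated (analyze is a stand-in for what a StandardAnalyzer-style parser roughly does, not a Lucene API):

```java
import java.util.Arrays;
import java.util.List;

public class ExactMatchDemo {
    // Roughly what an analyzing query parser does to a quoted phrase:
    // lowercase it and split it on whitespace.
    public static List<String> analyze(String phrase) {
        return Arrays.asList(phrase.toLowerCase().split("\\s+"));
    }

    public static void main(String[] args) {
        String storedToken = "Lorem ipsum dolor";  // single NOT_ANALYZED token in the index
        List<String> queryTokens = analyze("Lorem ipsum dolor");
        // The parsed phrase query looks for three lowercased tokens,
        // none of which equals the single stored token -> zero hits.
        if (queryTokens.contains(storedToken)) throw new AssertionError();
    }
}
```

The usual fix is to bypass the analyzing QueryParser for such a field and build the query by hand with the exact stored value, e.g. new TermQuery(new Term("title", "Lorem ipsum dolor")) in Lucene.NET.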

Elasticsearch: "not" filter inside an "and" filter

♀尐吖头ヾ submitted on 2019-12-25 02:57:05

Question: I am trying to add a "not" filter inside an "and" filter. Sample input:

{
  "query": {
    "filtered": {
      "query": {
        "query_string": { "query": "error", "fields": ["request"] }
      },
      "filter": {
        "and": [
          {
            "terms": { "hashtag": ["br2"] },
            "not": { "terms": { "hashtag": ["br1"] } }
          }
        ]
      }
    }
  }
}

But the above gives an error; I also tried various combinations, in vain. This is just an example; in short, I need a query in which both "and" and "not" filters are present.

Answer 1: You forgot the "filters" array. Write it like…
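One hedged rearrangement, keeping the old 1.x filtered/and/not syntax the question uses: make "not" its own element of the "and" array rather than a second key inside the terms object. A sketch of what that could look like:

```json
{
  "query": {
    "filtered": {
      "query": {
        "query_string": { "query": "error", "fields": ["request"] }
      },
      "filter": {
        "and": [
          { "terms": { "hashtag": ["br2"] } },
          { "not": { "terms": { "hashtag": ["br1"] } } }
        ]
      }
    }
  }
}
```

In later Elasticsearch versions the idiomatic equivalent is a bool query with must/must_not clauses.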

How to serialize/deserialize a map with Solr/Lucene?

感情迁移 submitted on 2019-12-25 02:55:57

Question: I am new to Solr, and I am facing a problem when I try to serialize/deserialize a Map in Solr. I use Spring Data Solr in my Java application as follows:

@Field("mapped_*")
private Map<String, String> values;

It flattens and serializes my map in Solr as follows:

"key1": "value1"
"key2": "value2"
...

However, when I run a search, the returned objects always have this field set to NULL. Deserialization does not work on this particular field; it looks like it does not recognize the key1, key2…
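Whether Spring Data Solr maps a dynamic field back into a Map depends on the library version and mapping configuration; independent of that, the "mapped_*" convention itself is just prefix-based flattening, which can also be done by hand as a workaround. A plain-Java sketch of that convention (flatten and unflatten are hypothetical helpers, not Spring Data Solr APIs):

```java
import java.util.HashMap;
import java.util.Map;

public class DynamicFieldDemo {
    static final String PREFIX = "mapped_";

    // Flatten: {"key1": "value1"} -> {"mapped_key1": "value1"} (what Solr stores).
    public static Map<String, String> flatten(Map<String, String> values) {
        Map<String, String> out = new HashMap<>();
        values.forEach((k, v) -> out.put(PREFIX + k, v));
        return out;
    }

    // Unflatten: strip the prefix when reading the stored document back.
    public static Map<String, String> unflatten(Map<String, String> stored) {
        Map<String, String> out = new HashMap<>();
        stored.forEach((k, v) -> {
            if (k.startsWith(PREFIX)) out.put(k.substring(PREFIX.length()), v);
        });
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> values = Map.of("key1", "value1", "key2", "value2");
        // Round-trips: what was flattened for Solr can be rebuilt into the Map.
        if (!unflatten(flatten(values)).equals(values)) throw new AssertionError();
    }
}
```

If the automatic mapping stays NULL, reading the raw SolrDocument and applying an unflatten step like this is a pragmatic fallback.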

Lucene query to match multiple words like MySQL's “LIKE %my string%”

一个人想着一个人 submitted on 2019-12-25 02:43:10

Question: I'm trying to match multiple words in Lucene the way I could in MySQL. It's harder than I thought. Written in PHP, my query for a perfect match is:

$words = explode(" ", $words);
(text:(' . implode(" ", $words) . ')

But if the text is "a bunch of words I wrote", it won't match until I have written everything. Is there any way to force Lucene to behave exactly like MySQL's LIKE "%a bunc%" and retrieve the whole phrase? Thanks in advance.

EDIT: I'm not using Lucene directly; I use Solr as a REST service…
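Lucene and Solr have no direct LIKE '%...%' operator; the closest analogue on an untokenized string field is a query with leading and trailing wildcards. To illustrate the semantics, a plain-Java sketch that interprets a wildcard pattern the way Lucene's WildcardQuery treats `*` (wildcardMatches is a hypothetical helper, not a Lucene API):

```java
import java.util.regex.Pattern;

public class WildcardDemo {
    // Translate a Lucene-style wildcard pattern (* = any run of characters)
    // into a Java regex and test it against the full field value.
    public static boolean wildcardMatches(String pattern, String text) {
        String regex = Pattern.quote(pattern).replace("*", "\\E.*\\Q");
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String text = "a bunch of words I wrote";
        // Equivalent of SQL LIKE '%a bunc%': leading and trailing wildcards.
        if (!wildcardMatches("*a bunc*", text)) throw new AssertionError();
        if (wildcardMatches("*xyz*", text)) throw new AssertionError();
    }
}
```

In Solr this corresponds to a wildcard query on a string (non-tokenized) field; note that leading wildcards are slow, and for genuine substring search an n-gram filter at index time is the usual recommendation.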

Lucene.NET - checking if document exists in index

不想你离开。 submitted on 2019-12-25 02:28:31

Question: I have the following code, using Lucene.NET v4, to check whether a file exists in my index:

bool exists = false;
IndexReader reader = IndexReader.Open(Lucene.Net.Store.FSDirectory.Open(lucenePath), false);
Term term = new Term("filepath", "\\myFile.PDF");
TermDocs docs = reader.TermDocs(term);
if (docs.Next())
{
    exists = true;
}

The file myFile.PDF definitely exists, but the check always comes back false. When I look at docs in the debugger, its Doc and Freq properties state that they "threw an exception of…

Lucene index and Windows DFS replication

♀尐吖头ヾ submitted on 2019-12-25 02:16:17

Question: I want to replicate a Lucene index to my web servers periodically. Apart from using Solr, can I set up DFS replication on my Windows 2008 servers and use that to replicate my indexes across my load-balanced web servers? Will that approach work, or will I have to write to two different index locations in parallel from my crawler code? Any help is appreciated. Thanks!

Answer 1: I'm not exactly sure of your question. You cannot have two writers writing to the same location, no matter what file system you use. So…

Getting the average of a sub-aggregation

谁说我不能喝 submitted on 2019-12-25 01:49:49

Question: I'd like to get the average of a sub-aggregation. For example, I have the daily profit of each branch. I want to sum them so that I get the total daily profit, and then I want the monthly or weekly average of that daily profit. So far I have done this:

{ "size": 0, "aggs": { "group_by_month": { "date_histogram": { "field": "Profit_Day", "interval": "month", "format": "MM-yyyy" }, "aggs": { "avgProf": { "avg": { "field": "ProfitValue" } }, "group_by_day": { "date_histogram": { "field": "Profit…
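An average over the buckets of a sub-aggregation is what Elasticsearch's pipeline aggregations (2.0+) call avg_bucket. A hedged sketch reusing the field names from the question; daily_profit and avg_daily_profit are illustrative names:

```json
{
  "size": 0,
  "aggs": {
    "group_by_month": {
      "date_histogram": { "field": "Profit_Day", "interval": "month", "format": "MM-yyyy" },
      "aggs": {
        "group_by_day": {
          "date_histogram": { "field": "Profit_Day", "interval": "day" },
          "aggs": {
            "daily_profit": { "sum": { "field": "ProfitValue" } }
          }
        },
        "avg_daily_profit": {
          "avg_bucket": { "buckets_path": "group_by_day>daily_profit" }
        }
      }
    }
  }
}
```

The inner sum produces a total per day; avg_bucket then averages those daily totals within each monthly bucket.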

Indexing Multilevel JSON Objects in Lucene

て烟熏妆下的殇ゞ submitted on 2019-12-25 01:34:14

Question: I am new to Lucene. I have worked on Lucene search using field-value pairs in documents. Now there is a requirement to parse some JSON files and index them for Lucene search. I have an idea of how to work with a simple JSON file according to this article, but the JSON structure I have to work with is a little more complex than that. Any ideas would be appreciated. Thanks.

Answer 1: You can basically linearize your JSON and then index it as in the article you provided. For instance, JSON…
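The linearization the answer describes can be sketched in plain Java: walk the nested structure and emit dot-separated field names, each of which then becomes an ordinary Lucene field. Here flatten is a hypothetical helper and the nested maps stand in for parsed JSON (a real parser such as Jackson would supply them):

```java
import java.util.HashMap;
import java.util.Map;

public class JsonFlattenDemo {
    // Recursively flatten nested maps: {"a": {"b": "c"}} -> {"a.b": "c"}.
    public static Map<String, String> flatten(String prefix, Map<String, Object> node) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                out.putAll(flatten(key, child));   // descend into the nested object
            } else {
                out.put(key, String.valueOf(e.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new HashMap<>();
        user.put("name", "Alice");
        user.put("age", 30);
        Map<String, Object> json = new HashMap<>();
        json.put("user", user);
        json.put("id", 7);
        Map<String, String> flat = flatten("", json);
        if (!"Alice".equals(flat.get("user.name"))) throw new AssertionError();
        if (!"7".equals(flat.get("id"))) throw new AssertionError();
    }
}
```

Each flattened key/value pair can then be added to a Lucene Document as a normal field, and searched as "user.name:Alice".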