lucene | 易学教程

How to customize Lucene.NET to search for words with symbols without case-sensitivity (e.g. “C#” or “.net”)?

Question: The standard analyzer does not work. From what I can understand, it changes this to a search for c and net. The WhitespaceAnalyzer would work, but it's case-sensitive. The general rule is that search should work like Google, so I'm hoping it's a configuration issue, considering .net and c# have been around for a while, or that there's a workaround. Per the suggestions below, I tried the custom WhitespaceAnalyzer, but keywords separated by a comma with no space are then not handled correctly e
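
To illustrate the behavior the question is after, here is a minimal Python sketch of the tokenization rule, not Lucene.NET code. It assumes tokens are split on whitespace and commas only, are lower-cased, and lose trailing sentence punctuation; the function name `tokenize` and the exact punctuation set are hypothetical choices for this illustration.

```python
import re

# Split on whitespace and commas only, so '#', '+' and '.' stay inside tokens.
_SEPARATORS = re.compile(r"[\s,]+")


def tokenize(text):
    """Return lower-cased tokens, keeping symbols such as '#' and a leading '.'."""
    tokens = []
    for raw in _SEPARATORS.split(text):
        # Trim sentence punctuation from the end only, so ".net" keeps its dot.
        token = raw.rstrip(".;:!?").lower()
        if token:
            tokens.append(token)
    return tokens


print(tokenize("C#,.NET and c++ developers."))
# ['c#', '.net', 'and', 'c++', 'developers']
```

In an analyzer chain the same idea would likely map to a tokenizer that treats only whitespace and commas as separators, followed by a lower-casing filter, applied identically at index time and query time.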

Search in Solr with special characters

Question: I have a problem searching with special characters in Solr. My document has a field "title", and sometimes its value is something like "Titanic - 1999" (it contains the character "-"). When I try to search in Solr with "-", I receive a 400 error. I've tried to escape the character, so I tried something like "-" and "\-". With those changes Solr doesn't respond with an error, but it returns 0 results. How can I search in the Solr admin with such a special character (something like "-" or "'")? Regards UPDATE
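
As a rough sketch of the escaping side of this problem, the Python below backslash-escapes the characters that the classic Lucene/Solr query syntax treats as operators, wraps a value in a quoted phrase, and URL-encodes the result. The helper names (`escape_query_term`, `quote_phrase`, `build_query_string`) are hypothetical. Escaping only gets the query past the parser: whether "-" can match at all depends on how the field's analyzer tokenizes it, which may explain a result count of 0.

```python
import re
from urllib.parse import urlencode

# Characters with special meaning in the classic Lucene/Solr query syntax.
_SPECIAL = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')


def escape_query_term(term):
    """Prefix every query-syntax character with a backslash."""
    return _SPECIAL.sub(lambda m: "\\" + m.group(0), term)


def quote_phrase(text):
    """Wrap text in double quotes; only backslash and quote need escaping inside."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_query_string(field, value):
    """URL-encode a phrase query on one field for use in a request URL."""
    return urlencode({"q": field + ":" + quote_phrase(value)})


print(escape_query_term("Titanic - 1999"))
print(build_query_string("title", "Titanic - 1999"))
```

Sending the value as a quoted phrase avoids the leading "-" being read as the NOT operator, and URL-encoding avoids malformed requests.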

Solr MoreLikeThis boosting query fields

Question: I am experimenting with Solr's MoreLikeThis feature. My schema deals with articles, and I'm looking for similarities between articles within three fields: articletitle, articletext and topic. The following query works well: q=id:(2e2ec74c-7c26-49c9-b359-31a11ea50453) &rows=100000000&mlt=true &mlt.fl=articletext,articletitle,topic&mlt.boost=true&mlt.mindf=1&mlt.mintf=1 But I would like to experiment with boosting different query fields, i.e. putting more weight on similarities in the
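
One parameter worth trying here is `mlt.qf`, which Solr documents as a list of query fields with per-field boosts (the fields must also appear in `mlt.fl`). The Python sketch below only assembles the request parameters. The boost values are hypothetical, since the excerpt does not say which field should weigh more, and whether the boosts have the intended effect may depend on the Solr version and on whether the MoreLikeThis handler or component is used.

```python
from urllib.parse import urlencode


def mlt_params(doc_id, field_boosts, rows=10):
    """Build MoreLikeThis parameters with per-field boosts in mlt.qf."""
    return {
        "q": f"id:({doc_id})",
        "rows": rows,
        "mlt": "true",
        # mlt.fl lists the similarity fields; mlt.qf repeats them with boosts.
        "mlt.fl": ",".join(field_boosts),
        "mlt.qf": " ".join(f"{name}^{boost}" for name, boost in field_boosts.items()),
        "mlt.boost": "true",
        "mlt.mindf": 1,
        "mlt.mintf": 1,
    }


# Hypothetical weights, chosen only for illustration.
params = mlt_params(
    "2e2ec74c-7c26-49c9-b359-31a11ea50453",
    {"articletitle": 2.0, "articletext": 1.0, "topic": 0.5},
)
print(urlencode(params))
```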

How can I get unique suggestions without duplicates when I use the completion suggester?

Question: I am using Elastic 5.1.1 in my environment. I have chosen the completion suggester on a field named post_hashtags, which holds an array of strings, to get suggestions on it. I am getting the response below for the prefix "inv". Req: POST hashtag/_search?pretty&&filter_path=suggest.hash-suggest.options.text,suggest.hash-suggest.options._source {"_source":["post_hashtags" ], "suggest": { "hash-suggest" : { "prefix" : "inv", "completion" : { "field" : "post_hashtags" } } } Response : { "suggest": { "hash-suggest":
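
One workaround that does not depend on server-side support is to de-duplicate the options in the client. The Python sketch below assumes the usual shape of a completion-suggester response (a list of entries under the suggester name, each with an `options` list); the sample response values are invented for illustration, since the response in the excerpt is cut off.

```python
def unique_suggestions(response, suggester="hash-suggest"):
    """Return suggestion texts in first-seen order, dropping duplicates."""
    seen = set()
    unique = []
    for entry in response.get("suggest", {}).get(suggester, []):
        for option in entry.get("options", []):
            text = option.get("text")
            if text is not None and text not in seen:
                seen.add(text)
                unique.append(text)
    return unique


# Hypothetical response: the same hashtag is suggested from two documents.
sample = {
    "suggest": {
        "hash-suggest": [
            {"options": [{"text": "invest"}, {"text": "invite"}, {"text": "invest"}]}
        ]
    }
}
print(unique_suggestions(sample))
# ['invest', 'invite']
```

Because duplicates are removed after the fact, the request may need a larger suggestion size than the number of unique results wanted.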

Finding the number of documents in a Lucene index

Question: Using Java, how would you find out the number of documents in a Lucene index? Answer 1: IndexReader contains the methods you need, in particular numDocs: http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexReader.html#numDocs() Answer 2: The official documentation: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/IndexReader.html#numDocs() Answer 3: Using Java, you can find the number of documents like this: IndexReader reader = IndexReader.open(FSDirectory.open

How to open a Lucene 4.3 index?

Question: I am a Lucene newbie and I am trying to open a Lucene 4.3 index (which I am creating with my simple Lucene 4.3.1 app) using Luke, but it keeps giving me: Invalid directory at the location, check console for more information. Last exception: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene42' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath.The current classpath supports the following names:

Lucene filter with docIds

Question: I'm trying to do the following: I want to create a set of candidates by querying each field separately and then adding the top k matches to this set. After I'm done with that, I need to run another query on this candidate set. My current implementation uses a QueryWrapperFilter with a BooleanQuery that matches the unique id field of each candidate document. However, this means I have to call IndexSearcher.doc().get("docId") for each candidate document before I can add it to

Query in Lucene

Question: The structure of the table "testtable" is: id int primary key, productid int, attributeid int, value varchar(250), where productid is the unique id of a product, attributeid is the unique id of an attribute of a product (e.g. size, quality, height, color), and 'value' is the value for the attribute by which I have to filter a result. I achieve the requirement with this SQL query, but I am not able to express it as a Lucene query. select a.* from dbo.testtable a where a.attributeId=10 and a.[Value]='Romance' and productId in (

Sphinx/Solr/Lucene/Elastic Relevancy

Question: We have an extremely large database of 30+ million products, and need to query them to create search results and ad displays thousands of times a second. We have been looking into Sphinx, Solr, Lucene, and Elastic as options to perform these constant massive searches. Here's what we need to do: take keywords and run them through the database to find the products that match most closely. We're going to be using our OWN algorithm to decide which products are most related to target our advertisements

Lucene 3.0.3 Numeric term query

Question: I have a numeric field in Lucene 3.0.3 and it works perfectly fine with range queries. If we switch to a TermQuery, it doesn't produce any results. For example: Document doc = new Document(); String name = "geolongitude"; NumericField numericField = new NumericField(name); double value = 29.0753505; String valueAsString = "29.0753505"; numericField.setDoubleValue(value); doc.add(numericField); indexWriter.addDocument(doc); indexWriter.commit(); indexWriter.close(); IndexSearcher