lucene.net

What are some good resources on using Lucene.Net? [closed]

只谈情不闲聊 posted on 2019-12-03 00:49:41
Question (closed as off-topic): Does anyone know where I can find more information on Lucene.Net? I am looking for a tutorial or videos on how to use Lucene.Net that Stack Overflow users can personally recommend.

Answer 1: There are some great articles on CodeProject: http://www.codeproject.com/KB/library/IntroducingLucene.aspx http://www

Avoid removal of current Lucene.NET index during rebuild

偶尔善良 posted on 2019-12-02 23:34:45
I'm new to Lucene.NET, but I'm using an open source tool built for Sitecore CMS that uses Lucene.NET to index lots of content from the CMS. I confirmed yesterday that when I rebuild my indexes, the current index files are wiped clean, so anything that relies on the index gets no data for about 30-60 seconds (the time a full index rebuild takes). Is there a best practice or way to make Lucene.NET not overwrite the current index files until the new index is completely rebuilt? I'm basically thinking I'd like it to write to new temp index files and when the rebuild is done have those files

How might I index PDF files using Lucene.Net?

和自甴很熟 posted on 2019-12-02 21:27:27
Question: I'm looking for some sample code demonstrating how to index PDF documents using Lucene.Net and C#. Google turned up a few, but none that I found helpful.

Answer: From my understanding, Lucene is limited to creating an index and searching that index. It's up to the application to handle opening files and extracting their contents for the index. So if you're looking to search PDF documents, you'll want to use something like iTextSharp to open the file, pull out the contents, and pass them to Lucene for indexing. There are some good starting examples of using Lucene on the Dimecasts.net website.

Lucene Standard Analyzer vs Snowball

隐身守侯 posted on 2019-12-02 20:30:14
Question: Just getting started with Lucene.Net. I indexed 100,000 rows using the standard analyzer, ran some test queries, and noticed that plural queries don't return results if the original term was singular. I understand the Snowball analyzer adds stemming support, which sounds nice. However, I'm wondering if there are any drawbacks to going with Snowball over standard. Am I losing anything by going with it? Are there any other analyzers out there to consider?

Answer: Yes, by using a stemmer such as Snowball, you are losing information about the original form of your text. Sometimes this will be useful, sometimes not. For

How to make the Lucene QueryParser more forgiving?

て烟熏妆下的殇ゞ posted on 2019-12-02 20:15:30
I'm using Lucene.Net, but I am tagging this question for both the .NET and Java versions because the API is the same and I'm hoping there are solutions on both platforms. I'm sure other people have addressed this issue, but I haven't been able to find any good discussions or examples. By default, Lucene is very picky about query syntax. For example, I just got the following error:

[ParseException: Cannot parse 'hi there!': Encountered "<EOF>" at line 1, column 9. Was expecting one of: "(" ... "*" ... <QUOTED> ... <TERM> ... <PREFIXTERM> ... <WILDTERM> ... "[" ... "{" ... <NUMBER> ... ] Lucene.Net
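
One way to avoid this kind of ParseException is to backslash-escape query-syntax characters in raw user input before it reaches the parser. Lucene provides a static helper for this (QueryParser.escape in Java, QueryParser.Escape in Lucene.Net). The block below is only a dependency-free sketch of the same idea: the QueryEscaper class and its SPECIAL character set are illustrative assumptions, not Lucene's own implementation, and the exact set of operator characters may differ between Lucene versions.

```java
/**
 * Illustrative sketch (not Lucene's code): backslash-escapes characters
 * that the classic Lucene query syntax treats as operators.
 */
final class QueryEscaper {
    // Assumption: this set approximates the operator characters of the
    // classic query syntax; check it against the Lucene version in use.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    private QueryEscaper() {}

    /** Returns userInput with every character in SPECIAL prefixed by a backslash. */
    static String escape(String userInput) {
        StringBuilder out = new StringBuilder(userInput.length() + 8);
        for (int i = 0; i < userInput.length(); i++) {
            char c = userInput.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\'); // neutralize the operator character
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

With this sketch, the input from the error above, hi there!, becomes hi there\! and the trailing ! is no longer read as an operator. The trade-off is that escaping everything also disables wildcards, ranges, and boolean operators that a user typed on purpose, so it suits plain search boxes better than advanced query forms.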

Using BooleanQuery or write more indexes?

烂漫一生 posted on 2019-12-02 19:37:53
Question: Given a category tree like this:

root_1
    sub_1
    sub_2
    ... to sub_20

Every document has a sub-category (like sub_2). Currently, I only write sub_2 to the Lucene index:

new NumericField("category",...).setIntValue(sub_2.getID());

I want to get all of root_1's documents. Should I use a BooleanQuery (merging sub_1 to sub_20) to search, or write another category into every document entry:

new NumericField("category",...).setIntValue(sub_2.getID());
new NumericField("category",...).setIntValue(root_1.getID());//sub_2's

Lucene.Net Best Practices

筅森魡賤 posted on 2019-12-02 14:30:20
Question: What are the best practices for using Lucene.Net? Or where can I find a good Lucene.Net usage sample?

Answer (Razzie): If you're going to work with Lucene, I'd buy a good book that covers it from A to Z. Lucene has a very steep learning curve (in my opinion). It's not only knowing how to search your data that's important; it's also about indexing it. Doing a basic search is easy, but creating an index that consists of millions of records and still being able to run a lightning-fast search over it is possible but pretty hard. There's no tutorial that teaches you that. I'd recommend Lucene in Action,

Combining Lucene's WildcardQuery with FuzzyQuery

筅森魡賤 posted on 2019-12-02 11:48:18
Question: Using Lucene.Net 2.4.0, is there some kind of built-in support for joining the results of two different queries that target the same index, similar to the support for targeting two or more indexes with a single query? I'm looking for ways to support both trailing wildcard and fuzzy searches without forcing users to choose one or the other. I could achieve this by executing a wildcard query and a fuzzy search sequentially, and then manually merging the two results and sorting by the score of the

Lucene.Net fuzzy search speed

不羁的心 posted on 2019-12-02 11:43:20
Sorry for the bother, but I hope to get some help from people experienced with Lucene. Our application currently uses Lucene.Net 3.0.3 to index and search ~2,500,000 items. Each entity contains 27 searchable fields, which are added to the index this way: new Field(key, value, Field.Store.YES, Field.Index.ANALYZED) We now have two search options:

1. Search only 4 fields using fuzzy search
2. Search 4-27 fields using exact search

We have a search service that automatically searches every week for about 53,000 people, such as "Bob Huston", "Sara Conor", "Sujan Hong Uin Ho", etc. So we experience slow search

Lucene Search Syntax

房东的猫 posted on 2019-12-02 10:20:24
Question: I need help figuring out which query types to use in given situations. I think I'm right in saying that if I stored the word "FORD" in a Lucene Field and wanted to find an exact match, I would use a TermQuery? But which query type should I use if I was looking for the word "FORD" where the contents of the field were stored as "FORD|HONDA|SUZUKI"? What if I was to search the contents of an entire page, looking for a phrase such as "please help me"?

Answer: If you want to search for FORD in FORD|HONDA|SUZUKI, either index with Field.Index.ANALYZED, or store it as below to use TermQuery

var analyzer