lucene.net

Correctly indexing latitude and longitude values in Lucene

只愿长相守 posted on 2019-12-03 08:19:26
I am working on a "US-based nearest city search within a given radius" feature using the Lucene API. I am indexing each city's lat and long values in Lucene as follows:

    doc.Add(new Field("latitude", paddedLatitude, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.Add(new Field("longitude", paddedLongitude, Field.Store.YES, Field.Index.UN_TOKENIZED));

Since Lucene only understands strings and not numbers, I am padding the lat and long values. For example, if the original lat and long are 41.811846 and -87.820628 respectively, after padding the values look like: paddedLatitude -->"0041.811846" and paddedLongitude-->
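The excerpt is cut off before the padded longitude is shown, so the asker's exact scheme is unknown. One thing worth keeping in mind is that zero-padding alone does not make negative values sort in numeric order when compared as strings. A minimal sketch of one common workaround, in plain Java with no Lucene dependency, is to shift each value into a non-negative range before padding. The class name GeoPadding, the offsets (90 and 180), and the fixed width of 10 characters are all assumptions made for illustration, not the scheme from the question.

```java
import java.util.Locale;

final class GeoPadding {
    // Shift the value by an offset so it is never negative, then zero-pad
    // to a fixed width so that string order matches numeric order.
    // Width 10 = 3 integer digits + '.' + 6 decimals (assumed precision).
    static String encode(double value, double offset) {
        return String.format(Locale.ROOT, "%010.6f", value + offset);
    }

    // Latitude ranges over [-90, 90]; adding 90 maps it to [0, 180].
    static String encodeLatitude(double latitude) {
        return encode(latitude, 90.0);
    }

    // Longitude ranges over [-180, 180]; adding 180 maps it to [0, 360].
    static String encodeLongitude(double longitude) {
        return encode(longitude, 180.0);
    }
}
```

With this encoding, GeoPadding.encodeLatitude(41.811846) gives "131.811846" and GeoPadding.encodeLongitude(-87.820628) gives "092.179372". The resulting strings can be compared lexicographically for range checks, and the original value is recovered by parsing and subtracting the offset.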

Lucene IndexWriter slow to add documents

霸气de小男生 posted on 2019-12-03 08:15:16
I wrote a small loop which added 10,000 documents into the IndexWriter and it took forever to do it. Is there another way to index large volumes of documents? I ask because when this goes live it has to load in 15,000 records. The other question is: how do I avoid having to load in all the records again when the web application is restarted?

Edit: Here is the code I used:

    for (int t = 0; t < 10000; t++) {
        doc = new Document();
        text = "Value" + t.ToString();
        doc.Add(new Field("Value", text, Field.Store.YES, Field.Index.TOKENIZED));
        iwriter.AddDocument(doc);
    }

Edit 2: Analyzer analyzer = new

Finding exact match using Lucene search API

夙愿已清 posted on 2019-12-03 07:46:52
I'm working on a company search API using Lucene. My Lucene company index has two companies:

1. Abigail Adams National Bancorp, Inc.
2. National Bancorp

If the user types in National Bancorp, then only company #2 (i.e. National Bancorp) should be returned and not #1; that is, only exact matches should be returned. How do I achieve this functionality? Thanks for reading.

You can use KeywordAnalyzer to index and search on this field. KeywordAnalyzer will generate only one token for the entire string.

This is something that may warrant the use of the shingle filter. This filter groups multiple
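The difference between treating the whole field value as one token and splitting it into words can be illustrated without Lucene. The sketch below is plain Java and only mimics the two behaviors; the class ExactMatchSketch and both method names are hypothetical, the word splitting on non-word characters is an assumption, and real analyzers may also differ in case handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExactMatchSketch {
    // Keyword-style matching: the whole field value is a single term,
    // so only an identical string matches.
    static List<String> keywordMatches(List<String> names, String query) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (name.equals(query)) {
                hits.add(name);
            }
        }
        return hits;
    }

    // Tokenized-style matching: the value is split into words, and a value
    // matches when it contains every word of the query.
    static List<String> tokenizedMatches(List<String> names, String query) {
        List<String> queryWords = Arrays.asList(query.split("\\W+"));
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (Arrays.asList(name.split("\\W+")).containsAll(queryWords)) {
                hits.add(name);
            }
        }
        return hits;
    }
}
```

For the two companies in the question, keywordMatches returns only "National Bancorp" for the query "National Bancorp", while tokenizedMatches returns both names, which is the behavior the asker wants to avoid.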

Lucene Standard Analyzer vs Snowball

梦想与她 posted on 2019-12-03 06:51:28
Question: Just getting started with Lucene.Net. I indexed 100,000 rows using the standard analyzer, ran some test queries, and noticed plural queries don't return results if the original term was singular. I understand the snowball analyzer adds stemming support, which sounds nice. However, I'm wondering if there are any drawbacks to going with snowball over standard? Am I losing anything by going with it? Are there any other analyzers out there to consider? Answer 1: Yes, by using a stemmer such as Snowball, you

How to make the Lucene QueryParser more forgiving?

感情迁移 posted on 2019-12-03 06:36:42
Question: I'm using Lucene.net, but I am tagging this question for both .NET and Java versions because the API is the same and I'm hoping there are solutions on both platforms. I'm sure other people have addressed this issue, but I haven't been able to find any good discussions or examples. By default, Lucene is very picky about query syntax. For example, I just got the following error: [ParseException: Cannot parse 'hi there!': Encountered "<EOF>" at line 1, column 9. Was expecting one of: "(" ... "*"
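The excerpt ends before any answer is shown. One common way to make user input safe for the parser is to backslash-escape characters that the query syntax treats as operators; Lucene's classic QueryParser offers a static escape helper for this purpose. The sketch below only illustrates the idea in plain Java: the class QueryEscaper is hypothetical, and the list of special characters is an assumption that may not match every Lucene version.

```java
final class QueryEscaper {
    // Characters assumed to have special meaning in the classic query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Prefix every special character with a backslash so the parser
    // treats it as literal text instead of an operator.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

Applied to the input from the error message, QueryEscaper.escape("hi there!") yields the text hi there\! so the trailing exclamation mark is no longer read as a NOT operator. The trade-off is that escaping everything also disables operators the user typed on purpose, such as quotes for phrases.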

Need to know pros and cons of using RAMDirectory

我们两清 posted on 2019-12-03 06:32:50
I need to improve the performance of my Lucene search query. Can I use RAMDirectory? Does it optimize performance? Is there any index size limit for this? I would appreciate it if someone could list the pros and cons of using a RAMDirectory. Thanks.

I compared FSDirectory and RAMDirectory. Index size is 1.4G. CentOS, 5G memory. Searching 1000 keywords, the average/min/max response times (ms) are:

FSDirectory
first run: 351/7/2611
second run: 47/7/837
third run (restart app): 53/7/2343

RAMDirectory
first run: 38/7/1133
second run: 34/7/189
third run (restart app): 38/7/959

So, you can see RAMDirectory is do

How to index Word 2003, 2007 and 2010 documents using Lucene.NET

亡梦爱人 posted on 2019-12-03 03:41:04
I am writing a custom Lucene.NET indexer to enable indexing of MS Word documents. The indexer must be capable of handling the last three releases of MS Word: 2010, 2007 and 2003. The plan is to use the VSTO interop assemblies that are installed as part of VS2010 to extract text content from the documents. Is there a better way to implement Word document indexing? Does this mean I will have to install all three versions of Word on the server? Or just Word 2010?

Tools/Environment:
Lucene.NET 2.3.1.3
VS2010 / .NET 3.5
Windows 2008 / IIS 7

Note: For details on how to implement this, see Sitecore text

Are there any books on Lucene.NET [closed]

十年热恋 posted on 2019-12-03 03:27:16
I have searched on Amazon and could not find a book on Lucene.NET. Have you guys come across a decent book on Lucene.NET?

You may want to look at Lucene in Action. Since Lucene.NET is a .NET port of the project, you may find it covers the necessary concepts, even though it's for Java. There should be a 2nd edition of it coming out soon. The book covers: How to integrate Lucene into your applications Ready-to-use

Implement Lucene on Existing .NET / SQL Server stack with multiple webservers

丶灬走出姿态 posted on 2019-12-03 02:11:00
Question: I want to look at using Lucene for a full-text search solution for a site that I currently manage. The site is built entirely on SQL Server 2008 / C# .NET 4 technologies. The data I'm looking to index is actually quite simple, with only a couple of fields per record and only one of those fields actually searchable. It's not clear to me which toolset I should be using, or what the architecture should be. Specifically: Where should I put the index? I've seen people

Lucene.Net Best Practices

天涯浪子 posted on 2019-12-03 01:10:32
Question: What are the best practices in using Lucene.Net? Or where can I find a good Lucene.Net usage sample? Answer 1: If you're going to work with Lucene, I'd buy a good book that covers it from A to Z. Lucene has a very steep learning curve (in my opinion). It's not only knowing how to search your that's important - it's also about indexing it. Doing a basic search is easy, but creating an index that consists of millions of records of data and still being able to do a lightning fast search over it is