lucene.net | 易学教程

Lucene.Net and I/O Threading issue

Question: I have an indexing function named "Execute()" that uses IndexWriter to index my site's content. It works fine when I call it directly from a web page, but it fails when I pass it as a delegate to System.Threading.Thread. Strangely, it always works on my local dev machine and fails only after I upload it to a shared host. The error message is "Lock obtain timed out: SimpleFSLock error....". Below is the failing code (it fails only on the shared host): Scheduler scheduler = new

Lucene.Net and incubation status

Question: I'm evaluating options to make search more powerful on our .NET website. I need to decide whether we purchase software/hardware such as the Google Search Appliance (GSA) or develop the solution ourselves using a framework such as Lucene.Net. We're a startup, and the GSA provides a lot of good functionality out of the box, but we would need two boxes, with the second as the backup/dev environment, and things start getting expensive. We have used SQL Server full text in the past, but we're keen

Index replication and Load balancing

Question: I am using the Lucene API in my web portal, which will have thousands of concurrent users. Our web server will call the Lucene API, which will sit on an app server. We plan to use two app servers for load balancing. Given this, what should our strategy be for replicating the Lucene indexes to the second app server? Any tips, please? Answer 1: You could use Solr, which has built-in replication. This is possibly the best and easiest solution, since it would probably take quite a lot of work to implement your

Ignore special characters in Examine

Question: In Umbraco, I use Examine to search the website, but the content is in French. Everything works fine except that a search for "Français" does not return the same results as a search for "Francais". Is there a way to ignore those French characters? I tried to find a FrenchAnalyzer for Lucene/Examine but did not find anything. I use fuzzy matching so that it returns results even when the words are not identical. Here's the code of my search: public static ISearchResults Search(string searchTerm) { var provider = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"]; var criteria = provider.CreateSearchCriteria
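One common way to make "Français" and "Francais" match is to strip diacritics from both the indexed text and the search term before they are compared. The following is a minimal sketch in plain C#, not an Examine or Lucene.Net API: the class `AccentFolding` and the method `RemoveDiacritics` are hypothetical names introduced only for illustration, and how the folded text is wired into an index or a search provider is left open.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of Examine or Lucene.Net.
public static class AccentFolding
{
    // Decomposes each character into base letter + combining marks,
    // drops the combining marks, then recomposes what is left.
    public static string RemoveDiacritics(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        string decomposed = text.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // NonSpacingMark covers accents such as the cedilla and the acute accent.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        return builder.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Under this sketch, the same folding would have to be applied at index time and at query time; folding only the search term would not help if the index still contains the accented form.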

How to index files such as .txt, .pdf, .doc etc. using Lucene.Net?

Question: I am new to Lucene.Net. How do I index files such as .txt, .pdf, .doc etc. using Lucene.Net, and which file types can be indexed with it? Answer 1: Lucene.Net is agnostic about file types; it does not parse particular files, so you must extract the text yourself. I would use IFilters to pull the text out of a document and then use Lucene.Net to create the search index. You can search codeproject.com for multiple articles about using IFilters with Lucene.Net. Answer 2: Before you index files you need to extract text from them in a

Search for two-letter words in Lucene

Question: I'm trying to find documents containing the acronym "IT". I've tried searching with the StandardAnalyzer, SimpleAnalyzer and KeywordAnalyzer, all with the same result (no hits whatsoever). As far as I can see, "it" isn't part of the default stop words? I can find the documents using a wildcard search, so I know they're in the index. Any help is greatly appreciated! Cheers! Answer 1: The default stopword set does include the word "it". It is defined in StopAnalyzer, and it is: final List<String> stopWords =
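To make the effect of the answer concrete, here is a small self-contained sketch in plain C# of what lowercasing followed by stop word filtering does to the acronym "IT". It is not Lucene code: the class `StopWordDemo`, the method `Analyze`, and the whitespace-only tokenization are assumptions made for illustration, and the stop set shown is only an illustrative subset that includes "it", as the answer states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of Lucene or Lucene.Net.
public static class StopWordDemo
{
    // Illustrative subset of an English stop set; the answer above
    // confirms that the default set contains "it".
    public static readonly HashSet<string> SampleStopWords =
        new HashSet<string> { "a", "an", "and", "is", "it", "of", "the", "to" };

    // Splits on spaces, lowercases each token, then drops stop words.
    public static List<string> Analyze(string text, ISet<string> stopWords)
    {
        return text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.ToLowerInvariant())
            .Where(token => !stopWords.Contains(token))
            .ToList();
    }
}
```

With `SampleStopWords`, `Analyze("Our IT department", ...)` yields `our` and `department`, because "IT" is lowercased to "it" and then removed. Passing an empty set instead keeps all three tokens, which is the general idea behind supplying a custom stop set when an acronym like this must remain searchable.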

Lucene.NET in medium trust

Question: How do I make Lucene.NET 2.3.2 run in a medium trust environment? GoDaddy doesn't like it the way it is. Answer 1: I recently struggled with this and wanted to share a solution I got to work. I pulled down the latest code and built it myself so I could make changes if needed. In the SupportClass.cs file, starting at line 481, there is some code that uses unmanaged code to verify that a file buffer has been flushed. if (OS.IsWindows) { if (!FlushFileBuffers(fileStream.Handle)) throw new

""black lab*" "pet shop""~5 in Lucene (proximity search with multi-word phrases)

Question: How can I do a proximity search for two multi-word phrases in Lucene? For example, I want to find all occurrences of black lab* (black labrador, black labradoodle, etc.) within 5 words of the phrase "pet shop". Which analyzer should I be using? Which query parser would be recommended? I'm working with Lucene.NET. I've ported the ComplexPhraseQueryParser from Java to C#, but that parser doesn't seem to be doing the trick (or perhaps I'm just using it wrong). I'm just getting started with Lucene, so your help

Components not indexed in Sitecore Lucene search indexes

Question: I have configured a Lucene search index in the configuration and tested it with the lukeall tool. It searches all fields of the defined templates, but the content on the pages comes from another, external component, and that content is not searched; only the data in the page's own fields is searchable. Is there any way to search it, something like an HTML search, so that all the data on the page can be indexed? Thanks, guys. Answer 1: It's a common requirement. This screencast outlines an approach where the crawler loops through each of the page's