Question
I have the following code, using Lucene.NET V4, to check if a file exists in my index.
    bool exists = false;
    // The second argument to IndexReader.Open is the readOnly flag.
    IndexReader reader = IndexReader.Open(Lucene.Net.Store.FSDirectory.Open(lucenePath), false);
    Term term = new Term("filepath", "\\myFile.PDF");
    TermDocs docs = reader.TermDocs(term);
    if (docs.Next())
    {
        exists = true;
    }
The file myFile.PDF definitely exists, but it always comes back as false. When I look at docs in the debugger, its Doc and Freq properties state that they "threw an exception of type 'System.NullReferenceException'".
Answer 1:
First of all, it's good practice to reuse the same IndexReader instance if you don't need to account for deleted documents: it performs better, and it's thread-safe, so you can keep it in a static read-only field (although I can see that you're passing false for the readOnly parameter, so if that is intentional, just ignore this paragraph).
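A minimal sketch of the shared-reader idea, assuming the Lucene.NET 3.0.x-style API that the question's code appears to use. The class name IndexAccess and the index path are hypothetical placeholders, not part of the original code.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Store;

public static class IndexAccess
{
    // Hypothetical index location; substitute your own lucenePath.
    private const string LucenePath = @"C:\path\to\index";

    // One shared reader for the whole application, opened with readOnly: true.
    private static readonly IndexReader Reader =
        IndexReader.Open(FSDirectory.Open(new System.IO.DirectoryInfo(LucenePath)), true);

    public static bool Exists(string filepath)
    {
        // DocFreq counts documents containing the exact term; it may still
        // count deleted documents that have not been merged away yet.
        return Reader.DocFreq(new Term("filepath", filepath)) > 0;
    }
}
```

Keep in mind that a reader only sees the index as it was when the reader was opened, so a long-lived instance will not pick up documents added later unless it is reopened.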
As for your case, are you tokenizing the filepath field values? If you are (e.g. by using StandardAnalyzer when indexing/searching), you will probably have problems finding values such as \myFile.PDF (with the default tokenizer, the value is going to be split into myFile and PDF; I'm not sure about the leading backslash).
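One way to settle what actually ends up in the index is to run the value through the analyzer and print the resulting tokens. This is a rough sketch, assuming Lucene.NET 3.0.x and assuming StandardAnalyzer was the analyzer used at index time; swap in whichever analyzer you really used.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Tokenattributes;
using Version = Lucene.Net.Util.Version;

class ShowTokens
{
    static void Main()
    {
        // Assumption: this matches the analyzer used when the index was built.
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

        TokenStream stream = analyzer.TokenStream("filepath", new StringReader(@"\myFile.PDF"));
        ITermAttribute termAttr = stream.AddAttribute<ITermAttribute>();

        stream.Reset();
        while (stream.IncrementToken())
        {
            // Each line printed is a term as it would be stored in the index.
            Console.WriteLine(termAttr.Term);
        }
    }
}
```

If none of the printed terms is exactly \myFile.PDF, a Term built from the raw path cannot match anything in the index.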
Hope this helps.
Answer 2:
You may have analyzed the field "filepath" during indexing with an analyzer which tokenizes/changes the content. e.g. the StandardAnalyzer tokenizes, lowercases, removes stopwords if specified etc.
If you only need to query with the exact filepath, as in your example, use the KeywordAnalyzer for this field during indexing.
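As a sketch of what that could look like, assuming Lucene.NET 3.0.x: a PerFieldAnalyzerWrapper applies KeywordAnalyzer to filepath only, while other fields keep a default analyzer. The in-memory RAMDirectory and the StandardAnalyzer default are assumptions made to keep the example self-contained.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

class KeywordFieldDemo
{
    static void Main()
    {
        // KeywordAnalyzer emits the whole field value as one unmodified token.
        var perField = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
        perField.AddAnalyzer("filepath", new KeywordAnalyzer());

        var dir = new RAMDirectory(); // in-memory index, for illustration only
        using (var writer = new IndexWriter(dir, perField, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("filepath", @"\myFile.PDF", Field.Store.YES, Field.Index.ANALYZED));
            writer.AddDocument(doc);
        }

        using (var reader = IndexReader.Open(dir, true))
        {
            // The exact, case-sensitive path is now a single term in the index.
            bool exists = reader.DocFreq(new Term("filepath", @"\myFile.PDF")) > 0;
            Console.WriteLine(exists);
        }
    }
}
```

Passing Field.Index.NOT_ANALYZED for the filepath field should have a similar effect without needing the wrapper, since the value is then indexed as a single term.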
If you can't re-index at the moment, you need to find out which analyzer was used during indexing and use it to create your query. You have two options:
- Use a query parser with the right analyzer and parse the query filepath:\\myFile.PDF. If the resulting query is a TermQuery, you can use its term as you did in your example. Otherwise, perform a search with the query.
- Use the Analyzer directly to create the terms from the TokenStream object. Again, if there is only one term, do it as you did; if there are multiple terms, create a phrase query.
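The first option might be sketched as follows, assuming Lucene.NET 3.0.x. The helper name FileLookup.Exists and its parameters are hypothetical; indexTimeAnalyzer stands for whichever analyzer was used when the index was built.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

static class FileLookup
{
    public static bool Exists(IndexReader reader, Analyzer indexTimeAnalyzer, string filepath)
    {
        var parser = new QueryParser(Version.LUCENE_30, "filepath", indexTimeAnalyzer);

        // Escape query-syntax characters (the backslash is one of them)
        // so the path is parsed as plain text.
        Query query = parser.Parse(QueryParser.Escape(filepath));

        var termQuery = query as TermQuery;
        if (termQuery != null)
        {
            // A single term came out of analysis: check it as in the question.
            using (TermDocs docs = reader.TermDocs(termQuery.Term))
            {
                return docs.Next();
            }
        }

        // Anything else (e.g. a phrase query): run a normal search.
        var searcher = new IndexSearcher(reader);
        return searcher.Search(query, 1).TotalHits > 0;
    }
}
```

A call such as FileLookup.Exists(reader, analyzer, @"\myFile.PDF") would then go through the same analysis as the indexed value instead of looking up the raw string.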
Source: https://stackoverflow.com/questions/20993676/lucene-net-checking-if-document-exists-in-index