With Lucene, what would be the recommended approach for locating matches in search results?
More specifically, suppose index documents have a field "fullText" whic
TermFreqVector is what I used. Here is a working demo that prints both the term positions and the starting and ending term indexes:
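// Imports assume the Lucene 3.x package layout, which matches the API calls used below.
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Search {
    public static void main(String[] args) throws IOException, ParseException {
        Search s = new Search();
        s.doSearch(args[0], args[1]);
    }

    Search() {
    }

    public void doSearch(String db, String querystr) throws IOException, ParseException {
        // 1. Specify the analyzer for tokenizing text.
        //    Use the same analyzer that was used for indexing.
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        Directory index = FSDirectory.open(new File(db));

        // 2. query
        Query q = new QueryParser(Version.LUCENE_CURRENT, "contents", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        // Open one read-only reader and share it with the searcher,
        // instead of opening the index twice.
        IndexReader reader = IndexReader.open(index, true);
        IndexSearcher searcher = new IndexSearcher(reader);
        searcher.setDefaultFieldSortScoring(true, false);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display term positions, and term indexes
        System.out.println("Found " + hits.length + " hits.");
        for(int i=0;i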
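
To make the two kinds of numbers concrete, here is a minimal plain-Java sketch of what "term position" and "starting and ending index" mean for a match. I am assuming the indexes are character offsets into the field text. It does not use Lucene at all: `MatchLocator`, `Hit` and `find` are hypothetical names made up for this illustration, and the tokenizer is a naive split on letters and digits. A real Lucene analyzer tokenizes differently (stop words, for example), so its numbers will not necessarily match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; none of these names come from Lucene.
class MatchLocator {

    /** One occurrence of a term: token position plus character offsets. */
    static final class Hit {
        final int position;    // 0-based index of the token among all tokens
        final int startOffset; // index of the token's first character in the text
        final int endOffset;   // index one past the token's last character

        Hit(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        @Override
        public String toString() {
            return "position=" + position + " offsets=" + startOffset + "-" + endOffset;
        }
    }

    /** Splits text into runs of letters/digits and records each token equal to term, ignoring case. */
    static List<Hit> find(String text, String term) {
        List<Hit> hits = new ArrayList<>();
        String wanted = term.toLowerCase(Locale.ROOT);
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip separator characters between tokens.
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            String token = text.substring(start, i).toLowerCase(Locale.ROOT);
            if (token.equals(wanted)) {
                hits.add(new Hit(position, start, i));
            }
            position++;
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy fox.";
        for (Hit hit : find(text, "fox")) {
            // The offsets can be used directly to cut the match out of the text.
            System.out.println(hit + " -> " + text.substring(hit.startOffset, hit.endOffset));
        }
    }
}
```

The position is what you would use for phrase or proximity logic, while the offsets are what you need to highlight the match in the original text.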