I've been using the (Java) Highlighter for Lucene (in the Sandbox package) for some time. However, this isn't really very accurate when it comes to matching the correct te
You could look into using Solr. http://lucene.apache.org/solr
Solr is a general-purpose search application built on Lucene, and it supports highlighting. The highlighting in Solr may be usable as an API outside of Solr. If not, you could still look at how Solr does it for inspiration.
I've been reading up on the subject and came across SpanQuery, which returns the span of the matched term or terms in the field that matched.
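To illustrate why span information helps with highlighting, here is a minimal plain-Java sketch. It does not use Lucene's API at all: `SpanSketch`, `Span`, `findPhraseSpans` and `highlight` are hypothetical names made up for this example, and the tokenizer is a naive letters-and-digits splitter. The idea is only that once you know exactly where a multi-term match starts and ends, you can mark that region and nothing else.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: none of these names are Lucene classes.
final class SpanSketch {

    /** A region of the original text as character offsets, [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Splits text into runs of letters/digits, keeping each token's offsets. */
    private static List<Span> tokenize(String text) {
        List<Span> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            tokens.add(new Span(start, i));
        }
        return tokens;
    }

    /** Finds every place where the given terms occur as adjacent tokens, in order. */
    static List<Span> findPhraseSpans(String text, String... terms) {
        List<Span> tokens = tokenize(text);
        List<Span> matches = new ArrayList<>();
        if (terms.length == 0) {
            return matches;
        }
        for (int i = 0; i + terms.length <= tokens.size(); i++) {
            boolean ok = true;
            for (int j = 0; j < terms.length && ok; j++) {
                Span t = tokens.get(i + j);
                ok = text.substring(t.start, t.end).equalsIgnoreCase(terms[j]);
            }
            if (ok) {
                // The span runs from the first matched token to the last one.
                matches.add(new Span(tokens.get(i).start,
                                     tokens.get(i + terms.length - 1).end));
            }
        }
        return matches;
    }

    /** Wraps each span in <b>...</b>; spans overlapping an earlier one are skipped. */
    static String highlight(String text, List<Span> spans) {
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (Span s : spans) {
            if (s.start < last) {
                continue;
            }
            out.append(text, last, s.start)
               .append("<b>").append(text, s.start, s.end).append("</b>");
            last = s.end;
        }
        return out.append(text.substring(last)).toString();
    }
}
```

For example, calling `findPhraseSpans` on "Quick fox. The quick brown fox jumps over the quick dog." with the terms "quick" and "brown" yields a single span, and `highlight` marks only "quick brown", leaving the other occurrences of "quick" alone. A real implementation would take the positions from Lucene's own analysis rather than re-tokenizing the text like this.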
There is a new, faster highlighter (it needs to be patched in, but will be part of release 2.9):
https://issues.apache.org/jira/browse/LUCENE-1522
and a back-reference to this question