I’m working with Lucene 2.4.0 on JDK 1.6.0_07. I consistently get OutOfMemoryError: Java heap space when trying to index large text files.
You can set the IndexWriter to flush based on memory usage or on the number of buffered documents. I would suggest setting it to flush based on memory and seeing if this fixes your issue. My guess is that your entire index is living in memory because it never gets flushed to disk.
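To make the two flush triggers concrete, here is a small standalone sketch of the idea. It does not use Lucene at all: the class `FlushPolicySketch`, its fields, and the 2-bytes-per-char size estimate are all made up for illustration. In Lucene 2.4 the real knobs are, as far as I know, `IndexWriter.setRAMBufferSizeMB(double)` and `IndexWriter.setMaxBufferedDocs(int)`.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical model of a writer that buffers documents in memory and
 * flushes when a RAM threshold or a document-count threshold is reached.
 * This is NOT Lucene code; it only illustrates the flush policy.
 */
final class FlushPolicySketch {
    private final long ramLimitBytes;   // flush at this many buffered bytes; <= 0 disables
    private final int maxBufferedDocs;  // flush at this many buffered docs; <= 0 disables
    private final List<String> buffer = new ArrayList<String>();
    private long bufferedBytes = 0;
    private int flushCount = 0;

    FlushPolicySketch(long ramLimitBytes, int maxBufferedDocs) {
        this.ramLimitBytes = ramLimitBytes;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    void addDocument(String text) {
        buffer.add(text);
        // Rough size estimate (assumption): a Java char takes 2 bytes.
        bufferedBytes += 2L * text.length();

        boolean ramHit = ramLimitBytes > 0 && bufferedBytes >= ramLimitBytes;
        boolean docsHit = maxBufferedDocs > 0 && buffer.size() >= maxBufferedDocs;
        if (ramHit || docsHit) {
            flush();
        }
    }

    void flush() {
        // A real writer would write the buffered documents to disk here.
        buffer.clear();
        bufferedBytes = 0;
        flushCount++;
    }

    int getFlushCount() { return flushCount; }

    long getBufferedBytes() { return bufferedBytes; }
}
```

If neither trigger ever fires, the buffer just keeps growing until the heap runs out, which is the failure mode I am guessing at above. With a memory-based trigger, the heap used by buffered documents stays bounded no matter how many documents you add.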