How to determine the Lucene index version?

Submitted by ◇◆丶佛笑我妖孽 on 2021-02-18 14:58:17

Question


I am writing a shell script (csh) that has to determine the Lucene index version and then upgrade the index to the next version. For example, if the indices are on 2.x, I have to upgrade them to 3.x. Ultimately the indices need to be upgraded to 6.x.

Since upgrading indices is a sequential process (2.x -> 3.x -> 4.x -> 5.x -> 6.x), I have to know the index version beforehand so that I can set the classpath properly and run the upgrade.

Please help me with this.


Answer 1:


This is not a very clean solution, but it is all I was able to find via SegmentInfos.

LuceneVersion --> Which Lucene code Version was used for this commit, written as three vInt: major, minor, bugfix
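To make the "three vInt" wording concrete, here is a minimal sketch of how a variable-length integer is decoded: seven payload bits per byte, low-order group first, with the high bit marking that another byte follows. The class name `VIntVersionDecoder` and the `offset` parameter are hypothetical; where the version actually sits inside a segments file is not stated here, and in practice Lucene's own `DataInput.readVInt()` does this work.

```java
final class VIntVersionDecoder {

    /** Decodes one vInt starting at pos[0] and advances pos[0] past it. */
    static int readVInt(byte[] data, int[] pos) {
        int value = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos[0]++];
            value |= (b & 0x7F) << shift;   // low 7 bits carry the payload
            if ((b & 0x80) == 0) {          // high bit clear: last byte of this vInt
                return value;
            }
            shift += 7;
        }
    }

    /**
     * Reads major, minor and bugfix as three consecutive vInts.
     * The offset is an assumption supplied by the caller.
     */
    static String readVersion(byte[] data, int offset) {
        int[] pos = {offset};
        int major = readVInt(data, pos);
        int minor = readVInt(data, pos);
        int bugfix = readVInt(data, pos);
        return major + "." + minor + "." + bugfix;
    }

    public static void main(String[] args) {
        // Made-up bytes for illustration, not taken from a real index file.
        byte[] sample = {6, 5, 1};
        System.out.println(readVersion(sample, 0));
    }
}
```

Small version numbers fit in one byte each, so the three values usually occupy three consecutive bytes.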

When you create an IndexReader, you get one of the concrete reader classes, such as StandardDirectoryReader. That class has the toString() method shown below, which prints the Lucene version of each segment, so you can simply call toString() on the IndexReader instance.

@Override
public String toString() {
  final StringBuilder buffer = new StringBuilder();
  buffer.append(getClass().getSimpleName());
  buffer.append('(');
  final String segmentsFile = segmentInfos.getSegmentsFileName();
  if (segmentsFile != null) {
    buffer.append(segmentsFile).append(":").append(segmentInfos.getVersion());
  }
  if (writer != null) {
    buffer.append(":nrt");
  }
  for (final LeafReader r : getSequentialSubReaders()) {
    buffer.append(' ');
    buffer.append(r);
  }
  buffer.append(')');
  return buffer.toString();
}

I guess a single version for the whole index does not make sense, since an index may also contain documents committed by writers from earlier versions.

Documents committed by older Lucene writers can be searched with the latest readers, provided the version gap is within the range Lucene supports.

You could write some simple logic in core Java that uses a regex to extract the highest Lucene version and treats it as the version of your index.
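A minimal sketch of that regex logic, assuming the text returned by toString() contains entries of the form `lucene.version=X.Y.Z` (the same form the regex in Answer 2 relies on). The class name `HighestLuceneVersion` and the sample string are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HighestLuceneVersion {

    // Matches entries such as "lucene.version=6.5.1"; the bugfix part is optional.
    private static final Pattern VERSION =
            Pattern.compile("lucene\\.version=(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    /** Returns the highest version found as "major.minor.bugfix", or null if none is present. */
    static String highest(String readerDescription) {
        int[] best = null;
        Matcher m = VERSION.matcher(readerDescription);
        while (m.find()) {
            int[] v = {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                m.group(3) == null ? 0 : Integer.parseInt(m.group(3))
            };
            if (best == null || compare(v, best) > 0) {
                best = v;
            }
        }
        return best == null ? null : best[0] + "." + best[1] + "." + best[2];
    }

    // Compares numerically, component by component, so 6.10.0 ranks above 6.9.0.
    private static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Hypothetical input shaped like reader.toString() output, not real output.
        String sample = "... lucene.version=5.5.4, ... lucene.version=6.5.1, ...";
        System.out.println(highest(sample));
    }
}
```

If the goal is to decide whether an upgrade is still needed, the lowest segment version may be the more relevant value; flipping the comparison in `highest` would return that instead.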




Answer 2:


This is a piece of code I wrote to print the index version.

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexFormatTooNewException;
import org.apache.lucene.index.IndexFormatTooOldException;
import org.apache.lucene.index.StandardDirectoryReader;
import org.apache.lucene.store.SimpleFSDirectory;
import org.junit.Test;

public class TestReindex {

    @Test
    public void testVersion() throws IOException {
        Path path = Paths.get("<Path_to_index_files>");

        try (DirectoryReader reader = StandardDirectoryReader.open(new SimpleFSDirectory(path))){
            // Escape the dot so it matches literally; find() below reports only the first match.
            Pattern pattern = Pattern.compile("lucene\\.version=(.*?),");

            Matcher matcher = pattern.matcher(reader.toString());
            if (matcher.find()) {
                System.out.println("Current version: " + matcher.group(1));
            }
        } catch(IndexFormatTooOldException ex) {
            System.out.println("Current version: " + ex.getVersion());
            System.out.println("Min Version: " + ex.getMinVersion());
            System.out.println("Max Version: " + ex.getMaxVersion());
        } catch (IndexFormatTooNewException ex) {
            System.out.println("Current version: " + ex.getVersion());
            System.out.println("Min Version: " + ex.getMinVersion());
            System.out.println("Max Version: " + ex.getMaxVersion());
        }
    }
}

If you try to read an index that is too new or too old for the version of Lucene in use, an exception is thrown. These exceptions carry version information that you can use accordingly.
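Once a version string is known, the sequential upgrade described in the question (2.x -> 3.x -> ... -> 6.x) can be planned one major version at a time. The following is only a sketch under that assumption; the class `UpgradePlan` and its methods are hypothetical helpers and not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

final class UpgradePlan {

    /** Extracts the major version from a string such as "4.10.3". */
    static int majorOf(String version) {
        return Integer.parseInt(version.trim().split("\\.")[0]);
    }

    /** Lists the major versions to step through, e.g. from 2 to 6 gives [3, 4, 5, 6]. */
    static List<Integer> steps(int currentMajor, int targetMajor) {
        if (currentMajor > targetMajor) {
            throw new IllegalArgumentException(
                    "index version " + currentMajor + " is newer than target " + targetMajor);
        }
        List<Integer> steps = new ArrayList<>();
        for (int v = currentMajor + 1; v <= targetMajor; v++) {
            steps.add(v);
        }
        return steps;
    }

    public static void main(String[] args) {
        // Usage sketch: java UpgradePlan <detected-version> <target-major>
        String detected = args.length > 0 ? args[0] : "2.9.4";
        int target = args.length > 1 ? Integer.parseInt(args[1]) : 6;
        // One major version per line, so a shell script can loop over the output.
        for (int step : steps(majorOf(detected), target)) {
            System.out.println(step);
        }
    }
}
```

A shell script could read each printed line, set the classpath to the matching Lucene release, and run that release's upgrade step before moving on.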



Source: https://stackoverflow.com/questions/44155910/how-to-determine-the-lucene-index-version
