PHP Lucene - Indexation - Fails in Linux after 2.000.000 system blocks

Submitted by 别来无恙 on 2019-12-24 00:18:34

Question


I am building a search index with Zend_Search_Lucene from the latest version of Zend Framework. The search interface and everything else work fine. The problem is the re-indexing, that is, the creation of the index. I have already sanitized the data and double-checked its quality.

The process always stops at around record 15.000, when the index directory reaches the limit of 2.000.000 system blocks. So I decided to build the index with a separate application written in Java against Lucene 3.0.3. Opening the resulting index from PHP then fails with:

Fatal error: Uncaught exception 'Zend_Search_Lucene_Exception' with message 'Unsupported segments file format' in 


It seems the latest index format supported by Zend Lucene is 2.3.
Any ideas on how to solve this problem? I would really appreciate your input.


Answer 1:


I have no experience with this, but the Zend Lucene documentation states that the currently supported Lucene index version is 2.3. An index written by version 3.0.3 may simply not be supported.

[1] The currently supported Lucene index file format version is 2.3 (starting from Zend Framework 1.6).

See: http://framework.zend.com/manual/en/zend.search.lucene.java-lucene.html
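
One way to check which index format a given writer actually produced is to look at the start of the segments file. As far as I know, the segments_N file of Lucene indexes from this era begins with a 32-bit format marker stored high byte first, so a few lines of plain Java can print it. This is only a rough sketch under that assumption: the class name SegmentsFormatProbe and the method readFormatMarker are made up for illustration, and no Lucene jar is needed.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene or Zend Framework.
public class SegmentsFormatProbe {

    // Reads the first four bytes of the file as a big-endian signed int.
    // Assumption: for a Lucene segments_N file this is the format marker.
    static int readFormatMarker(Path segmentsFile) throws IOException {
        try (InputStream in = Files.newInputStream(segmentsFile);
             DataInputStream data = new DataInputStream(in)) {
            return data.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java SegmentsFormatProbe <path to segments_N file>");
            return;
        }
        System.out.println("format marker: " + readFormatMarker(Paths.get(args[0])));
    }
}
```

Running it against the segments file of an index written with the 2.3 jar and against one written with the 3.0.3 jar should show two different values. If the index that PHP rejects carries the 3.0.3 value, the format mismatch is the likely cause of the exception.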




Answer 2:


I customized the example from http://www.techcrony.info/?p=33, which reads text files from a data directory. My customized functions read the records from a MySQL database instead:

public static void main(String[] args) throws Exception
{
    ....
    System.out.print("Index dir arg_0 : " + indexDir + "\r");
    String id = "%";

    long start = new Date().getTime();
    int numIndexed = index_main(indexDir, id);
    long end = new Date().getTime();

    System.out.print("End Program... \r");
}

private static int index_main(File indexDir, String id) throws IOException {

    int numIndexed = 0;
    try {
        // 'true' creates a new index, replacing any existing one in indexDir
        IndexWriter writer =
            new IndexWriter(indexDir, new StandardAnalyzer(), true);
        // write separate per-segment files instead of one compound file
        writer.setUseCompoundFile(false);

        java.sql.Connection conn = linktodata();
        int rowCount = 0;
        ...

As you can see, I compiled it against lucene-core-2.3.0.jar:

javac -cp .:lucene-core-2.3.0.jar:mysql-connector-java-5.1.16-bin.jar Indexer.java

Run:

java -cp .:lucene-core-2.3.0.jar:mysql-connector-java-5.1.16-bin.jar Indexer /home/public_html/index_main

Now the most important question: is anyone aware whether PHP Lucene can handle more than 1.000.000 documents?



Source: https://stackoverflow.com/questions/6295570/php-lucene-indexation-fails-in-linux-after-2-000-000-system-blocks
