How to multi-thread an ANTLR parser in Java

Posted by 本秂侑毒 on 2019-12-13 04:17:19

Question


My program is proving slow at reading a file and then parsing it with an ANTLR grammar. To improve performance, I would like to multi-thread the parsing.

Read File:

    LogParser pa = new LogParser();
    LogData logrow;
    String inputLine;
    int a = 0;
    // Feed the file to the parser line by line; try-with-resources closes the reader
    try (BufferedReader reader = new BufferedReader(
            new FileReader(jFileChooser1.getSelectedFile()))) {
        while ((inputLine = reader.readLine()) != null) {
            try {
                a++;
                jProgressBar.setValue(a);
                pa.parse(inputLine);  // decode the line
            } catch (Exception e) {
                // ... catches errors and sends them to the logger
            } finally {
                logrow = new LogData(pa, a);
                mLogTable.addRow(logrow);  // store the decoded line
            }
        }
    } catch (IOException e) {
        // ... catches errors and sends them to the logger
    }

The code parses the line with pa.parse(inputLine);, which feeds the input line into an ANTLRStringStream and then a CharStream before it is parsed. Next, logrow=new LogData(pa,a); gets the decoded values, which will be stored in my table.

Profiling shows that the hotspots are in the lexer and parser classes that are generated from my grammar (i.e. LogGrammarLexer and LogGrammarParser). Hope it's clear enough...

Parsing:

    LogGrammarLexer lexer = new LogGrammarLexer(inStream);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    decoded = new LogGrammarParser(tokens);
    try {
        failurePosition = "";
        decoded.logLine();
    } catch (RecognitionException e) {
        failurePosition = Integer.toString(e.charPositionInLine);
    } catch (Exception e) {
        failurePosition = "-3";
        throw e;
    }
    // Return after the try block: a return inside finally would silently discard the rethrown exception
    return decoded; //TODO: see if return is necessary as decoded is now a field in class

I have also been reading up and now know that multi-threading the file I/O is useless. In any case, what I need to improve is the performance inside the classes generated from my grammar, which do the complex decoding/parsing of each line in my file.

So my question is how to make the parsing multi-threaded.

Cheers


Answer 1:


If each file line is a separate parse-able entity, then you can have multiple threads as follows:

  • One thread (most likely the main thread) reads the lines from the file and puts each line on a work queue
  • A pool of worker threads, each of which pulls a file line off that work queue and parses it
  • The queue of file lines is shared between the threads, so access to it must be synchronized (i.e. the queue must be thread-safe)
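The three points above can be sketched roughly as follows. This is only an illustration under stated assumptions: `LineQueueSketch`, `parseWithWorkers` and `parseLine` are hypothetical names, and `parseLine` is a stand-in for the ANTLR-based decoding of one line. The sketch also assumes that each worker would build its own lexer and parser for every line rather than share one instance between threads (the `LogParser` in the question keeps `decoded` as a field, so a single shared instance would presumably not be safe).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

class LineQueueSketch {
    // Sentinel ("poison pill") telling a worker that no more lines will arrive.
    // A distinct String object, compared by identity, so no real line can match it.
    private static final String END = new String("<end-of-input>");

    // Hypothetical stand-in for decoding one line with the generated lexer/parser.
    static String parseLine(String line) {
        return line.trim().toUpperCase();
    }

    static List<String> parseWithWorkers(Iterable<String> lines, int workerCount)
            throws InterruptedException {
        // Thread-safe, bounded queue: the producer blocks when the workers fall behind.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1024);
        // Thread-safe collection for the decoded results (arrival order, not file order).
        Queue<String> results = new ConcurrentLinkedQueue<>();

        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String line = queue.take();
                        if (line == END) {
                            break; // identity comparison on purpose
                        }
                        results.add(parseLine(line));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        // Producer: the calling thread feeds the lines into the queue.
        for (String line : lines) {
            queue.put(line);
        }
        // One sentinel per worker so that every worker terminates.
        for (int i = 0; i < workerCount; i++) {
            queue.put(END);
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return new ArrayList<>(results);
    }
}
```

In the question's program the producer loop would be the existing `reader.readLine()` loop, and a reasonable starting point for `workerCount` is `Runtime.getRuntime().availableProcessors()`. Note that a Swing component such as `jProgressBar` should only be updated from the event dispatch thread, not directly from the workers.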

This approach will only improve performance when run on a multi-core CPU.

Additionally, this will only work if each file line is a separate parse-able entity, as mentioned before. If a parse-able entity spans multiple lines, or is the entire file, then threading will not help. Also, if the order of the lines in the file is important, multi-threading may cause issues, since the lines may be parsed out of order.
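If the order does matter, one possible way around it is to let the lines be parsed in any order but collect the results in submission order. The sketch below does this with an `ExecutorService` and a list of `Future`s; `OrderedParseSketch`, `parseInOrder` and `parseLine` are hypothetical names, and `parseLine` again stands in for the real ANTLR decoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderedParseSketch {
    // Hypothetical stand-in for decoding one line with the generated lexer/parser.
    static String parseLine(String line) {
        return "decoded:" + line;
    }

    static List<String> parseInOrder(List<String> lines, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Futures are stored in the same order as the lines were read.
            List<Future<String>> futures = new ArrayList<>();
            for (String line : lines) {
                Callable<String> task = () -> parseLine(line);
                futures.add(pool.submit(task));
            }
            // get() blocks until that particular line is done, so the output
            // keeps the file order even if the tasks finish out of order.
            List<String> decoded = new ArrayList<>();
            for (Future<String> future : futures) {
                decoded.add(future.get());
            }
            return decoded;
        } finally {
            pool.shutdown();
        }
    }
}
```

This version keeps one `Future` per line in memory until it is collected, so for a very large file it may be worth submitting and draining the lines in batches.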

This is a standard producer/consumer problem. Here are some useful links:

  • Java Thread Pools
  • Thread pools and work queues
  • ThreadpoolExecutor programming examples



Answer 2:


Looks like you could simply split the input file into several smaller files and have them imported in parallel threads.
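A variation on this idea that avoids writing intermediate files is to read the lines once and hand contiguous chunks of them to parallel tasks. The sketch below is one way this might look; `ChunkedParseSketch`, `parseInChunks` and `parseLine` are hypothetical names, `parseLine` stands in for the real ANTLR decoding, and the whole file is assumed to fit in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedParseSketch {
    // Hypothetical stand-in for decoding one line with the generated lexer/parser.
    static String parseLine(String line) {
        return line.toUpperCase();
    }

    static List<String> parseInChunks(List<String> lines, int chunks)
            throws InterruptedException, ExecutionException {
        if (chunks < 1) {
            throw new IllegalArgumentException("chunks must be at least 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            // Chunk size rounded up so that every line lands in some chunk.
            int size = (lines.size() + chunks - 1) / chunks;
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (int start = 0; start < lines.size(); start += size) {
                List<String> part =
                        lines.subList(start, Math.min(start + size, lines.size()));
                // Each task decodes its own contiguous slice of the input.
                tasks.add(() -> {
                    List<String> out = new ArrayList<>(part.size());
                    for (String line : part) {
                        out.add(parseLine(line));
                    }
                    return out;
                });
            }
            // invokeAll returns the futures in task order, so concatenating
            // the chunk results restores the original line order.
            List<String> all = new ArrayList<>(lines.size());
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                all.addAll(future.get());
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }
}
```

Compared with one task per line, chunking reduces the scheduling overhead per line, at the cost of less even load balancing when some lines take much longer to parse than others.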



Source: https://stackoverflow.com/questions/11017584/how-to-multi-thread-an-antlr-parser-in-java
