Using ANTLR to parse a log file

Submitted by 早过忘川 on 2019-11-30 10:17:12

When you're only interested in part of the file you're parsing, you don't need a full parser or a grammar that covers the entire file format. A lexer grammar with ANTLR's options{filter=true;} is enough: the lexer emits only the tokens you define in the grammar and skips everything else in the file.

Here's a quick demo:

lexer grammar TestLexer;

options{filter=true;}

@lexer::members {
  public static void main(String[] args) throws Exception {
    String text = 
        "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}\n"+
        "\n"+
        "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=[\"Rocket\"]){}\n"+
        "\n"+
        "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function3(selected=[\"blue\",\"yellow\"]){}\n"+
        "\n"+
        "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function4(selected=[\"Speech\"]){}";
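    // Wrap the log text in a character stream and run the filtering lexer over it.
    ANTLRStringStream in = new ANTLRStringStream(text);
    TestLexer lexer = new TestLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    // getTokens() returns an untyped List in ANTLR 3.2, hence the cast.
    for (Object obj : tokens.getTokens()) {
      Token token = (Token) obj;
      System.out.println("> token.getText() = " + token.getText());
    }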
  }
}

Input
  :  'Evaluation.Input.Function' '0'..'9'+ Params   
  ;

Output
  :  'Evaluation.Output.Function' '0'..'9'+ Params
  ;

fragment
Params
  :  '(selected=[' String ( ',' String )* '])'
  ;

fragment
String
  :  '"' ( ~'"' )* '"'
  ;

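Save the grammar as TestLexer.g, then generate, compile, and run the lexer (on Windows, use ; instead of : as the classpath separator in the last command):

java -cp antlr-3.2.jar org.antlr.Tool TestLexer.g
javac -cp antlr-3.2.jar TestLexer.java
java -cp .:antlr-3.2.jar TestLexer

The following is printed to the console: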

> token.getText() = Evaluation.Input.Function1(selected=["red","yellow"])
> token.getText() = Evaluation.Output.Function2(selected=["Rocket"])
> token.getText() = Evaluation.Input.Function3(selected=["blue","yellow"])
> token.getText() = Evaluation.Output.Function4(selected=["Speech"])
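
For comparison, the same "keep what matches, skip the rest" idea can be roughly approximated without ANTLR. The sketch below uses java.util.regex and is not what ANTLR generates; the class name FilterSketch and the method extract are made up for illustration, and the patterns are a hand translation of the Input, Output, Params and String rules above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a regex approximation of the filtering lexer, not ANTLR output.
class FilterSketch {
    // Mirrors the String fragment: a quote, any run of non-quote characters, a quote.
    private static final String STRING = "\"[^\"]*\"";
    // Mirrors the Params fragment: '(selected=[' String ( ',' String )* '])'
    private static final String PARAMS =
        "\\(selected=\\[" + STRING + "(?:," + STRING + ")*\\]\\)";
    // Mirrors the Input and Output rules, folded into one alternation.
    private static final Pattern CALL = Pattern.compile(
        "Evaluation\\.(?:Input|Output)\\.Function[0-9]+" + PARAMS);

    // Returns every matching substring in order; all other text is ignored.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CALL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text =
            "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}\n" +
            "\n" +
            "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=[\"Rocket\"]){}";
        for (String s : extract(text)) {
            System.out.println("> match = " + s);
        }
    }
}
```

A regex like this is fine for a flat pattern, but the lexer grammar is likely the better fit once the tokens need to feed a parser or the patterns start to nest.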