Why is my ANTLR lexer Java class “code too large”?

梦如初夏 2020-12-19 09:17

This is the lexer in ANTLR (sorry for the long file):

lexer grammar SqlServerDialectLexer;
/* T-SQL words */
AND: 'AND';
BIGINT: 'BIGINT';
BIT: 'BIT';
CA         


        
3 answers
  •  情歌与酒
    2020-12-19 09:43

    Divide your grammar into several composite grammars, but be careful about which rules go where. For example, don't place the NAME rule in your top-level grammar and the keywords in an imported grammar: NAME would then take precedence over the keywords and prevent them from being matched.
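
    To make that pitfall concrete, here is a minimal sketch of the arrangement to avoid. The grammar names Keywords and Top are hypothetical, chosen only for illustration, and the two files are shown in one listing for brevity:

    ```antlr
    // Keywords.g (hypothetical file name): the imported grammar holds a keyword.
    lexer grammar Keywords;

    SELECT: 'SELECT';

    // Top.g (hypothetical file name): the importing grammar holds the catch-all rule.
    lexer grammar Top;

    import Keywords;

    // Problem: as described above, NAME in the top-level grammar takes
    // precedence over the imported SELECT rule, so the input SELECT is
    // expected to come out as a NAME token rather than a SELECT token.
    NAME: ('a'..'z' | 'A'..'Z')+;
    ```

    The arrangement below avoids this by keeping NAME in the imported grammar, after the keywords defined there, and placing only keywords in the top-level grammar.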

    This works:

    A.g

    lexer grammar A;
    
    SELECT: 'SELECT';
    SET: 'SET';
    SMALLINT: 'SMALLINT';
    TABLE: 'TABLE';
    THEN: 'THEN';
    TINYINT: 'TINYINT';
    UPDATE: 'UPDATE';
    USE: 'USE';
    VALUES: 'VALUES';
    VARCHAR: 'VARCHAR';
    WHEN: 'WHEN';
    WHERE: 'WHERE';
    
    QUOTED: '\'' ('\'\'' | ~'\'')* '\'';
    
    EQUALS: '=';
    NOT_EQUALS: '!=';
    SEMICOLON: ';';
    COMMA: ',';
    OPEN: '(';
    CLOSE: ')';
    VARIABLE: '@' NAME;
    NAME:
        ( LETTER | '#' | '_' ) ( LETTER | NUMBER | '#' | '_' | '.' )*
        ;
    NUMBER: DIGIT+;
    
    fragment LETTER: 'a'..'z' | 'A'..'Z';
    fragment DIGIT: '0'..'9';
    SPACE
        :
        ( ' ' | '\t' | '\n' | '\r' )+
        { skip(); }
        ;
    

    SqlServerDialectLexer.g

    lexer grammar SqlServerDialectLexer;
    
    import A;
    
    AND: 'AND';
    BIGINT: 'BIGINT';
    BIT: 'BIT';
    CASE: 'CASE';
    CHAR: 'CHAR';
    COUNT: 'COUNT';
    CREATE: 'CREATE';
    CURRENT_TIMESTAMP: 'CURRENT_TIMESTAMP';
    DATETIME: 'DATETIME';
    DECLARE: 'DECLARE';
    ELSE: 'ELSE';
    END: 'END';
    FLOAT: 'FLOAT';
    FROM: 'FROM';
    GO: 'GO';
    IMAGE: 'IMAGE';
    INNER: 'INNER';
    INSERT: 'INSERT';
    INT: 'INT';
    INTO: 'INTO';
    IS: 'IS';
    JOIN: 'JOIN';
    NOT: 'NOT';
    NULL: 'NULL';
    NUMERIC: 'NUMERIC';
    NVARCHAR: 'NVARCHAR';
    ON: 'ON';
    OR: 'OR';
    

    And it compiles fine:

    java -cp antlr-3.3.jar org.antlr.Tool SqlServerDialectLexer.g 
    javac -cp antlr-3.3.jar *.java
    

    As you can see, invoking the org.antlr.Tool on your "top-lexer" is enough: ANTLR automatically generates classes for the imported grammar(s). If you have more grammars to import, do it like this:

    import A, B, C;
    

    EDIT

    Gunther is correct: changing the QUOTED rule is enough. I'll leave my answer up, though, because once you add more keywords or a fair number of parser rules (inevitable with SQL grammars), you will most probably run into the "code too large" error again. In that case, you can use the solution I proposed.

    If you're going to accept an answer, please accept Gunther's.
