Parsing a templating language
I'm trying to parse a templating language, and I'm having trouble correctly parsing the arbitrary HTML that can appear between tags. What I have so far is below. Any suggestions?

An example of a valid input would be:

    {foo}{#bar}blah blah blah{zed}{/bar}{>foo2}{#bar2}This Should Be Parsed as a Buffer.{/bar2}

And the grammar is:

    grammar g;

    options {
      language=Java;
      output=AST;
      ASTLabelType=CommonTree;
    }

    /* LEXER RULES */
    tokens {
    }

    LD : '{';
    RD : '}';
    LOOP : '#';
    END_LOOP: '/';
    PARTIAL : '>';
    fragment DIGIT : '0'..'9';
    fragment LETTER : ('a'..'z' | 'A'..'Z');
    IDENT : (LETTER | '_') (LETTER | '_' |