parser-generator | 易学教程

ANTLR parser hanging at proxy.handshake call

I am attempting to get a basic ECMAScript parser working, and found a complete ANTLR grammar for ECMAScript 3, which appears to compile fine and produces the appropriate Lexer/Parser/Walker Java files (running inside the ANTLR IDE plugin for Eclipse 3.5). However, when I actually try to use it with some simple test code (following the guide on the ANTLR wiki), it just hangs when trying to create the parser:

    CharStream MyChars = new ANTLRFileStream(FileName); // FileName is valid
    ES3Lexer MyLexer = new ES3Lexer(MyChars);
    CommonTokenStream MyTokens = new CommonTokenStream(MyLexer);
    MyTokens.setTokenSource

Multiple flex/bison parsers

What is the best way to handle multiple Flex/Bison parsers inside a project? I wrote a parser and now I need a second one in the same project. So far, I have put the main(..) method in the third section of parser1.y and called yyparse from there. What I want is two different parsers (parser1.y and parser2.y) that I can use from an external function (let's assume main in main.cpp). What precautions should I take to export the yyparse functions outside the .y files, and how should I handle two parsers? PS. I'm using g++ to compile but not the C++ versions of Flex and Bison and

Source of parsers for programming languages?

I'm dusting off an old project of mine which calculates a number of simple metrics about large software projects. One of the metrics is the length of files/classes/methods. Currently my code "guesses" where class/method boundaries are based on a very crude algorithm (traverse the file, maintaining a "current depth" and adjusting it whenever you encounter unquoted brackets; when you return to the level a class or method began on, consider it exited). However, there are many problems with this procedure, and a "simple" way of detecting when your depth has changed is not always effective. To make
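
The depth-tracking heuristic described above can be sketched roughly as follows. This is only an illustration of the idea, not the asker's actual code: the class name `BraceDepthScanner` and the `scan` method are hypothetical, and the sketch assumes Java-like source where blocks are delimited by `{` and `}`. It skips braces inside string and character literals but deliberately ignores comments, which is one of the ways such a "simple" approach goes wrong.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper illustrating the "current depth" heuristic from the question.
class BraceDepthScanner {

    // Returns one entry per closed block: {nesting depth of the block, start line, end line}.
    // Depth 0 is a top-level block (e.g. a class body), depth 1 a block nested inside it.
    static List<int[]> scan(String source) {
        List<int[]> blocks = new ArrayList<>();
        Deque<Integer> openLines = new ArrayDeque<>(); // line of each still-open '{'
        int line = 1;
        char quote = 0; // 0 outside a literal, otherwise the quote character that opened it
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '\n') {
                line++;
            }
            if (quote != 0) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == quote) {
                    quote = 0; // literal closed
                }
                continue;
            }
            if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '{') {
                openLines.push(line);
            } else if (c == '}' && !openLines.isEmpty()) {
                int startLine = openLines.pop();
                blocks.add(new int[] {openLines.size(), startLine, line});
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // The '}' inside the string literal must not close the method body.
        String sample = "class A {\n  void f() {\n    String s = \"}\";\n  }\n}\n";
        for (int[] block : scan(sample)) {
            System.out.println(Arrays.toString(block));
        }
    }
}
```

For the sample input, the inner block is reported as depth 1 spanning lines 2 to 4 and the outer block as depth 0 spanning lines 1 to 5. A brace or quote inside a comment would already throw the count off, which is why a real parser is the more robust option.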

Performance of parsers: PEG vs LALR(1) or LL(k)

Question: I've seen some claims that optimized PEG parsers in general cannot be faster than optimized LALR(1) or LL(k) parsers. (Of course, parsing performance depends on the particular grammar.) I'd like to know if there are any specific limitations of PEG parsers, either valid in general or for some subsets of PEG grammars, that would make them inferior to LALR(1) or LL(k) performance-wise. In particular, I'm interested in parser generators, but assume that their output can be tweaked for

Are there any LL Parser Generators for Functional Languages such as Haskell or Scala?

Question: I've noticed a distinct lack of LL parser generators that create parsers in functional languages. The ideal find, which I've been looking for without success, is something to generate a Haskell parser for an ANTLR-style LL(*) grammar (modulo minor reformatting of the grammar), and I was surprised that every last parser generator with a functional language target I found was some kind of LR parser. I want to transition the parser of this language I'm working on, which has functional features, from ANTLR to

ANTLR4: Whitespace handling

I have seen many ANTLR grammars that handle whitespace like this:

    WS: [ \n\t\r]+ -> skip;
    // or
    WS: [ \n\t\r]+ -> channel(HIDDEN);

So whitespace is either thrown away or sent to the hidden channel. With a grammar like this:

    grammar Not;
    start: expression;
    expression: NOT expression | (TRUE | FALSE);
    NOT: 'not';
    TRUE: 'true';
    FALSE: 'false';
    WS: [ \n\t\r]+ -> skip;

valid inputs are 'not true' or 'not false' but also 'nottrue', which is not the desired result. Changing the grammar to:

    grammar Not;
    start: expression;
    expression: NOT WS+ expression | (TRUE | FALSE);
    NOT: 'not';
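
To see why 'nottrue' is accepted by the first grammar, it may help to mimic the lexer's behaviour in plain Java. The sketch below is not ANTLR's implementation; `LongestMatchLexer` is a hypothetical class that only imitates the two rules relevant here: the longest match wins, and on a tie the rule listed first wins. The `ID` rule added in the second lexer is an assumption for illustration and does not appear in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy lexer: longest match wins, ties go to the rule declared first.
class LongestMatchLexer {
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    LongestMatchLexer rule(String name, String regex) {
        rules.put(name, Pattern.compile(regex));
        return this;
    }

    // Returns the token names in order; tokens of the rule named "WS" are skipped.
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input).region(pos, input.length());
                // Strict '>' keeps the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at offset " + pos);
            }
            if (!bestName.equals("WS")) {
                tokens.add(bestName);
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        LongestMatchLexer original = new LongestMatchLexer()
                .rule("NOT", "not").rule("TRUE", "true").rule("FALSE", "false")
                .rule("WS", "[ \\n\\t\\r]+");
        // Assumption: an extra catch-all word rule, declared after the keywords.
        LongestMatchLexer withId = new LongestMatchLexer()
                .rule("NOT", "not").rule("TRUE", "true").rule("FALSE", "false")
                .rule("ID", "[a-z]+").rule("WS", "[ \\n\\t\\r]+");
        System.out.println(original.tokenize("nottrue"));
        System.out.println(withId.tokenize("nottrue"));
        System.out.println(withId.tokenize("not true"));
    }
}
```

With only the keyword rules, nothing matches more than 'not' at the start of 'nottrue', so the input splits into NOT followed by TRUE. Once a longer-matching word rule exists, the whole word becomes a single token that the parser rule `expression` would not accept, while 'not true' still lexes as NOT followed by TRUE.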

How can I parse code to build a compiler in Java?

I need to write a compiler. It's homework at university. The teacher told us that we can use any API we want to do the parsing of the code, as long as it is a good one. That way we can focus more on the JVM we will generate. So yes, I'll write a compiler in Java to generate Java. Do you know any good API for this? Should I use regex? I normally write my own parsers by hand, though it is not advisable in this scenario. Any help would be appreciated.

Regex is good to use in a compiler, but only for recognizing tokens (i.e. no recursive structures). The classic way of writing a compiler is having a
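
As a rough illustration of using regex only for token recognition, here is a minimal sketch built on `java.util.regex`. The class name `RegexTokenizer` and the three token kinds (NUM, ID, OP) are assumptions chosen for the example, not part of any particular compiler API. Nested structure such as parenthesised expressions would still have to be handled by a separate parser on top of this token list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tokenizer: regex recognizes flat tokens only, never nested structure.
class RegexTokenizer {
    // One alternative per token kind; the named group tells us which one matched.
    private static final Pattern TOKEN = Pattern.compile(
            "\\s*(?:(?<NUM>\\d+)|(?<ID>[A-Za-z_]\\w*)|(?<OP>[-+*/()=;]))");

    static List<String> tokenize(String input) {
        String text = input.trim(); // so that trailing whitespace is not an error
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at offset " + pos);
            }
            if (m.group("NUM") != null) {
                tokens.add("NUM:" + m.group("NUM"));
            } else if (m.group("ID") != null) {
                tokens.add("ID:" + m.group("ID"));
            } else {
                tokens.add("OP:" + m.group("OP"));
            }
            pos = m.end(); // every match consumes at least one token character
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x = 42 + y1;"));
    }
}
```

The tokenizer turns `x = 42 + y1;` into a flat list of six tokens. Deciding how those tokens nest, for example which operand belongs to which operator, is the job of the parser, and that is the part regular expressions cannot express.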

How can we get the Syntax Tree of TypeScript?

Question: Is there a process for getting the syntax tree from a compiler? We have been assigned a project that needs to access TypeScript's syntax tree (TypeScript is open source, so we can see the whole compiler's code), but we don't know how to get it. I've been reading some articles on the Internet, but I can't really find one that is user-friendly or written in layman's terms. I believe some mentioned that the first step is to find the parsing step. But after that we had no idea what to

What is the advantage of using a parser generator like happy as opposed to using parser combinators?

Question: To learn how to write and parse a context-free grammar I want to choose a tool. For Haskell, there are two big options: Happy, which generates a parser from a grammar description, and Parsec, which allows you to directly code a parser in Haskell. What are the (dis)advantages of either approach?

Answer 1: External vs internal DSL. The parser specification format for Happy is an external DSL, whereas with Parsec you have the full power of Haskell available when defining your parsers. This means that