grammar | 易学教程

Priority in grammar using Lark

Question: I have a priority problem in my grammar, and I have run out of ideas to fix it. I'm using Lark. Here is the problem (simplified as much as I can):

    from lark import Lark

    parser = Lark(r"""
        start: set | set_mul
        set_mul: [nb] set
        set: [nb] "foo"
        nb: INT "x"
        %import common.INT
        %import common.WS
        %ignore WS
    """, start='start')

    input = "3xfoo"
    p = parser.parse(input)
    print(p.pretty())

The output is:

    start
      set_mul
        set
          nb 3

But what I want is:

    start
      set_mul
        nb 3
        set

I tried to put

Difference in capturing and non-capturing regex scope in Perl 6 / Raku

Question: Although the docs state that calling a token/rule/regex as <.foo> instead of <foo> makes it non-capturing, there also seems to be a difference in scope, and I'm not sure whether it's intended. Here is a simplified test. In a module file:

    unit module Foo;
    my token y { y }
    my token a is export { x <y> }
    my token b is export { x <.y> }

Inside another script file:

    grammar A {
        use Foo;
        token TOP { <a> }
    }
    grammar B {
        use Foo;
        token TOP { <b> }
    }

If we call A.parse("xy"), everything runs as expected.

Antlr: Simplest way to recognize dates and numbers?

Question: What is the simplest (shortest, fewest rules, and no warnings) way to parse both valid dates and numbers in the same grammar? My problem is that a lexer rule to match a valid month (1-12) will match any occurrence of 1-12. So if I just want to match a number, I need a parse rule like: number: (MONTH|INT); It only gets more complex when I add lexer rules for day and year. I want a parse rule for date like this: date: month '/' day ( '/' year )? -> ^('DATE' year month day); I don't care if

How can I pass arguments to a Perl 6 grammar?

Question: In Edit distance: Ignore start/end, I offered a Perl 6 solution to a fuzzy fuzzy matching problem. I had a grammar like this (although maybe I've improved it after Edit #3):

    grammar NString {
        regex n-chars { [<.ignore>* \w]**4 }
        regex ignore { \s }
    }

The literal 4 itself was the length of the target string in the example. But the next problem might be some other length. So how can I tell the grammar how long I want that match to be?

Answer 1: Although the docs don't show an example or using the

What's the go-to methodology of building out a parser in .NET

Question: A grammar as well. If one were to approach a generic parser from the ground up, how would one go about it? I've looked at ANTLR and Irony, but they are more tools than methodologies. What are the steps one should tackle and the milestones for accomplishment?

Answer 1: Large topic, my friend. If you want to learn about the theory, the best place to go is 'the Dragon Book': http://www.amazon.com/Compilers-Principles-Techniques-Tools-Gradiance/dp/0321547985/ref=sr_1_2?s=books&ie=UTF8&qid=1297801900&sr=1

Real-world LR(k > 1) grammars?

Question: Making artificial LR(k) grammars for k > 1 is easy:

    Input: A1 B x
    Input: A2 B y    (introduce reduce-reduce conflict for terminal a)
    A1   : a
    A2   : a
    B    : b b b ... b    (terminal b occurs k-1 times)

However, are there any real-world non-LR(1) computer languages that are LR(k > 1)-parsable? Or are non-LR(1) languages not LR(k) either?

Answer 1: If a language has an LR(k) grammar, then it has an LR(1) grammar which can be generated mechanically from the LR(k) grammar; furthermore, the original parse
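
To make the lookahead requirement of the artificial grammar in the question concrete, here is a minimal Python sketch. It does not build an LR parser; it only generates the two sentences the grammar accepts for a given k and counts how many tokens after `a` must be read before they differ, which is the lookahead needed to choose between reducing `a` to A1 or to A2. The function names `sentences` and `lookahead_needed` are made up for this illustration.

```python
def sentences(k):
    """Return the two token sequences accepted by the artificial grammar for a given k."""
    b_run = ["b"] * (k - 1)          # B : b b ... b  (k-1 occurrences)
    return ["a", *b_run, "x"], ["a", *b_run, "y"]


def lookahead_needed(k):
    """Count tokens after 'a' that must be inspected before the two sentences differ."""
    first, second = sentences(k)
    decision_point = 1               # the A1/A2 reduction is decided right after 'a'
    pairs = zip(first[decision_point:], second[decision_point:])
    for offset, (t1, t2) in enumerate(pairs, start=1):
        if t1 != t2:
            return offset
    return None                      # not reached for this grammar


if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        print(k, lookahead_needed(k))
```

For every k the count equals k, matching the claim that this particular grammar needs k tokens of lookahead. This says nothing about whether the language itself needs more than one token with a different grammar.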

Is it possible to parse big file with ANTLR?

Question: Is it possible to instruct ANTLR not to load the entire file into memory? Can it apply rules one by one and generate the topmost list of nodes sequentially while reading the file? Also, maybe it is possible to drop analyzed nodes somehow?

Answer 1: Yes, you can use:

- UnbufferedCharStream for your character stream (passed to lexer)
- UnbufferedTokenStream for your token stream (passed to parser)

This token stream implementation doesn't differentiate on token channels, so make sure to use ->skip instead of -

Learning Treetop

Question: I'm trying to teach myself Ruby's Treetop grammar generator. I am finding that not only is the documentation woefully sparse for the "best" one out there, but that it doesn't seem to work as intuitively as I'd hoped. On a high level, I'd really love a better tutorial than the on-site docs or the video, if there is one. On a lower level, here's a grammar I cannot get to work at all:

    grammar SimpleTest
      rule num
        (float / integer)
      end
      rule float
        ( (( '+' / '-')? plain_digits '.' plain_digits) / (

Finding a grammar is not LL(1) without using classical methods and transforming it to LL(1)

Question: Let's say I have this grammar:

    S -> A C x | u B A
    A -> z A y | S u | ε
    B -> C x | y B u
    C -> B w B | w A

This grammar is obviously not LL(1), which I can find by constructing the parsing table. But is there any way I can prove that this grammar is not LL(1) without using the classical methods, i.e. without constructing the parsing table or finding any conflicts? Also, how can I convert this grammar to LL(1)? I think I have to use both epsilon-derivation elimination and left recursion elimination
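
One shortcut that avoids the parsing table is to check for left recursion: a grammar without useless symbols that has a left-recursive nonterminal cannot be LL(1). The Python sketch below is offered as an illustration, not as the answer given in the original thread. It encodes the grammar from the question, finds the nullable nonterminals, and then looks for nonterminals that can appear as their own leftmost symbol. The dictionary layout and all function names are assumptions made for this example; the empty list stands for the ε production.

```python
# Grammar from the question; [] encodes the ε alternative of A.
GRAMMAR = {
    "S": [["A", "C", "x"], ["u", "B", "A"]],
    "A": [["z", "A", "y"], ["S", "u"], []],
    "B": [["C", "x"], ["y", "B", "u"]],
    "C": [["B", "w", "B"], ["w", "A"]],
}


def nullable_nonterminals(grammar):
    """Fixed-point computation of the nonterminals that can derive ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def left_corners(grammar, nullable):
    """Map each nonterminal to the nonterminals that can start one of its bodies."""
    corners = {head: set() for head in grammar}
    for head, bodies in grammar.items():
        for body in bodies:
            for sym in body:
                if sym not in grammar:      # a terminal ends the leftmost prefix
                    break
                corners[head].add(sym)
                if sym not in nullable:     # a non-nullable symbol blocks what follows
                    break
    return corners


def left_recursive_nonterminals(grammar):
    """Return nonterminals that can reach themselves through the left-corner relation."""
    corners = left_corners(grammar, nullable_nonterminals(grammar))
    result = set()
    for start in grammar:
        seen, stack = set(), list(corners[start])
        while stack:
            sym = stack.pop()
            if sym not in seen:
                seen.add(sym)
                stack.extend(corners[sym])
        if start in seen:
            result.add(start)
    return result


if __name__ == "__main__":
    print(sorted(left_recursive_nonterminals(GRAMMAR)))   # ['A', 'B', 'C', 'S']
```

Here S reaches itself through A (S -> A C x, A -> S u), and B and C reach each other (B -> C x, C -> B w B), so every nonterminal is left-recursive. That is enough to rule out LL(1), and it also shows why left recursion elimination would be needed before any conversion attempt.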
