context-free-grammar | 易学教程

Why do on-line parsers seem to stop at regexps?

Question: I've long wondered why there don't seem to be any parsers for, say, BNF that behave like the regexp engines found in various libraries. Sure, there are tools like ANTLR, Yacc and many others that generate code which, in turn, can parse a CFG, but there doesn't seem to be a library that can do that without the intermediate step. I'm interested in writing a Packrat parser, to boot all those nested-parenthesis quirks associated with regexps (and, perhaps even more so, for the sport of it), but

Using Parsec to parse regular expressions

Question: I'm trying to learn Parsec by implementing a small regular expression parser. In BNF, my grammar looks something like:

    EXP : EXP *
        | LIT EXP
        | LIT

I've tried to implement this in Haskell as:

    expr = try star <|> try litE <|> lit

    litE = do c <- noneOf "*"
              rest <- expr
              return (c : rest)

    lit = do c <- noneOf "*"
             return [c]

    star = do content <- expr
              char '*'
              return (content ++ "*")

There are some infinite loops here, though (e.g. expr -> star -> expr without consuming any tokens), which makes the
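The loop described in the question comes from the left-recursive alternative EXP : EXP *. A standard way around it is to rewrite rules of the form A -> A α | β as A -> β A', A' -> α A' | ε, so that every alternative consumes a token before recursing. The sketch below applies that rewrite to the grammar exactly as quoted. It is written in Python rather than Parsec, and the names `parse_exp` and `accepts` are hypothetical; it only recognizes input and does not build a result.

```python
def parse_exp(s, i=0):
    """Recognize one EXP starting at index i and return the index just past it.

    Rewritten grammar (no left recursion):
        EXP   : LIT EXP STARS | LIT STARS
        STARS : '*' STARS | (empty)
    """
    # Every alternative now starts with a literal, so a token is always consumed.
    if i >= len(s) or s[i] == "*":
        raise ValueError(f"expected a literal at position {i}")
    i += 1
    # LIT EXP: recurse only when another literal follows.
    if i < len(s) and s[i] != "*":
        i = parse_exp(s, i)
    # STARS: the trailing stars that the left-recursive rule used to add.
    while i < len(s) and s[i] == "*":
        i += 1
    return i


def accepts(s):
    """True if the whole string is derivable from EXP."""
    try:
        return parse_exp(s) == len(s)
    except ValueError:
        return False
```

With the grammar as written, stars can only appear at the end of the input, so `accepts("ab*")` holds while `accepts("a*b")` does not. A grammar in which each literal can carry its own star would need a different rule shape.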

Generating n statements from context-free grammars

Question: So as not to reinvent the wheel, I would like to know what has already been done about generating random statements from a context-free language (like those produced by yacc, etc.). These grammars are primarily for parsing, but maybe someone has done some generation for testing the parsers? Thanks.

Answer 1: Check out this blog post. Basically, it randomizes the RHS chosen at each rule application.

Answer 2: There's an ancient but still interesting article here that shows why you need a few more
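The approach in the first answer, choosing a random right-hand side at each rule application, can be sketched in a few lines of Python. The toy grammar and the names `GRAMMAR`, `generate` and `generate_n` are assumptions made for illustration; they do not come from the linked posts.

```python
import random

# Hypothetical toy grammar: keys are nonterminals, anything else is a terminal.
# It is deliberately non-recursive, so every expansion terminates.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "ADJ", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["parser"], ["grammar"]],
    "ADJ": [["small"], ["random"]],
    "V": [["reads"], ["generates"]],
}


def generate(symbol, rng, grammar=GRAMMAR):
    """Expand one symbol into a list of terminals, picking each RHS at random."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit as-is
    rhs = rng.choice(grammar[symbol])  # uniform choice among the alternatives
    out = []
    for part in rhs:
        out.extend(generate(part, rng, grammar))
    return out


def generate_n(n, seed=0):
    """Return n random sentences derived from the start symbol S."""
    rng = random.Random(seed)
    return [" ".join(generate("S", rng)) for _ in range(n)]
```

For a recursive grammar, uniform choice can produce very deep derivations, so a real generator would probably need a depth limit or weighted alternatives.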

Extract probabilities and most likely parse tree from cyk

Question: In order to understand the CYK algorithm, I've worked through the example at https://www.youtube.com/watch?v=VTH1k-xiswM&feature=youtu.be . The result of which is: How do I extract the probabilities associated with each parse, and how do I extract the most likely parse tree?

Answer 1: These are two distinct problems for a PCFG:

- recognition: does the sentence belong to the language generated by the CFG? (output: yes or no)
- parsing: what is the highest-scoring tree for this sentence? (output: parse tree)

The CKY
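For the parsing problem, a common variant of CKY stores in each chart cell the best probability per nonterminal together with a backpointer, and then rebuilds the tree from the backpointers. The sketch below assumes a grammar in Chomsky normal form. The toy rules and the names `BINARY`, `LEXICAL`, `viterbi_cky` and `build` are hypothetical and unrelated to the video's example.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form.
BINARY = {  # (B, C) -> [(A, prob)] for rules A -> B C
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("Det", "N"): [("NP", 0.6)],
}
LEXICAL = {  # word -> [(A, prob)] for rules A -> word
    "she": [("NP", 0.4)],
    "eats": [("V", 1.0)],
    "the": [("Det", 1.0)],
    "fish": [("N", 1.0)],
}


def viterbi_cky(words, start="S"):
    """Return (probability, tree) of the most likely parse, or (0.0, None)."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A] = (prob, backpointer)
    for i, word in enumerate(words):
        for a, p in LEXICAL.get(word, []):
            if p > best[(i, i + 1)].get(a, (0.0, None))[0]:
                best[(i, i + 1)][a] = (p, word)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for b, (pb, _) in best[(i, k)].items():
                    for c, (pc, _) in best[(k, j)].items():
                        for a, pr in BINARY.get((b, c), []):
                            p = pr * pb * pc
                            # Keep only the highest-scoring derivation of A.
                            if p > best[(i, j)].get(a, (0.0, None))[0]:
                                best[(i, j)][a] = (p, (k, b, c))
    if start not in best[(0, n)]:
        return 0.0, None
    return best[(0, n)][start][0], build(best, 0, n, start)


def build(best, i, j, a):
    """Follow backpointers to rebuild the tree as nested tuples."""
    _, back = best[(i, j)][a]
    if isinstance(back, str):
        return (a, back)  # lexical rule A -> word
    k, b, c = back
    return (a, build(best, i, k, b), build(best, k, j, c))
```

Calling `viterbi_cky("she eats the fish".split())` returns the probability of the single parse this toy grammar allows along with its tree. Replacing the maximum with a sum over derivations would instead give the total probability of the sentence.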

Find a grammar of binary number divisible by 5 with 1 as MSB

How can I find a grammar for binary numbers divisible by 5 with 1 as the MSB, and find the reversal of L? So, I need a grammar that generates numbers like:

    5  = 101
    10 = 1010
    15 = 1111
    20 = 10100
    25 = 11001

and so on.

I'm assuming this is homework and you just want a hint. Let's consider a somewhat similar question, but in base 10: how can we write a CFG for numbers divisible by 3? At first glance, this seems unlikely, but it's actually pretty simple. We start with the observation that 10^k ≡ 1 (mod 3) for any non-negative integer k. Now consider an integer dδ, where d is a decimal digit and δ is a
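One way to carry the remainder idea over to base 2, offered here as a sketch rather than as the answerer's own construction: appending a bit b to a number with remainder r gives remainder (2r + b) mod 5. Using one nonterminal per remainder yields a right-linear grammar. The names `build_grammar`, `derives` and the nonterminals `R0` to `R4` are assumptions made for this example.

```python
MOD = 5


def build_grammar():
    """Right-linear grammar; R<r> means 'the bits read so far have remainder r'.

    Each rule is a pair (terminal, next_nonterminal); ("", None) stands for epsilon.
    """
    rules = {"S": [("1", "R1")]}  # the MSB must be 1, and 1 mod 5 == 1
    for r in range(MOD):
        # Appending bit b doubles the value and adds b.
        rules[f"R{r}"] = [(str(b), f"R{(2 * r + b) % MOD}") for b in (0, 1)]
    rules["R0"].append(("", None))  # R0 -> epsilon: stop only at remainder 0
    return rules


def derives(rules, bits):
    """Check whether the grammar derives the bit string (it is deterministic)."""
    state = "S"
    for ch in bits:
        targets = [nxt for terminal, nxt in rules[state] if terminal == ch]
        if not targets:
            return False
        state = targets[0]
    return ("", None) in rules[state]
```

With `rules = build_grammar()`, `derives(rules, "11001")` checks the string for 25, and comparing against `n % 5 == 0` for a range of `bin(n)[2:]` strings is a quick sanity test of the construction.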

How do I rewrite a context free grammar so that it is LR(1)?

Question: For the given context-free grammar:

    S -> G $
    G -> P G | P
    P -> id : R
    R -> id R | epsilon

How do I rewrite the grammar so that it is LR(1)? The current grammar has shift/reduce conflicts when parsing the input "id : .id", where "." is the input pointer for the parser. This grammar produces the language matching the regular expression (id:(id)*)+

Answer 1: It's easy enough to produce an LR(1) grammar for the same language. The trick is finding one which has a similar parse tree, or at least from

Left recursion elimination in an LL1 grammar

Question: I'm trying to eliminate left recursion from the following extract of a grammar:

    expression := fragment ( ( + | - | * | / ) fragment )*
    fragment := identifier | number | ( + | - ) fragment | expression

The issue is that expression can go to fragment, which can go to expression. I've tried a bunch of ways to eliminate it; some look like they work (in JavaCC), but I'm a) unsure of their correctness, and b) pretty sure I've broken associativity by changing the structure of the grammar. I'm pretty sure I

Is csv format regular grammar or context-free grammar?

Question: I am currently writing a CSV parser. The CSV format is defined by RFC 4180, which specifies it in ABNF, so CSV is certainly a context-free grammar. However, I would like to know whether CSV is also a regular grammar, so that I could parse it with just a finite state machine. Furthermore, if it is a regular grammar and can be parsed by a finite state machine, does that mean it can also be parsed by a regular expression?

Answer 1: I don't have any formal theory available to verify
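To make the finite-state idea concrete, here is a minimal Python sketch of a four-state machine that splits a single record into fields. It assumes the RFC 4180 quoting rules (fields optionally wrapped in double quotes, with "" as an escaped quote) and ignores line breaks inside quoted fields. The function name `parse_record` and the state names are hypothetical.

```python
def parse_record(line):
    """Split one CSV record into fields using a four-state machine."""
    fields, buf = [], []
    state = "start"  # start | unquoted | quoted | quote_seen
    for ch in line:
        if state == "start":  # at the beginning of a field
            if ch == '"':
                state = "quoted"
            elif ch == ",":
                fields.append("")  # empty field
            else:
                buf.append(ch)
                state = "unquoted"
        elif state == "unquoted":
            if ch == ",":
                fields.append("".join(buf))
                buf = []
                state = "start"
            elif ch == '"':
                raise ValueError("quote inside unquoted field")
            else:
                buf.append(ch)
        elif state == "quoted":
            if ch == '"':
                state = "quote_seen"  # either a closing quote or the first half of ""
            else:
                buf.append(ch)
        else:  # quote_seen
            if ch == '"':
                buf.append('"')  # "" inside quotes is a literal quote
                state = "quoted"
            elif ch == ",":
                fields.append("".join(buf))
                buf = []
                state = "start"
            else:
                raise ValueError("unexpected character after closing quote")
    if state == "quoted":
        raise ValueError("unterminated quoted field")
    fields.append("".join(buf))  # the last field has no trailing comma
    return fields
```

The machine needs no stack, only the current state plus the output buffer, which is what one would expect if the format is regular. Languages accepted by finite automata are exactly those described by formal regular expressions, although practical regex engines differ in how convenient they make field extraction.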

Grammar for expressions which disallows outer parentheses

Question: I have the following grammar for expressions involving binary operators (| ^ & << >> + - * /):

    expression : expression BITWISE_OR xor_expression
               | xor_expression
    xor_expression : xor_expression BITWISE_XOR and_expression
                   | and_expression
    and_expression : and_expression BITWISE_AND shift_expression
                   | shift_expression
    shift_expression : shift_expression LEFT_SHIFT arith_expression
                     | shift_expression RIGHT_SHIFT arith_expression
                     | arith_expression
    arith_expression : arith_expression PLUS term
                     |

Free-form text with custom SRGS based Grammar

Question: I am trying to develop a voice-based application that accepts user input as speech and performs actions based on that input. This is my first venture into this technology, and I am learning while developing it. I am using the Microsoft SAPI shipped with .NET 4 to recognize speech. So far, I have learned about the two modes it supports. Speech recognition (SR) has two modes of operation: Dictation mode: an unconstrained, free-form speech interpretation mode that uses a