grammar

What makes Java easier to parse than C?

拥有回忆 submitted on 2019-12-03 01:46:50
Question: I'm acquainted with the fact that the grammars of C and C++ are context-sensitive, and in particular you need a "lexer hack" in C. On the other hand, I'm under the impression that you can parse Java with only two tokens of look-ahead, despite considerable similarity between the two languages. What would you have to change about C to make it more tractable to parse? I ask because all of the examples I've seen of C's context-sensitivity are technically allowable but awfully weird. For example,
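The "lexer hack" mentioned in the question is usually illustrated with a statement such as `T * x;`, which is a pointer declaration if `T` names a type and a multiplication otherwise. A minimal sketch of the idea in Python follows, assuming a toy tokenizer; `tokenize`, `classify_statement`, and `typedef_names` are hypothetical names invented for illustration, not part of any real C front end.

```python
import re

# One token per match: an identifier, or any single non-space character.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\S)")

def tokenize(source, typedef_names):
    """Return (kind, text) pairs.

    The "hack": the lexer consults the parser's table of typedef names,
    so the same spelling can come back as TYPE_NAME or IDENT.
    """
    tokens = []
    for text in TOKEN_RE.findall(source):
        if text[0].isalpha() or text[0] == "_":
            kind = "TYPE_NAME" if text in typedef_names else "IDENT"
        else:
            kind = "PUNCT"
        tokens.append((kind, text))
    return tokens

def classify_statement(source, typedef_names):
    """Classify a statement like 'T * x;' by the kind of its first token."""
    first_kind, _ = tokenize(source, typedef_names)[0]
    return "declaration" if first_kind == "TYPE_NAME" else "expression"

print(classify_statement("T * x;", {"T"}))   # T was typedef'd earlier
print(classify_statement("T * x;", set()))   # T is an ordinary variable
```

The token stream for identical text changes with what was declared earlier, which is why the lexer and parser cannot be cleanly separated in this scheme.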

Haskell - How best to represent a programming language's grammar?

僤鯓⒐⒋嵵緔 submitted on 2019-12-03 01:32:31
Question: I've been looking at Haskell and I'd quite like to write a compiler in it (as a learning exercise), since a lot of its innate features can be readily applied to a compiler (particularly a recursive descent compiler). What I can't quite get my head around is how to represent a language's grammar in a Haskell-ian way. My first thought was to use recursive data type definitions, but I can't see how to use them to match against keywords in the language ("if", for example). Thoughts and suggestions

Is D's grammar really context-free?

南笙酒味 submitted on 2019-12-03 01:02:19
Question: I posted this on the D newsgroup some months ago, but for some reason, the answer never really convinced me, so I thought I'd ask it here. The grammar of D is apparently context-free. The grammar of C++, however, isn't (even without macros). (Please read this carefully!) Now granted, I know nothing (officially) about compilers, lexers, and parsers. All I know is from what I've learned on the web. And here is what (I believe) I have understood regarding context, in not-so-technical lingo:

Grammar Writing Tools [closed]

只谈情不闲聊 submitted on 2019-12-02 23:47:59
I am trying to write a grammar in EBNF (barring a really good reason, it has to be EBNF) and am looking for a couple of utilities for it. If there's a GUI that can make one, that would be great, but the thing I'm looking for most is something that can check the grammar, for instance to see if it is LALR(n), and if so, what the value of n is. Do such utilities exist? Are there any other useful grammar-writing tools I should know about? (I'm not looking for parser generators.)

Taking Steven Dee's suggestion one step further, you might want to check out ANTLRWorks, which is an

Python ast to dot graph

随声附和 submitted on 2019-12-02 21:10:52
I'm analyzing the AST generated from Python code for "fun and profit", and I would like to have something more graphical than "ast.dump" to actually see the AST generated. In theory it is already a tree, so it shouldn't be too hard to create a graph, but I don't understand how I could do it. ast.walk seems to walk with a BFS strategy, and in the visitX methods I can't see the parent node, so I can't find a way to create a graph. It seems like the only way is to write my own DFS walk function. Does that make sense?

If you look at ast.NodeVisitor, it's a fairly trivial class. You can either
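As a rough illustration of the hand-written DFS idea from the question, here is a minimal sketch that emits Graphviz DOT text using only the standard `ast` module. The function name `ast_to_dot` and the `n0`, `n1`, ... node naming are assumptions made for this example; because the recursion returns each child's id to its caller, the parent is always known when an edge is written.

```python
import ast

def ast_to_dot(source):
    """Return Graphviz DOT text describing the AST of `source`."""
    tree = ast.parse(source)
    lines = ["digraph ast {"]
    counter = 0

    def visit(node):
        # Depth-first walk: number the node, emit it, then recurse.
        nonlocal counter
        node_id = counter
        counter += 1
        lines.append(f'  n{node_id} [label="{type(node).__name__}"];')
        for child in ast.iter_child_nodes(node):
            child_id = visit(child)
            # The caller is the parent, so the edge can be emitted here.
            lines.append(f"  n{node_id} -> n{child_id};")
        return node_id

    visit(tree)
    lines.append("}")
    return "\n".join(lines)

print(ast_to_dot("x = 1"))
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` tool, for example `dot -Tpng ast.dot -o ast.png`.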

Some NLP stuff to do with grammar, tagging, stemming, and word sense disambiguation in Python

心不动则不痛 submitted on 2019-12-02 21:04:17
Background (TL;DR, provided for the sake of completeness): Seeking advice on an optimal solution to an odd requirement. I'm a (literature) student in my fourth year of college with only my own guidance in programming. I'm competent enough with Python that I won't have trouble implementing solutions I find (most of the time) and building upon them, but because of my newbness, I'm seeking advice on the best ways I might tackle this peculiar problem. Already using NLTK, but differently from the examples in the NLTK book. I'm already utilizing a lot of stuff from NLTK, particularly WordNet, so that

How to know if two words have the same base?

若如初见. submitted on 2019-12-02 20:43:55
I want to know, in several languages, whether two words are either the same word or grammatical variants of the same word. For example: had and has have the same base (in both cases, it's the verb have); city and cities have the same base; went and gone have the same base. Is there a way to use the Microsoft Word API to not just spell check text, but also normalize a word to a base or, at least, determine if two words have the same base? If not, what are the (free or paid) libraries (not web services) which allow me to do it (again, in several languages)?

Inflector.NET is an open source library

Combined unparser/parser generator

会有一股神秘感。 submitted on 2019-12-02 20:43:13
Is there a parser generator that also implements the inverse direction, i.e. unparsing domain objects (a.k.a. pretty-printing) from the same grammar specification? As far as I know, ANTLR does not support this.

I have implemented a set of Invertible Parser Combinators in Java and Kotlin. A parser is written pretty much in LL(1) style and it provides a parse method and a print method, where the latter provides the pretty printer. You can find the project here: https://github.com/searles/parsing Here is a tutorial: https://github.com/searles/parsing/blob/master/tutorial.md And here is a parser/pretty

W3C CSS grammar, syntax oddities

﹥>﹥吖頭↗ submitted on 2019-12-02 18:53:13
I was having a look at the CSS syntax here and here and I was amazed to see both the token productions and the grammar littered with whitespace declarations. Normally whitespace is defined once in the lexer and skipped, never to be seen again. Ditto comments. I imagine the orientation towards user-agents rather than true compilers is part of the motivation here, and also the requirement to proceed in the face of errors, but it still seems pretty odd. Are real-life UAs that parse CSS really implemented according to this (these) grammars? EDIT: reason for the question is actually the various

How can I incorporate ternary operators into a precedence climbing algorithm?

妖精的绣舞 submitted on 2019-12-02 17:36:16
I followed the explanation given in the "Precedence climbing" section on this webpage to implement an arithmetic evaluator using the precedence climbing algorithm with various unary prefix and binary infix operators. I would also like to include ternary operators (namely the ternary conditional operator ?:). The algorithm given on the webpage uses the following grammar:

E --> Exp(0)
Exp(p) --> P {B Exp(q)}
P --> U Exp(q) | "(" E ")" | v
B --> "+" | "-" | "*" | "/" | "^" | "||" | "&&" | "="
U --> "-"

How can I incorporate ternary operators into this grammar?

To be specific, I'll use C/C++/Java
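One possible way to fit ?: into the grammar quoted in the question, sketched in Python rather than taken from the answer: treat "?" as a right-associative binary operator of the lowest precedence, parse the middle operand as a full Exp(0), require ":", then continue with Exp(q) as usual. The precedence table, the class and function names, and the "nonzero means true" rule are all assumptions made for this illustration, and only a subset of the operators is included.

```python
import operator
import re

# (precedence, associativity); "?" stands for the ternary conditional.
BINARY = {"?": (1, "right"), "+": (2, "left"), "-": (2, "left"),
          "*": (3, "left"), "/": (3, "left"), "^": (5, "right")}
UNARY_PREC = 4  # assumed: unary minus sits between "*" and "^"
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "^": operator.pow}

class Parser:
    def __init__(self, text):
        self.tokens = re.findall(r"\d+\.?\d*|[-+*/^()?:]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.advance() != tok:
            raise SyntaxError(f"expected {tok!r}")

    def exp(self, p):
        """Exp(p) --> P {B Exp(q)}, extended with '?' Exp(0) ':' Exp(q)."""
        left = self.primary()
        while self.peek() in BINARY and BINARY[self.peek()][0] >= p:
            op = self.advance()
            prec, assoc = BINARY[op]
            q = prec + 1 if assoc == "left" else prec
            if op == "?":
                middle = self.exp(0)  # ":" is not an operator, so this stops there
                self.expect(":")
                right = self.exp(q)
                left = middle if left != 0 else right
            else:
                left = OPS[op](left, self.exp(q))
        return left

    def primary(self):
        """P --> U Exp(q) | "(" E ")" | v"""
        tok = self.advance()
        if tok == "-":
            return -self.exp(UNARY_PREC)
        if tok == "(":
            value = self.exp(0)
            self.expect(")")
            return value
        if tok is None or not tok[0].isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return float(tok)

def evaluate(text):
    parser = Parser(text)
    value = parser.exp(0)
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("1 + 1 ? 10 : 20"))
print(evaluate("1 ? 2 : 0 ? 3 : 4"))
```

Note that this sketch evaluates both branches eagerly, so an error in the branch not taken (such as division by zero) still surfaces; an evaluator that needs lazy branches would build a tree first and evaluate afterwards.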