CFG for Python-style tuples

Posted by 泄露秘密 on 2019-12-06 13:13:58

Without actually referring to the Python grammar, I'm pretty sure that your grammar produces all valid Python tuples except one ((), the empty tuple), and that it doesn't produce anything which is not a Python tuple. So to that extent, it's fine.

However, it's not much use for parsing because

TupleItem → TupleItem TupleItem

is exponentially ambiguous. (Incidentally, TupleItem is not a very descriptive name for this non-terminal, which is really a list.) Ambiguous grammars are "proper" in the sense that they obey all the rules for context-free grammars, but unambiguous grammars are usually better.
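To get a feel for how bad that is, here is a minimal Python sketch that counts the parse trees such a rule allows for a list of n items. It assumes the rule is paired with some base case that derives a single item (the original grammar isn't reproduced here), and the name count_parses is made up purely for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parses(n: int) -> int:
    """Count parse trees for n items under  L -> L L | item  (assumed base case)."""
    if n == 1:
        return 1  # a single item can only be derived one way
    # Otherwise the top-level L -> L L can split the n items at any point k.
    return sum(count_parses(k) * count_parses(n - k) for k in range(1, n))


for n in (1, 2, 3, 4, 5, 10):
    print(n, count_parses(n))
```

The counts are the Catalan numbers (1, 1, 2, 5, 14, ...), which grow exponentially with the number of items.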

It's easy to fix:

Tuple → "(" ")"
Tuple → "(" ItemList "," ")"
Tuple → "(" ItemList "," Item ")"
ItemList → Item
ItemList → ItemList "," Item
Item → Id
Item → Tuple

(I left out the Id productions; in practical grammars, Id would be a terminal, but it makes little difference.)
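As a rough illustration, the grammar above can be turned into a small hand-written parser in Python. This is only a sketch under some assumptions: Id is taken to be a simple identifier token, the left-recursive ItemList rule is replaced by a loop (recursive descent can't handle left recursion directly), and the names tokenize, parse_item, parse_tuple and parse are invented for the example.

```python
import re

# Assumption: an Id is a plain identifier; the only other tokens are ( ) and ,
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[(),])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_item(tokens, i):
    """Item -> Id | Tuple"""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    if tokens[i] == "(":
        return parse_tuple(tokens, i)
    if tokens[i] in (")", ","):
        raise ValueError(f"unexpected {tokens[i]!r}")
    return tokens[i], i + 1


def parse_tuple(tokens, i):
    """Tuple -> "(" ")" | "(" ItemList "," ")" | "(" ItemList "," Item ")" """
    if i >= len(tokens) or tokens[i] != "(":
        raise ValueError("expected '('")
    i += 1
    if i < len(tokens) and tokens[i] == ")":
        return (), i + 1  # the empty tuple
    item, i = parse_item(tokens, i)
    items = [item]
    saw_comma = False
    while i < len(tokens) and tokens[i] == ",":
        saw_comma = True
        i += 1
        if i < len(tokens) and tokens[i] == ")":
            break  # trailing comma
        item, i = parse_item(tokens, i)
        items.append(item)
    if not saw_comma:
        raise ValueError("a tuple with one item needs a trailing comma")
    if i >= len(tokens) or tokens[i] != ")":
        raise ValueError("expected ')'")
    return tuple(items), i + 1


def parse(text):
    tokens = tokenize(text)
    result, i = parse_tuple(tokens, 0)
    if i != len(tokens):
        raise ValueError("trailing input after tuple")
    return result


print(parse("(a, (b, c), (),)"))
```

Because every input has at most one derivation in this grammar, the parser never has to choose between alternative ways of grouping the items.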

Finally, why is this grammar "so long"? (Are seven productions really "so freaking long"? It depends on your criteria, I guess.)

The simple answer is that CFGs are like that. You could add syntactic sugar to make the right-hand sides regular expressions (not just alternation, but also Kleene star and its companions):

Tuple → "(" [ ItemList "," Item? ]? ")"
ItemList → Item // ","
Item → Id | Tuple

Here I use the useful interpolate operator //, which is rarely taught in academic classes and consequently has surprisingly few implementations:

a // b =def a(ba)*

Whether or not the above is easier to read, I leave to the reader. It's similar to the EBNF (Extended Backus-Naur Form) commonly used in grammar expositions, particularly in RFCs. (EBNF is one of the few formalisms with an interpolate operator, although it's not written as explicitly as mine.)
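For what it's worth, the interpolate operator is easy to emulate when both operands are regular. Here is a minimal Python sketch using the standard re module; the helper name interpolate and the identifier pattern are assumptions made for the example. It only covers flat lists, since nested tuples are not regular.

```python
import re


def interpolate(a: str, b: str) -> str:
    """Build a regex for  a // b,  i.e.  a(ba)*  (a and b are regex fragments)."""
    return f"(?:{a})(?:(?:{b})(?:{a}))*"


# Assumption: an item is a plain identifier and the separator is a comma
# with optional surrounding whitespace.
ITEM_LIST = re.compile(interpolate(r"[A-Za-z_]\w*", r"\s*,\s*"))

print(bool(ITEM_LIST.fullmatch("x, y, z")))  # items separated by commas
print(bool(ITEM_LIST.fullmatch("x, y,")))    # a dangling separator is not a // b
```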

Anyway, other than that, I don't believe that your grammar can be trimmed.
