Are regular expressions used to build parsers?

Happy的楠姐 2020-12-16 15:58

This is just a question out of curiosity, since I have been needing to get more and more into parsing and using regex lately. It seems, for questions I come across in my sea

8 answers
  •  执笔经年
    2020-12-16 16:20

    A 'regex' as you know it is a particular notation for creating deterministic finite automata (DFAs). A DFA is a parsing device, and thus regexps do parse. When you use a regexp to match something, you are parsing a string to align it with the pattern. When you use parenthesized groups in a regexp to chop something up into pieces, you are parsing.
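    As a small illustration of "chopping something up with parentheses", here is a minimal Python sketch. The date pattern and the helper name split_date are made up for this example; nothing in the question prescribes them.

```python
import re

# Hypothetical pattern for illustration: a date written like "2020-12-16".
# Each pair of parentheses is a capturing group, i.e. one "bit" of the input.
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def split_date(text):
    """Return (year, month, day) as strings, or None if text does not match."""
    m = DATE.fullmatch(text)
    return m.groups() if m else None
```

    Calling split_date on a matching string both checks the string against the pattern and extracts its parts, which is parsing in the sense described above.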

    DFAs are formally defined as parsers for a particular category of languages called 'regular languages' (thanks to Gumbo for reminding me). Many important parsing tasks involve languages that are not regular.
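    To make the DFA idea concrete, here is a hand-written one in Python, as a sketch only. The language chosen (strings of 0s and 1s containing an even number of 1s) and all names are assumptions picked for illustration; the same language can also be written as the regexp 0*(10*10*)*.

```python
# A DFA is a finite set of states plus a transition table.
# This one tracks whether the number of "1" characters seen so far is even or odd.
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
START = "even"
ACCEPTING = {"even"}

def accepts(text):
    """Run the DFA over text; True if it ends in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # character outside the alphabet {"0", "1"}
            return False
    return state in ACCEPTING
```

    Note that the only memory the machine has is the current state, one of two fixed values, no matter how long the input is.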

    Thus, DFAs are not a good approach to many parsing problems. The most famous examples around here are XML and HTML. There are many reasons, but I'll fill in one: these formats are fundamentally tree structures. To parse them, a program has to keep track of where it is as it descends the tree, and the nesting can be arbitrarily deep. A DFA has only a fixed, finite set of states, so regexps don't do that.

    Parsers defined by a grammar (such as LR(k) and LL(k)) do that, and so do hand-coded top-down parsers.
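    For comparison, here is a rough sketch of a hand-coded top-down parser in Python. It handles a toy tag syntax invented for this example (letters-only tag names, no attributes, no text content), not real XML. A regexp is used only to recognize one tag at a time; the nesting is tracked by recursion, so the call stack is the state that a plain regexp lacks.

```python
import re

# Assumed toy syntax: <name> ... </name>, nothing else. Not real XML.
TOKEN = re.compile(r"<(/?)([A-Za-z]+)>")

def parse(text):
    """Parse text into a list of (name, children) trees.

    Raises ValueError on mismatched or unclosed tags.
    """
    pos = 0

    def parse_children(open_name):
        # Collect child elements until the closing tag for open_name.
        nonlocal pos
        children = []
        while pos < len(text):
            m = TOKEN.match(text, pos)
            if m is None:
                raise ValueError(f"unexpected input at offset {pos}")
            closing, name = m.group(1) == "/", m.group(2)
            pos = m.end()
            if closing:
                if name != open_name:
                    raise ValueError(f"unexpected </{name}>")
                return children
            # Opening tag: descend one level; the recursion remembers open_name.
            children.append((name, parse_children(name)))
        if open_name is not None:
            raise ValueError(f"missing </{open_name}>")
        return children

    return parse_children(None)
```

    Each recursive call remembers which tag it is waiting to see closed, which is exactly the per-level state a tree-shaped input requires.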

    There are books and books on the various alternative parsing technologies that are commonly applied to things like C++ or XML.
