Looking for a clear definition of what a “tokenizer”, “parser” and “lexer” are and how they are related to each other and used?

隐瞒了意图╮ 2020-11-29 15:12

I am looking for a clear definition of what a "tokenizer", "parser" and "lexer" are and how they are related to each other (e.g., does a parser use a tokenizer or vice

4 answers
  •  -上瘾入骨i
    2020-11-29 15:16

    A tokenizer breaks a stream of text into tokens, usually by splitting on whitespace (tabs, spaces, newlines).
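
To make that concrete, here is a minimal sketch of a whitespace tokenizer in Python. The function name `tokenize` is just an illustrative choice, and real tokenizers often split on more than whitespace.

```python
def tokenize(text: str) -> list[str]:
    """Split text into tokens on runs of whitespace (spaces, tabs, newlines)."""
    # str.split() with no arguments splits on any whitespace run
    # and drops empty strings, so leading/trailing whitespace is ignored.
    return text.split()


print(tokenize("x = 42\n\ty == x"))  # ['x', '=', '42', 'y', '==', 'x']
```

Note that the tokens are plain strings: nothing here says that `42` is a number or that `==` is an operator.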

    A lexer is basically a tokenizer, but it usually attaches extra context to the tokens -- this token is a number, that token is a string literal, this other token is an equality operator.
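
As a rough sketch of the difference, the lexer below tags each token with a kind. The token kinds (`NUMBER`, `STRING`, `IDENT`, `EQ`, `ASSIGN`) and the toy syntax are assumptions made up for this example, not any particular language's rules.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # category of the token, e.g. "NUMBER" or "STRING"
    text: str  # the exact source text that was matched


# Hypothetical token kinds for a toy language.
# Order matters: "==" must be tried before "=".
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r'"[^"\n]*"'),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("EQ", r"=="),
    ("ASSIGN", r"="),
    ("SKIP", r"\s+"),   # whitespace separates tokens but is not a token
    ("ERROR", r"."),    # any other single character is rejected
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text: str) -> Iterator[Token]:
    """Yield Token(kind, text) pairs for the input string."""
    for match in MASTER.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        yield Token(kind, match.group())


for token in lex('name == "bob"'):
    print(token)
```

Unlike the plain tokenizer, this also copes with input that has no whitespace between tokens, such as `x==42`.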

    A parser takes the stream of tokens from the lexer and turns it into an abstract syntax tree that represents the structure of the original text, which is usually a program.
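
One possible way to see the lexer-to-parser hand-off is a small recursive-descent parser for arithmetic expressions. Everything here is an assumption for illustration: the grammar, the `Parser` class, the helper names, and the choice of nested tuples as the tree representation.

```python
import re

# Toy lexer: integers and the operators + - * / ( ), with optional whitespace.
TOKEN_RE = re.compile(r"\s*(?:(?P<NUMBER>\d+)|(?P<OP>[+\-*/()]))")


def lex(text):
    """Return a list of (kind, text) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser that builds a tree of nested tuples."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", "")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return tree

    def expr(self):
        # expr := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | "(" expr ")"
        kind, text = self.advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if (kind, text) == ("OP", "("):
            node = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")


def parse(text):
    return Parser(lex(text)).parse()


print(parse("1 + 2 * 3"))  # ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))
```

The tree records that `2 * 3` binds tighter than `1 + ...`, which is information the flat token list does not carry. So, to the question of who uses whom: in this sketch the parser consumes the lexer's output, never the other way around.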

    Last I checked, the best book on the subject was "Compilers: Principles, Techniques, and Tools", usually just known as "The Dragon Book".
